by mikeyur on 2/27/14, 5:28 PM with 11 comments
by smartbear on 2/27/14, 6:54 PM
The main points in the article are factually incorrect, but some of it is genuinely excellent feedback that we're going to act on.
First, on staging areas, it is incorrect that they are counted as duplicate content, because we force a "deny robots everything" robots.txt file on staging. Using the example from the article of our TorqueMag website: http://torque.staging.wpengine.com/robots.txt
It's true that some bots will ignore robots.txt, but all the major search engines that matter for SEO (and SEO is the point of the article) do honor it. Some -- including Google! -- will scan the blocked pages anyway, but they don't count as duplicate content. Matt Cutts has been extremely clear on this point, publicly.
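To make the "deny robots everything" point concrete: here's a quick sketch (hypothetical file contents, not WP Engine's actual robots.txt) showing how a compliant crawler evaluates a deny-all rule, using Python's stdlib robots.txt parser:

```python
# Sketch: how a compliant crawler evaluates a deny-all robots.txt.
# The file contents and URLs below are illustrative assumptions.
from urllib.robotparser import RobotFileParser

# Hypothetical deny-all robots.txt served on a staging subdomain.
DENY_ALL = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(DENY_ALL.splitlines())

# A compliant crawler (e.g. Googlebot) is refused every path.
print(parser.can_fetch("Googlebot", "http://torque.staging.wpengine.com/"))        # False
print(parser.can_fetch("Googlebot", "http://torque.staging.wpengine.com/post/1"))  # False
```

Since `User-agent: *` with `Disallow: /` blocks every path for every agent, any crawler that honors robots.txt never indexes the staging copy in the first place.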
Second, on duplicate content on the WP Engine domains (e.g. torque.wpengine.com), again what was stated is factually incorrect for Google but is a good point for some other search engines. Here's why:
Google maintains a set of root domains that they know are companies that do exactly what we and many other hosting companies do. Included in that list are WordPress.com, SquareSpace, and us. When they detect "duplicate content" on subdomains from that list, they know that's not actually duplicate content. You can see it in Google Search, but it's not counted against you.
We have had a dialog directly with Matt Cutts on this point, so this is not conjecture, but fact.
However, the suggestion from the article that it's better to 301 that domain is still valid. Also, not all search engines are aware of this scenario, so one of the takeaways we have from this article is that we should auto-force robots.txt for the XYZ.wpengine.com domains just as we do for the staging domains, so that other search engines won't be confused.
So in the end, we came away with a good idea of how to improve. It's a shame the point had to be made in the manner that it was, and intermixed with FUD.
by eli on 2/27/14, 5:49 PM
The SEO issues identified seem incredibly minor and the "leaking personal information" apparently refers to the fact that if you use WP-Engine to publish a prerelease version of your site on the internet... that means it's published on the internet.
by evunveot on 2/27/14, 8:41 PM
I'm curious why the OP thinks that matters. The only thing I can think of is that you don't necessarily benefit if another site links to or hotlinks a particular image of yours (though I wouldn't be surprised if Google recognizes where the image originally appeared and does give credit). And it seems like any minimally reputable site that wanted to use your image would download and rehost it themselves.
> The problem is you have effectively orphaned all of your images to a worthless subdomain on NetDNA-CDN.com, no bueno if you’re trying to get some image traffic from Google.
How does that prevent you from getting traffic? If an image on your page comes up in image search, Google still shows your site's domain and links to your page for everything except the "View Image" button, and if someone clicks that, they just want to download your image and presumably don't care about your actual site at all.
by tribe2012 on 2/27/14, 5:56 PM
Just my 2 cents. I'm not a fan of WPEngine anyway.
by jaredmck on 2/27/14, 5:50 PM
WPEngine is expensive stuff, at least if you're gonna have multiple sites.
by henryaj on 2/27/14, 5:52 PM
by bhartzer on 2/27/14, 5:52 PM