from Hacker News

How WPEngine Is Failboating Your SEO and Leaking Your Information

by mikeyur on 2/27/14, 5:28 PM with 11 comments

  • by smartbear on 2/27/14, 6:54 PM

    This is the Founder of WP Engine.

    The main points in the article are factually incorrect, but some of it is in fact excellent feedback that we're going to act on.

    First, on staging areas, it is incorrect that they are counted as duplicate content, because we force a "deny robots everything" robots.txt file on staging. Using the example from the article of our TorqueMag website: http://torque.staging.wpengine.com/robots.txt

    It's true that some bots will ignore robots.txt, but all the major search engines that matter for SEO, which is the point of the article, do honor it. Some -- including Google! -- will scan the site anyway, but it doesn't count as duplicate content. Matt Cutts has been extremely clear on this point, publicly.
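
    For illustration, here is a minimal sketch of how a crawler that honors robots.txt would treat a "deny robots everything" file. The file contents and the helper name is_blocked are assumptions made for the example, not a copy of what the staging server actually returns; the check uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a "deny robots everything" robots.txt; the real
# staging file may differ in detail.
DENY_ALL_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a robots.txt-honoring crawler must not fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    staging_url = "http://torque.staging.wpengine.com/some-post/"
    for bot in ("Googlebot", "bingbot"):
        print(bot, "blocked:", is_blocked(DENY_ALL_ROBOTS_TXT, bot, staging_url))
```

    This only models well-behaved crawlers. A bot that ignores robots.txt is not stopped by it, which is the caveat above.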

    Second, on duplicate content on the WP Engine domains (e.g. torque.wpengine.com), again what was stated is factually incorrect for Google but is a good point for some other search engines. Here's why:

    Google maintains a set of root domains that they know belong to companies that do exactly what we and many other hosting companies do. Included in that list are WordPress.com, Squarespace, and us. When they detect "duplicate content" on subdomains from that list, they know that's not actually duplicate content. You can see it in Google Search, but it's not counted against you.

    We have had a dialog directly with Matt Cutts on this point, so this is not conjecture, but fact.

    However, the suggestion from the article that it's better to 301 that domain is still very valid. Also, not all search engines are aware of this scenario, and thus one of the take-aways we have from this article is that we should auto-force robots.txt for the XYZ.wpengine.com domains just as we do for the staging domains, so that other search engines won't be confused.
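
    As a rough sketch of the 301 idea, the function below maps a request for a hosted subdomain to a permanent redirect on the site's own domain. The function name canonical_redirect, the suffix parameter, and the domain example.com are hypothetical, chosen only for the example; in practice this would normally be done in the web server or a WordPress plugin rather than in Python.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host, hosted_suffix=".wpengine.com"):
    """Return (301, target_url) if the URL is on the hosted subdomain, else None.

    canonical_host and hosted_suffix are illustrative parameters, not part of
    any real WP Engine API.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.endswith(hosted_suffix) and host != canonical_host:
        # Keep scheme, path, query, and fragment; swap only the host.
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path, parts.query, parts.fragment)
        )
        return 301, target
    return None


if __name__ == "__main__":
    print(canonical_redirect("http://torque.wpengine.com/post?a=1", "example.com"))
    print(canonical_redirect("http://example.com/post", "example.com"))
```

    A permanent redirect tells every search engine, not only Google, which hostname is the real one, so it does not depend on the crawler knowing about the hosting company's domains.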

    So in the end, we came away with a good idea of how to improve. It's a shame the point had to be made in the manner that it was, and intermixed with FUD.

  • by eli on 2/27/14, 5:49 PM

    This seems like someone with an axe to grind.

    The SEO issues identified seem incredibly minor and the "leaking personal information" apparently refers to the fact that if you use WP-Engine to publish a prerelease version of your site on the internet... that means it's published on the internet.

  • by evunveot on 2/27/14, 8:41 PM

    > To Google, these images aren’t on your site, they are on - wpengine.netdna-cdn.com

    I'm curious why the OP thinks that matters. The only thing I can think of is that you don't necessarily benefit if another site links to or hotlinks a particular image of yours (though I wouldn't be surprised if Google recognizes where the image originally appeared and does give credit). And it seems like any minimally reputable site that wanted to use your image would download and rehost it themselves.

    > The problem is you have effectively orphaned all of your images to a worthless subdomain on NetDNA-CDN.com, no bueno if you’re trying to get some image traffic from Google.

    How does that prevent you from getting traffic? If an image on your page comes up in image search, Google still shows your site's domain and links to your page for everything except the "View Image" button, and if someone clicks that, they just want to download your image and presumably don't care about your actual site at all.

  • by tribe2012 on 2/27/14, 5:56 PM

    I've spoken to a former Googler who was on the search team, and he said duplicate content isn't as big a deal as one would expect. They realize when it is and isn't an issue. For example, blog reposts do not hurt you (in fact they may help you) even though they show up as duplicate content.

    Just my 2 cents. I'm not a fan of WPEngine anyway.

  • by jaredmck on 2/27/14, 5:50 PM

    I get that it's easier to set up WPEngine, but this seems like one of those cases where you might be better off learning what to do - just set up a self-hosted VPS with nginx + varnish?

    At least if you're gonna have multiple sites, this WPEngine is expensive stuff.

  • by henryaj on 2/27/14, 5:52 PM

  • by bhartzer on 2/27/14, 5:52 PM

    >> 504 Gateway Time-out -- Your SEO won't work if your site doesn't respond.