
Why Does Google Webmaster Tools See the Template for the Dynamic Version of My Site Instead of the Static Version?

I have added the spiderable package to my Meteor app, and the HTML version of the page is returned when making requests with ?_escaped_fragment_= in the URL, but I'm unable to get Google to crawl the site.

Details

When using Fetch as Google in Google Webmaster Tools and requesting the root page "http://example.com/", the page returned is the JavaScript version, something like:

HTTP/1.1 200 OK
content-type: text/html; charset=utf-8
date: Fri, 30 Nov 2012 05:39:36 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <script type="text/javascript">
      __meteor_runtime_config__ = {"ROOT_URL":"http://www.example.com","DEFAULT_DDP_ENDPOINT":"https://www-example-com-ddp.meteor.com/"};
    </script>
    <script type="text/javascript" src="/13cf3d21ce1c4a88407ca5f3c250f186ab1738f9.js"></script>
    <meta name="fragment" content="!">
    <title>example.com</title>
  </head>
<body>
</body>
</html>

If I instead request http://example.com/?_escaped_fragment_=, the HTML version is returned:

HTTP/1.1 200 OK
content-type: text/html; charset=UTF-8
date: Wed, 05 Dec 2012 02:44:09 GMT
connection: Keep-alive
transfer-encoding: chunked

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/e83157bdc4ff057fa3a20b82af4c11b4ebe776e7.css">
    <title>example.com</title>
    <meta name="viewport" content="initial-scale=1.0">
  </head>
  <body>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/one">One</a></li>
      <li><a href="/two">Two</a></li>
    </ul>
  </body>
</html>
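
One way to tell the two responses apart programmatically is to check whether the body contains any text at all. The following is only a rough sketch in Python using the standard library; the class name BodyTextCollector and the function looks_prerendered are made up for illustration, and the "non-empty body" heuristic is an assumption based on the two responses shown above.

```python
from html.parser import HTMLParser


class BodyTextCollector(HTMLParser):
    """Collects non-blank text that appears inside <body> (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        # Script text in <head> is ignored because in_body is still False there.
        if self.in_body and data.strip():
            self.chunks.append(data.strip())


def looks_prerendered(html):
    """Assumed heuristic: a rendered page has text in <body>, the template has none."""
    parser = BodyTextCollector()
    parser.feed(html)
    return bool(parser.chunks)
```

Feeding it the first response above should report an empty template, and the second a rendered page, which makes it easy to check what a given crawler actually received.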

Questions

  • How do you tell Google to add ?_escaped_fragment_= to the URL, so that it receives the HTML version?

  • Will Google still add ?_escaped_fragment_= to the URL if the URLs do not have hashbangs (#!)? i.e. /home, /products/1 instead of /#!home, /#!products/1?

  • How do you make Google follow the linked pages and append ?_escaped_fragment_= to them? Every JavaScript version of a page has <meta name="fragment" content="!"> in the head. I assumed that was all that was required.

It seems that the simplest solution would be to update the spiderable package to return the HTML version to Googlebot, instead of requiring ?_escaped_fragment_=. But if this is working for others, I'm curious as to what I'm doing wrong.

Additional Info

Meteor's spiderable package is a temporary solution to allow web search engines to index Meteor applications.

According to the source it does a few things:

  1. It adds the following tag to the head section of the JavaScript version of the page:

    <head><meta name="fragment" content="!"></head>

  2. Using PhantomJS, it renders the JavaScript application and returns an HTML version when either of the following conditions is met:

    a. The requesting user agent is "facebookexternalhit"

    b. The requested url contains the string ?_escaped_fragment_=
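
The two conditions above can be sketched as a small predicate. This is not the spiderable package's actual code (which is JavaScript); it is an illustrative Python sketch, and the names PRERENDER_AGENTS and wants_static_html are hypothetical. It also assumes the user agent is matched as a case-insensitive substring, which the source does not confirm.

```python
from urllib.parse import parse_qs, urlsplit

# Taken from the list above; matching by substring is an assumption.
PRERENDER_AGENTS = ("facebookexternalhit",)


def wants_static_html(url, user_agent):
    """Return True if the request should get the PhantomJS-rendered HTML."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?_escaped_fragment_=" (empty value) still counts.
    params = parse_qs(query, keep_blank_values=True)
    has_fragment_param = "_escaped_fragment_" in params
    agent_match = any(agent in user_agent.lower() for agent in PRERENDER_AGENTS)
    return has_fragment_param or agent_match
```

Under these rules a plain request from Googlebot for "/" gets the JavaScript template, which matches what Fetch as Google shows in the question; only the ?_escaped_fragment_= request gets the rendered page.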


Answer

I believe this to be a "Google Webmaster Tools" bug.

It seems that Google is indeed crawling the site: the pages are showing up in Google results. Yet Google Webmaster Tools still lists the total number of indexed pages as 1. Bing still isn't crawling the page, however.

EDIT: In Google Webmaster Tools the pages are listed as:

Not selected: Pages that are not indexed because they are substantially similar to other pages, or that have been redirected to another URL.

EDIT2: In response to Jonatan's question:

Will Google still add ?_escaped_fragment_= to the URL if the URLs do not have hashbangs (#!)?

Yes. My application does not use hashbangs (#!) in its URLs, and Googlebot still appends ?_escaped_fragment_= when crawling. Here's an example from the logs:

INFO HIT /url/2/01 66.249.72.42
INFO HIT /url/2/01?_escaped_fragment_= 66.249.72.142
INFO HIT /url/2/01 108.162.222.82
INFO HIT /url/2/01?_escaped_fragment_= 108.162.222.82
INFO HIT /url/2/05 108.162.222.82
INFO HIT /url/2/05?_escaped_fragment_= 108.162.222.214

It appears that Googlebot tries each URL both with and without ?_escaped_fragment_=.
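
For reference, the URL a crawler requests for a page that opts in via the meta fragment tag can be sketched as below. This is an illustrative Python sketch of the mapping seen in the logs and in Google's AJAX crawling scheme, not code from Google or Meteor; the function name escaped_fragment_url is hypothetical, and the percent-encoding is a simplification.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escaped_fragment_url(url):
    """Map a page URL to its ?_escaped_fragment_= counterpart (sketch)."""
    parts = urlsplit(url)
    # A "#!state" fragment moves into the query value; without a hashbang
    # the value is simply empty, as in the log lines above.
    state = parts.fragment[1:] if parts.fragment.startswith("!") else ""
    param = "_escaped_fragment_=" + quote(state, safe="/=")
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

With this mapping, http://example.com/url/2/01 becomes http://example.com/url/2/01?_escaped_fragment_=, which is the second request in each pair of log lines.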

source: stackoverflow.com