What's stopping Google from indexing single page Javascript applications?

Just recently, through our local dev community in Croatia, I got involved in a discussion on SEO for single page Javascript apps. The question raised and the concern expressed was: "What if I build my website in, for example, Backbone.js/Marionette.js, and I need SEO - is Google able to handle it?"

After giving it a little thought, the truth is that, AFAIK, at this point in time Google's crawler engine isn't known to handle Javascript particularly well. There is limited support for processing Javascript, and they say they're improving on it (article dated April 2012), but how much can we rely on it?

On the other hand, within the last two years client-side MVC has matured enough that people are considering it not only for real-time web applications, but even for request-response web sites, simply to use the available power of client devices and offload some work from servers.

Where's the problem?

Given the situation, there is a reasonable concern that such client-side-driven web sites will be penalized in terms of SEO, simply because Google is not indexing them properly.

What's the reason for that? With HTML5 we got pushState, which lets us change the URL dynamically on the client while properly recording history, so the technology is ready. What's more, Google has documented support for 'hashbanged' URL schemes, which are recognized as unique URLs.
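As a quick illustration of the client-side part, here is a minimal sketch of the HTML5 History API in plain Javascript (the route and state object are made up for the example):

```javascript
// Change the address bar to a new URL without a full page reload,
// recording an entry in the browser history at the same time.
history.pushState({ productId: 42 }, '', '/products/42');

// React to back/forward navigation so the app can re-render
// whichever view matches the restored URL.
window.addEventListener('popstate', function (event) {
  console.log('Navigated to ' + location.pathname, event.state);
});
```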

Why is it, then, that we need to worry about client-side apps, even though they have entered the mainstream quite deeply?

A solution exists out there

The situation is best illustrated by the move Meteor made back in August 2012, when they developed a separate module for serving server-side-rendered content to crawlers.

What they do is actually pretty simple and at the same time pretty clever - when a crawler is detected, the request is passed on to a Phantom.js headless browser instance running on the server, and the resulting output (a rendered HTML page) is piped back through the web server as the response.
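For illustration, here is a rough sketch of the same idea as generic Node.js/Express middleware - this is not Meteor's actual spiderable code, and the crawler list, port and file names are assumptions made for the example:

```javascript
// server.js - serve PhantomJS-rendered HTML to known crawlers,
// and let everyone else get the normal client-side Javascript app.
var express = require('express');
var execFile = require('child_process').execFile;

var app = express();
var CRAWLERS = /googlebot|bingbot|yandex|baiduspider/i; // assumed list

app.use(function (req, res, next) {
  var escapedFragment = req.query._escaped_fragment_ !== undefined;
  if (!CRAWLERS.test(req.headers['user-agent'] || '') && !escapedFragment) {
    return next(); // regular visitor: serve the client-side app as usual
  }
  var url = 'http://localhost:3000' + req.url;
  // render.js (below) prints the fully rendered page to stdout.
  execFile('phantomjs', ['render.js', url], function (err, html) {
    if (err) return next(err);
    res.set('Content-Type', 'text/html');
    res.send(html);
  });
});
```

```javascript
// render.js - PhantomJS script that loads a URL, waits briefly for the
// client-side app to render itself, and prints the resulting HTML to stdout.
var page = require('webpage').create();
var system = require('system');

page.open(system.args[1], function (status) {
  if (status !== 'success') {
    return phantom.exit(1);
  }
  // Naive wait for asynchronous rendering; a real setup would poll for readiness.
  setTimeout(function () {
    console.log(page.content);
    phantom.exit();
  }, 500);
});
```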

Graphic by Katharina Probst

This is aligned with the proposal Google made back in 2009(!) on how to make AJAX crawlable.
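In short, that proposal boils down to a URL rewrite performed by the crawler, which the server is expected to answer with a rendered HTML snapshot. A tiny sketch of the mapping (the helper function is hypothetical and assumes an Express-style request object):

```javascript
// Google's scheme maps a hashbang URL to an '_escaped_fragment_' request:
//   http://example.com/ajax.html#!key=value
//   -> http://example.com/ajax.html?_escaped_fragment_=key=value
// The server can reverse the mapping to decide which view to snapshot.
function hashbangUrlFor(req) {
  var fragment = req.query._escaped_fragment_;
  return fragment !== undefined ? req.path + '#!' + fragment : req.path;
}
```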

What's stopping Google from doing the same?

In the meantime, there has been no real sign of Google tackling the problem internally. Why should developers have to worry about it, or rely on framework providers and hope that their framework of choice has this workaround built in?

Especially since the solution isn't really that complex - Google could do exactly what Meteor is doing: the crawler could run sites through a Phantom.js instance and feed the indexing engine with the output stream from it.

The question is - is Google actually doing something like this already, or does it plan to in the foreseeable future? If anyone has more information on the topic, or thoughts on the subject, please share them in the comments.

Javascript apps are the web of today, and the web of the future. Developers should not have to worry about being crawled properly; crawlers need to catch up!
