John Andrews is a Competitive Webmaster and Search Engine Optimization Consultant in Seattle, Washington. This is John Andrews' blog on issues of interest to the SEO community and competitive webmasters.

johnon.com  Competitive Web & SEO

Google Spiders Javascript and CSS files? Of course…

There’s an active debate about Google’s spider requesting CSS files, and a (baseless?) suggestion that it may be looking for hidden text. I suppose this is a slow news day?

Google wants to “organize the world’s information” and will request from your web server whatever it wants. Google says it obeys a properly formed robots.txt file, so if you don’t mark your CSS files as off-limits there, Google can grab them. Why would Google look at CSS? Many reasons.
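
For anyone who does want the CSS files hands-off, here is a minimal sketch of checking what a robots.txt actually blocks, using Python's standard urllib.robotparser. The robots.txt contents, the /css/ and /js/ paths, and the example.com URLs are assumptions made up for illustration, and this only shows how a compliant parser reads the rules, not how Googlebot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that puts stylesheets and scripts off-limits.
ROBOTS_TXT = """\
User-agent: *
Disallow: /css/
Disallow: /js/
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt lets user_agent request url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/css/site.css", "http://example.com/index.html"):
        print(url, can_fetch("Googlebot", url))
```

Without Disallow lines like these, a well-behaved spider is free to request the files.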

CSS files reveal a ton about your web page (especially when you use asynchronous web services) that Google otherwise would not know about. Looking for display:none or visibility:hidden is child’s play. I doubt that’s on the agenda except for specific cases such as research on dodgy domains.
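
To show how little work that kind of check takes, here is a rough sketch that scans a stylesheet for rules that hide content. It looks for display:none and visibility:hidden (the valid CSS declarations for hiding an element) plus large negative offsets of the push-it-off-the-page kind. The regexes, the three-digit offset threshold, and the function name are assumptions for illustration only. Nothing here reflects what Google actually runs, and a real stylesheet (nested @media blocks, for one) would need a proper CSS parser.

```python
import re

# Naive "selector { declarations }" matcher; does not handle nested blocks.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")

# Declarations commonly used to hide text (an illustrative list, not exhaustive).
HIDING_PATTERNS = [
    re.compile(r"display\s*:\s*none", re.I),
    re.compile(r"visibility\s*:\s*hidden", re.I),
    # Offsets of -100 or further, e.g. left:-9999px or text-indent:-5000px.
    re.compile(r"(?:left|text-indent)\s*:\s*-\d{3,}", re.I),
]


def find_hiding_selectors(css: str) -> list[str]:
    """Return the selectors whose rule body contains a hiding declaration."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # strip comments first
    hits = []
    for selector, body in RULE_RE.findall(css):
        if any(pattern.search(body) for pattern in HIDING_PATTERNS):
            hits.append(selector.strip())
    return hits


if __name__ == "__main__":
    sample = """
    .promo { color: red; }
    #kw { display: none; }
    .far { position: absolute; left: -9999px; }
    """
    print(find_hiding_selectors(sample))
```

A hit only says the rule exists. Tabs, tooltips, and menus use the same declarations for perfectly ordinary reasons, which is why the flag alone proves nothing.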

At the very minimum one can grade quality simply by looking at the CSS file. At more extreme edges of the modern web, the CSS file may contain most of the AJAX activity… especially as Behavior Driven Development takes root. Again, I doubt it’s on the agenda right now, but we are already past the time when companies like Google should have started spidering external CSS files and js.

One guy suggests cloaking Googlebot when it asks for external CSS files. Not wise IMHO. Cloaking is defined by Google as “bad”, so if you do it, you can be labeled as “bad”. If you use display:none you can only be labeled as “capable” or “advanced”. Don’t compromise plausible deniability for this. It’s not worth it.

My gut says this is one of the PR aspects of Google’s anti-SEO efforts. Go hit a bunch of js and external CSS files every once in a while, and generate some buzz as a deterrent to widespread adoption of display:none and way-way-left-off-page hidden text. That’s a lot easier than actually parsing and categorizing DHTML.

Once again, if you wonder what matters in Google, hit the SERPs instead of the blogs. And since you know to diversify your holdings, even when something new is implemented, you’ll be ok.


12 Responses to “Google Spiders Javascript and CSS files? Of course…”

  1. Jeremy Luebke Says:

    People actually throw up hidden text these days? For some reason Prince’s song Party Like It’s 1999 keeps popping in my head. There are so many more creative ways to get text onto a page for SEO purposes. My favorite is CSS tabbed content like the news box on Yahoo.

    I have another theory on why they want CSS files. I can’t find the docs, but some university published papers on how, by having algos render the page layout, the SE can determine what is actual content and what is sidebars, headers, footers, and so on.

    I could see Google doing something like this in an attempt to discount links not associated with content. After all, 99% of the links bought through link networks are put in footers and sidebars.

    Just a thought.

  2. stever Says:

    Thank heavens for that, Jeremy. I was sitting at TW thinking that I must be missing something. Even in the days of Like Doves Cry, you could still stick anything you wanted in a scrolling div.

    And, as you say, you now have tab divs, tooltip divs and all manner of ideas to keep whatever content you desire on the page. If indeed you lack the creativity to actually put whatever it is you are convinced you need in full view…

    By the way, John, nice job with the blog which has finally prompted me to delurk – I’ve been meaning to start a grumpy old bastard SEO site, but you fill the balloon-puncturing role far more elegantly than I could.

  3. john andrews Says:

    Jeremy,

    When it comes to implementation, Google is a scientific company and these things will be approached digitally/methodically. Rendering pages is not necessary… while human logic would say “render page – analyze page – see what’s what” you actually do that as research, in advance. Then you characterize the bits that you want to eliminate (sidebars, footers) based on a sample, and you build a model based on that knowledge. Once so armed, a quick check of the external CSS against the model answers the “does it fit the model, or not” question without all that work of rendering. The quality team gives the feedback of whether that model improved the SERPs or not… which leads to model refinement. It’s the same old Google algo development process… baby steps, but smart baby steps.

    I totally agree links are a prime target for such knowledge. Sitewides, navigation, and link lists should weigh less than inline citations.

    Funny thing is, some people keep saying SEO is Dead, yet every time I look Google is adding more data to the process, which means we need to as well.

    Stever: um…thanks? Is this Stever from wmw, btw? Oh, and I’m not old, damn it.

  4. stever Says:

    Does SEM iconoclast sound better? And yes, that’s me…

  5. John Andrews Says:

    “iconoclast” wow that’s great. So much better than “critic”. One guy said I was the anti-seo blog. That was clever.

    It’s just gut talk, really. I’ve no agenda against anybody, but I do admit the cheesy rip-offs portrayed as legit “opportunities” and the blatant guesses presented as plausible theories do irk me. I understand everyone has something to say and a right to say it, but that’s what memes and livejournal are for, not SEO blogs.

  6. IncrediBILL Says:

    One guy? so now I’m “One guy”?

    Besides, Google says don’t cloak CONTENT and I never knew CSS was defined as content. They may want to update their definitions.

  7. Google and CSS Files, It’s Not What Your Thinking | Marketing Pilgrim Says:

    […] So why do they want CSS and JavaScript files? They most likely want JavaScript files to detect JS redirects and other malicious code. With the CSS files, we have to think about what Google’s top priorities are these days. If I were Google, at the top of my list would be defending the integrity of link popularity, as it’s what their algorithm is built upon. By combining the html, css, and javascript, Google can determine what sections of code are content and what sections are headers, footers, and sidebars. It was my theory that Google might actually be rendering the pages on the fly to make this determination, but John thinks they can do it without rendering as long as they have the files (read comments). Either way, I find it much more likely that Google’s using these files to determine how to weight links rather than trying to find hidden text. […]

  8. john andrews Says:

    Sorry Bill. I guess it was two guys http://www.searchenginejournal.com/?p=4006

  9. IncrediBILL Says:

    Well that guy thinks you share SEO secrets.

    Having met you face to face, and liking you, I still suspect you’re sharing SEO misdirects and sending other lame ass losers off into the weeds of SEO.

    Not that I distrust you John, but being a competitive webmaster I take it all with a grain (box, oh hell, a ton) of salt :)

  10. IncrediBILL Says:

    FYI, I think John’s a great guy. However, just sit back and digest his posts before you take action on the information. Sometimes you can take John’s posts at face value, and sometimes they are misdirected. He has a wicked sense of humor and irony, so beware, but they are always intellectually stimulating regardless of where they lead you.

  11. John Andrews Says:

    What’s the matter Bill, getting confused? I know SEO is not your specialty….

    I’d be interested in knowing what you think is “mis-directing”. Please share.

  12. IncrediBILL Says:

    What do you mean SEO isn’t my specialty?

    SEO for eCommerce was my specialty for a few years.

    I retired due to cancer treatment and have lived off my SEO skills for several years now, so I don’t think I did such a BAD job, as I’m still here and not poor. Others may differ in their opinion, but my lights are still on, my booze closet is full, and I’m living fat and happy.

    Maybe misdirecting was a bad term, but you tend to discuss things from a perspective that could lead someone in multiple directions. We’ll leave it at that.