John Andrews is a Competitive Webmaster and Search Engine Optimization Consultant in Seattle, Washington. This is John Andrews’ blog on issues of interest to the SEO community and competitive webmasters. Want to know more?

johnon.com  Competitive Web & SEO
April 11th, 2007 by john andrews

Advanced SEO, Apache Bug, and Google

Continuing on the concept of “advanced SEO”, I am today marveling at the collision of the Apache web server and SEO. It is finally happening, and it is about time.

Continuing on the concept of SEOs Dealing in Minutia, I am also marveling at how this small problem with Apache has gone unnoticed for so long in the SEO world.

First, let me say that Beverly Hills is a very nice place to live. Beverly-Hills is just as nice, and Beverly+Hills is even nicer. However, Beverly%20Hills is not quite as nice, although I certainly understand the urlencoding that led to the sub-standard living conditions. To be fair, Matt Cutts has said it’s better than Beverly_Hills, but my SEO senses tell me that, too, is changing. Of course Beverly Hills and beverly Hills are the same, as are beverly Hills, and even the easily-parsed beverlyHills with or without proper Beverlyhills capitalizations. Oh, if it were so easy. Too bad we can’t all live in easily-parsed locations like BeverlyHills, LaJolla, and NewYorkCity. We wouldn’t need fences. Life would be easy.
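For the minutia-inclined, here is a minimal sketch of where those neighborhoods come from, using Python's standard `urllib.parse`. The `variants` dictionary and its keys are my own illustrative names, and the hyphen and underscore forms are plain string substitutions rather than any kind of encoding:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

place = "Beverly Hills"

# Four common ways a two-word place name ends up in a URL.
variants = {
    "percent": quote(place),                 # space becomes %20
    "plus": quote_plus(place),               # space becomes +
    "hyphen": place.replace(" ", "-"),       # a convention, not an encoding
    "underscore": place.replace(" ", "_"),   # also just a convention
}

for name, value in variants.items():
    print(f"{name:10s} {value}")

# Only the two encoded forms decode back to the original string.
assert unquote(variants["percent"]) == place
assert unquote_plus(variants["plus"]) == place
```

Note that the hyphen and underscore versions cannot be reversed mechanically: nothing in the URL says whether a hyphen was a space or was part of the name.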

But the only reason those places are easily parsed is because they are space-separated place names in the corpus of information that is the index. The URL is the anomaly. Google needs the space-separated HTML out there in order to know that /beverlyhills/plasticsurgeons.html is semantically equivalent to /Beverly Hills/plastic surgeons. The first chicken was indeed born of an egg from other than a chicken. Check Darwin’s notes on that.
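To make the chicken-and-egg point concrete, here is a toy sketch of splitting a run-together URL token using words already seen elsewhere. This is only an illustration of the idea, not a claim about how Google does it; the `segment` function and the tiny `vocabulary` set are hypothetical stand-ins for "the corpus":

```python
def segment(slug, vocabulary):
    """Split a run-together slug into known words, or return None if impossible."""
    slug = slug.lower()
    if not slug:
        return []
    # Try the longest candidate prefix first, then recurse on the remainder.
    for end in range(len(slug), 0, -1):
        head = slug[:end]
        if head in vocabulary:
            rest = segment(slug[end:], vocabulary)
            if rest is not None:
                return [head] + rest
    return None


# Words "learned" from space-separated text; an assumption for this example.
vocabulary = {"beverly", "hills", "plastic", "surgeons", "new", "york", "city"}

print(segment("beverlyhills", vocabulary))
print(segment("PlasticSurgeons", vocabulary))
print(segment("lajolla", vocabulary))
```

The first two calls split cleanly because every piece is in the vocabulary. The last returns `None`, because without the space-separated text there is nothing to parse the URL against.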

But as the advanced SEO positions his search engine friendly URLs in Beverly Hills neighborhoods, he runs into this recognized Apache bug, which reveals that Apache does some escaping of its own before the rewrite engine even gets the URL:

At the early beginning, when the internal request processing starts, apache unescapes the URL-path once. This is not done by mod_rewrite, this happens before mod_rewrite is involved and I think this is also a part of the security concept.

If you are using your rewrite rules in directory context, you have a filename (a physical path, e.g. /var/www/abc) while the per-dir prefix is stripped (so you’re matching only against the local path ‘abc’ if your rules are stored in /var/www/). How would you map some unescaped URL-path to the file system? There’s no way to make the unescaping process optional for a physical path in directory context.

In other words, your Apache rewrite rules have a good chance of choking when you are working with front controllers and virtual URLs that don’t map to archaic web server file system structures like files and directories, especially when you get into international character sets and AJAX/JavaScript limitations. Apache is assuming that files and folders are different from query strings, different enough in the way they may or may not have escaped characters. Apache is unescaping part of your URL outside of your influence, before the rewrite conditions are tested. So if you set your controllers to have search engine friendly names, and perhaps were foolish enough to think you could create them dynamically from actual data by encoding to meet W3C specifications, you have a problem.
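Here is a small model of why an early unescape hurts data-driven URLs. It does not reproduce Apache's internals; it only contrasts "decode the whole path once, then match" with "split first, then decode each segment". The function names and the clinic name are made up for the example:

```python
from urllib.parse import quote, unquote


def build_path(*segments):
    """Encode each data value as exactly one URL path segment."""
    return "/" + "/".join(quote(s, safe="") for s in segments)


def split_after_decoding(raw_path):
    """Decode the whole path once, then split on slashes."""
    return unquote(raw_path).strip("/").split("/")


def split_before_decoding(raw_path):
    """Split the raw path on slashes first, then decode each segment."""
    return [unquote(s) for s in raw_path.strip("/").split("/")]


raw = build_path("Beverly Hills", "Nip/Tuck Clinic")
print(raw)
print(split_after_decoding(raw))
print(split_before_decoding(raw))
```

With the encoded slash in the clinic name, decoding first yields three segments where the data had two. Splitting the raw path first keeps the two values intact, which is what a front controller reading the original request URI can do.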

This is not a show stopper. If you’re building front controllers you are capable of avoiding Apache’s rewrite altogether (and may now recognize this as a necessity), but it sure is inconvenient if you had planned out a site architecture with an eye on a nice, stable, data-driven virtual URL hierarchy using your own front controller in collaboration with Apache’s fast, integrated mod_rewrite.

The bug reports describe this in more detail, and show a few ways to work around the problem if you are so inclined (that is, if you are inclined to revisit your code once Apache gets patched again).

My point? This ain’t beginner-level SEO, friends. Pursuing Google-friendly URLs with a modern web infrastructure, and running into a bug in Apache? THE web server? And not just a bug, but one that demonstrates how Apache’s roots are in file systems, which we left behind a few years ago when we started using CMSs and frameworks. If you’re moving a large dynamic site to a more “search engine friendly” site architecture with semantically useful URLs, your client just got a change order and a work authorization form. And the first order of business on that agenda is not working around the problem. It’s revisiting a cost-benefit analysis. If such an obstacle is too big for your SEO boots, what will you do? Settle for good-enough? I won’t. Certainly not in Beverly Hills.

In 2002 I implemented a method of statically caching a sitemap via use of a primitive front controller with hooks into Apache’s 404 handler. The goal was the same as it is today - user and search friendly URLs with no physical file system correlates, and fast, clean structure. I presented it to a technical audience in 2005, and was asked why it was necessary at all. Even now, 5 years later, working with frameworks and application programming languages far advanced from the old days of PHP3 and 4, we have the same problems: Google gives weight to things it can’t manage properly, and everything’s running on a web server built a long time ago when “things was different”. SEO is about optimizing content as published, so that it ranks in search engines. As long as the web keeps changing, and Google does or does not, SEO will be hard.
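For readers who have not seen the 404-handler trick, the general shape can be sketched as below. This is not the 2002 code, just a guess at the pattern under stated assumptions: the web server calls `handle_missing` when no static file matches, and `render`, `ALLOWED_SLUG`, and the `index.html` cache layout are all hypothetical names chosen for the example:

```python
import re
import tempfile
from pathlib import Path

# Only lowercase words and hyphens, separated by slashes (an assumed URL policy).
ALLOWED_SLUG = re.compile(r"[a-z0-9-]+(/[a-z0-9-]+)*")


def handle_missing(url_path, render, cache_root):
    """Render a virtual URL once, save it as a static file, and return the HTML."""
    slug = url_path.strip("/")
    if not ALLOWED_SLUG.fullmatch(slug):
        return None  # refuse anything that could escape cache_root
    html = render(slug)
    if html is None:
        return None  # a genuine 404
    target = Path(cache_root) / slug / "index.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return html


def render(slug):
    """Stand-in for the data-driven page builder."""
    pages = {"beverly-hills/plastic-surgeons": "<h1>Plastic surgeons in Beverly Hills</h1>"}
    return pages.get(slug)


with tempfile.TemporaryDirectory() as cache_root:
    first = handle_missing("/beverly-hills/plastic-surgeons/", render, cache_root)
    cached = Path(cache_root, "beverly-hills", "plastic-surgeons", "index.html")
    print(cached.exists(), first == cached.read_text(encoding="utf-8"))
```

After the first request, the page exists as an ordinary file, so the next request for the same URL never reaches the handler. The URL still has no hand-made file behind it; the file is a by-product of the cache.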

As far as the Apache/SEO collision thing, it’s not so much this bug as the source of it: Apache protects an underlying file system, and I just don’t have any need for a file system any more.

If you’re into the minutia, here are some links:

http://issues.apache.org/bugzilla/show_bug.cgi?id=23295
http://issues.apache.org/bugzilla/show_bug.cgi?id=32328#c12
http://us2.php.net/manual/en/function.urlencode.php#73641 (see comments)
Me at IBM NYPHP presenting on SEO
Me again at IBM New York PHP on SEO

April 6th, 2007 by john andrews

Does “Advanced SEO” Even Exist?

I recently referred to some aspects of SEO as “advanced SEO” and Jill Whalen commented that she didn’t think any SEO could be “advanced” SEO.

When I sit in a session at PubCon, the “SEO” panelists repeatedly say things like “you need to make your title tags unique” and “your non-www needs to 301 to your www” and I get so bored I choose to sit next to IncrediBill, just to keep things interesting. But when I work with SEO and AJAX, I get a headache from the depth of the challenge.

Is there such a thing as “Advanced SEO”? I think so, and I will go into more detail next week (I will also enhance my SEO AJAX blog post with more detail). In the meantime, what thinks you? Comments are enabled below.

April 4th, 2007 by john andrews

Marketing Experiments free A/B Split Testing Tool: Why Google?

[Update 4/6/07]: The noted artifact has been removed from the web page mentioned. I have no idea why.

Search marketing is an interesting field for sure. Every day we get to spend up to half of our time learning, and it’s almost always goal-oriented learning. I find it immensely satisfying. Of course the more you learn about SEO and Search Marketing, the more you understand where the money is made, where the real opportunities are, and where the tin foil hat is required technical gear, like a good waterproof jacket when outdoors in the Pacific Northwest.

So when I was learning from the Marketing Experiments web site this morning, I started to wonder, why “Google”?

More specifically, I wonder why alt="Google" is on the burst icon for “Free A/B Split Testing Tool” in the upper left of this page?

Marketing Experiments offers an online course for optimizing landing pages, and the email promotion lands on this page here. On that page, the premier highlight is the upper left yellow burst, with the words “Free A/B Split Testing Tool”. Hover and you’ll see the alt text for that image is “Google”. There is no anchor tag for the image, so no click-through.

So why “Google”?

A company that sells itself as an authority in optimizing landing pages, and even offers $595 online courses to teach you to build optimal landing pages, would surely have a purpose for the alt text assigned to the premier highlight on their own sign-up-for-our-course landing page, right?

Everyone who has been doing split testing over the past few years knows that Google has come out with an A/B split testing tool, Google Website Optimizer. It’s in beta but available, and “free” (that’s “free as in beer” when the hospitality suite requires an invitation and an RSVP with full disclosure of your personal details). Anyway, Google’s A/B Split Testing Tool is free. Previously, real split testing tools cost money, and split testing is a service-oriented industry requiring experience and expertise. Google Website Optimizer has the potential to do to the commercial split testing tool market what Google Analytics did to the commercial analytics software market: scare it big time.

So why the “Google” alt text on the Marketing Experiments landing page?

  • nearby text association; a chance to get some alt text spam in there using the word “Google” (this is so small-time I don’t buy it)
  • Marketing Experiments is actually giving you Google’s free optimizer as the “Free A/B Split Testing Tool” (seems equally oddball)
  • ?

Google’s Website Optimizer was initially a tool for working with Google AdWords, but is now promoted as a full multivariate testing tool that will work alongside non-Google analytics software:

Website Optimizer, Google’s free multivariate testing application, helps online marketers increase visitor conversion rates and overall visitor satisfaction by continually testing different combinations of site content (text and images). Rather than sitting in a room and arguing over what will work better, you can save time and eliminate the guesswork by simply letting your visitors tell you what works best.
Free multivariate testing

Website Optimizer is a self-service application designed to give marketers full control over testing. Not only does Website Optimizer - integrated into AdWords - test messages on all site traffic (not just AdWords traffic), but it also works alongside Google Analytics and all third party site analytics packages.

I have no doubt Marketing Experiments is not scared by Google Website Optimizer. They sell services and training, and Google’s entry is likely to be very good for the market as it raises the bar for basic performance standards, and raises the profile of testing like almost no other company could do single-handedly. Which makes me wonder even more, why alt="Google"?

about


John Andrews is a mobile web professional and competitive search engine optimizer (SEO). He's been quietly earning top rank for websites since 1997. About John
