John Andrews is a Competitive Webmaster and Search Engine Optimization Consultant in Seattle, Washington. This is John Andrews’ blog on issues of interest to the SEO community and competitive webmasters. Want to know more?

johnon.com  Competitive Webmastering & SEO
March 29th, 2007 by john andrews

Google Finds Unregistered Whois Data

A few weeks back I was Matt Cutts Watching when I noticed he was participating in a Law Bloggers meeting. Mental note made; move on. Now I see Matt report from his blog about the Bay Area Blawgers meeting. Best bit: Matt notes how the US Copyright Office houses a database of domain names associated with registrations for Online Service provider status. In order to technically qualify for the Safe Harbor provisions of the DMCA, a company or web site must register with the copyright office. That registration costs $80, and includes a place to name the business and list alternative names for the business. In other words, it’s a self-registered list of domain names owned/operated by a legal entity, identified in the public records.

Now Matt didn’t identify it as such… that’s what you need me for :-) Matt simply commented on how Kurt Opsahl of the Electronic Frontier Foundation polled the table about DMCA takedown notices, and pointed out how easy it was to register as an Online Service Provider. But if I were Matt, and that was news to me, I would take a look at that US Copyright web site and when I saw page after page of webmasters listing all of their “other domains” I would say aaaahhhhh…. and fire off an email to a junior Googler to “organize this information”.

If you look around the US Copyright filings for Online Service Provider you will see many, many webmasters listing dozens or more domains under one registration. Some of the big boys also list domains together, while others seem to register single domains. Warner Bros Entertainment, for example, listed entertaindom.com and conspiracytheory.com, which have Whois records assigned to Warner Entertainment, but they also included orgymusic.com on the same registration form, which has a Whois registrant of Astro America, LLC in San Francisco. Nice find for Google, as this allows Google to associate orgymusic.com with Warner Brothers when, based on Whois alone, that was not obvious public knowledge. That’s just one example for you.
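To make the cross-referencing idea concrete, here is a minimal sketch of how one filing could be compared against Whois. It is only an illustration: the data is hand-typed from the example above, and the names `filing_domains`, `whois_registrant` and `group_by_registrant` are my own inventions, not anything Google or the Copyright Office publishes.

```python
from collections import defaultdict

# Illustrative data only, hand-copied from the example in this post.
filing_owner = "Warner Bros Entertainment"
filing_domains = ["entertaindom.com", "conspiracytheory.com", "orgymusic.com"]
whois_registrant = {
    "entertaindom.com": "Warner Entertainment",
    "conspiracytheory.com": "Warner Entertainment",
    "orgymusic.com": "Astro America, LLC",
}


def group_by_registrant(domains, whois):
    """Group the domains listed on one filing by their Whois registrant."""
    groups = defaultdict(list)
    for domain in domains:
        # Domains missing from the Whois lookup are bucketed as "unknown".
        groups[whois.get(domain, "unknown")].append(domain)
    return dict(groups)


groups = group_by_registrant(filing_domains, whois_registrant)
for registrant, domains in groups.items():
    # More than one registrant under one filing is the interesting case:
    # the filing ties together names that Whois alone keeps apart.
    print(f"{filing_owner} filing -> {registrant}: {', '.join(domains)}")
```

Any filing that produces more than one registrant group is a candidate association that Whois alone would not reveal.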

I am sure it would be fun to dig around the site further, but I don’t have time. Besides, Google can make it searchable and cross-reference-able, which would be MUCH easier than the scanned PDFs. I am sure Google could ask for this information direct from the copyright office in electronic form, or perhaps make a trade of indexing for access. In many cases this makes a nice addendum to Google’s efforts to associate web sites to each other and webmasters to web sites (via email address at the very least, names and legal representatives, etc). Right now it looks like mostly big corps and adult web sites, but it seems clear that this is a likely addition to the legal requirements for web publishers and so … another tool for Matt’s Top Secret Spam Fighting VPN-connected laptop computer!

I just hope that when Google organizes this part of the world’s information, it makes it accessible to all of us and not just the competitive teams inside Google. Maybe that’s a good idea for the Copyright office… if you give away our data, require that the indexed form also be freely available to the public? Maybe a Taxpayers Content License or something?

March 29th, 2007 by john andrews

SMX Advanced Seattle Search Marketing Expo

Updated: The SMX Advanced Seattle 2007 Agenda has been updated. One interesting thing I did not see before: Matt Cutts and Michael Gray on the same panel, discussing Google’s use of personalization, considering whether or not it is a threat to webmasters….. Hmm…. that could get colorful. I suppose it’s up to Michael just how colorful it gets?

I also see Greg Bozer and his brother Todd are running a session together (again). They’re presenting two sides of an issue “Is SEO Bull?”, but I only see Greg and Todd. Who’s presenting the other side? Maybe it’s a mystery guest?

Anyway I see that I am ranking highly with this post for “SMX Seattle” and “SMX Advanced” and that doesn’t seem like the best thing for Google users since I didn’t even link out to the conference website initially. So in the interest of helping Matt Cutts and the Spam Quality Team at Google deliver comprehensively helpful SERPs, here are the updated links for the SMX Advanced Seattle 2007 Agenda:

(nofollows in place since I have never met this Danny Sullivan guy… and after last week’s disclosures, best not cause Google to think these are paid links, eh?).

—————————————————————————————————

I have been considering attending Search Marketing Expo (SMX), mostly because it’s here in Seattle. For some reason I thought differently of it than SES, but now that I see the agenda I’m getting deja vu. I feel very much the same way I felt all those times I considered attending SES (and chose not to). I’m thinking it’s pretty expensive at $1200+ for shy of 12 hours of content (over $100 per hour). It also doesn’t seem to really be for SEOs? I see the conference is billed as for “advanced” search marketers:

SMX Advanced is for the experienced search marketer who wants to enjoy sessions conducted at a high-level and continue to stay ahead in the fast changing world of search. If you’re fluent in search marketing, SMX Advanced is where you can converse with others who speak your native language.

I’m not a cheapskate but I am very particular about where I drop $1200 just for registration for a 2 day industry event that schedules concurrent sessions. I initially thought I would like to participate, but the open topics don’t really fit my core interests so that’s out. Am I alone in this view that SMX is not for experienced SEOs?

Looking at the agenda, and describing my thoughts:

Organic Track: Duplicate Content Summit
More and more, SEOs are growing concerned about duplicate content issues. Does syndicating your content in feeds mean you give up being seen as the original source? Is content scraping that’s out of your control going to knock you down in the rankings? In this session, search engines outline how they currently handle duplicate content detection, followed by lots of time for the audience to suggest and explore future directions.

Duplicate Content is a very logical issue, but very well covered over the past few years in SEO world. I don’t agree that “more and more” SEOs are concerned these days, although I will agree that “more and more” SEO web sites write about it these days. SEO web sites write about what gets traffic, and not what’s important (this is billed as an “advanced” conference, remember?). People want to know how to avoid dup content because they have heard it mentioned so much over the years. In my circles, duplicate content is less and less an issue these days (partly because it’s so easy to understand the core issues, and easy to avoid the big problems). I also find duplication less of a problem these days when it comes to impact on rankings (fewer 302 problems, for one thing).

I do agree that the search engine companies could present interesting material regarding how they handle duplicate content, if they are willing to disclose more than what Google already discloses. But in the SMX blurb, they don’t say much about which or how many search engine companies will participate. Plus, they emphasize “followed by lots of time for the audience to suggest and explore future directions.” I really don’t want to hear Ask.com or Microsoft tell me how they handle duplicate content. I’m also not keen on paying over $100 per hour in registration fees to listen to the audience ponder the unpredictable future that exists at the whim of the search engines.

Organic Track: SEO, Meet SMM
SEO has a lot to gain from SMM, social media marketing. Getting your content into the major social media sites does more than provide an initial traffic jump. It can generate links or provide rankings you might not be able to tap into with your own site. In this session, SMM essentials that SEOs need to know.

Now I know social media had to be in there to satisfy the SocialMediaOptimization people (and their prospective customers), but is social media marketing really SEO? It’s search marketing, yes. But search optimization? I think not. It’s a traffic source, and data on the value of that traffic (perhaps relative to organic SEO traffic) could be interesting. Will such data be exposed? Generate links… yes, just like with any web site, getting coverage on a social web site can generate links. But “provide rankings you might not be able to tap into with your own site“? Really? Separate from the impact of those links? Hmm. The SEO in me is thinking “cool… someone’s going to actually discuss the impact of user tracking on SERPs” and then I check myself. Not in an hour, and not in a session entitled “SEO meet SMM”.

Organic Track: Personalized Search: Fear Or Not?
Google’s change earlier this year to make personalized search results more prevalent has many SEOs wondering — is it game over when everyone has their own unique search results? This session looks at the shift, tips on staying high even with personalization and what might come in the future.

Reading this one my gut feelings are reinforced. It sounds like overview coverage of the potential impact of the concept of personalization. Again, the language of the snippet suggests it’s newbie coverage, not advanced search marketer stuff. Of course I could be wrong… that’s why I am writing this on my blog! But if I were organizing a session that was truly advanced coverage, I’d mention the topics so people would know .. like stats on penetration, utilization, effectiveness. But then this is included in the “organic” track (?). If the conclusions suggest that social marketing, encouraging bookmarking, feed listing, etc is a way to maintain rank in the face of personalization (such as might be suggested) then I will be disappointed because I believe social marketing is marketing and not SEO. The “fear or not” part of the session title doesn’t help me either. That’s a title aimed at clients and newcomers, if you ask me.

Organic Track: Penalty Box Summit
Had a site hit the search engine penalty box? In this session, search engines share the latest on how they give you official signs of this, along with re inclusion procedures. The session includes lots of time for audience-driven discussion on penalties and how procedures might be improved.

Now here’s another example where my SEO alarms go off. In SEO, penalty recognition is either very easy or very hard. Those who can definitively identify a not-easy-to-identify penalty situation are very much in demand these days, and they are not likely to give away their secrets at a public session like SMX Seattle. The rest of the penalty situations are fairly easy to see (and correct). “Re inclusion procedures” is clearly newbie territory. And, once again, that mention of “lots of time for audience-driven discussion on penalties” makes me cringe… $100 per hour to listen to do-it-yourself search marketers explain their seemingly unfair penalizations. Nope.

Organic Track: Better Ways To Do The Boring Stuff
Keyword research. Link building. Page titles. Yawn. You know the fundamentals of SEO cold, and c’mon — they aren’t always that exciting. This session gets creative, opens your eyes to new ways to make the drudge work less drudgery.

Okay I see some value in that one, as long as it’s not just a pitch session for commercial SEO products or tools.

Organic Track: Give It Up!
No more secrets time. In this session, our panel of noted SEOs all share some of their favorite and largely overlooked SEO tips. Then we turn to the audience for more sharing. Attendees vow not to blog what’s discussed (on your honor now!). Matt Cutts and his mighty notebook might be barred from the room. Alternatively, any search reps found lurking have to give up a secret of their own or head for the hallway.

What panel? Who are the “noted SEOs”? Really… that is *everything* for an “advanced” session like this, because not many of the talking heads of SEO these days actually reveal meaningful “secrets”. I would expect an inaugural meeting to proudly proclaim the headliners for such an “expert” driven session, but I don’t see it here. Zero-day secrets are a lot of fun and don’t last long, so they tend to be more entertainment than anything else. I guess when it comes to “advanced search marketer” stuff, I’m simply not sold.

Advertising Track: Paid Search Roundtable
Get updated on the latest from the major paid search providers, then fire off questions on paid search topics to the panel of representatives during the ample discussion period.

Not for me. Paid search management is a trade, not a profession. It’s owned by the search engines, and infinitely dynamic because the networks can update their behavior at any increment. This one’s for PPC practitioners.

Advertising Track: Paid Search & Tricky Issues
Trademarks, duplicate listings, quality scores, match types, getting fast support — these are just a few of the tricky issues with paid search. This session covers such topics and solutions to make your life with paid search easier.

Again, a session for those in the PPC trade who don’t already have contacts to answer the questions. Not for me.

Advertising Track: Getting Vertical With Paid Search
As search goes vertical, so too have the ad opportunities. Local search ads and mobile search ads are just two vertical search marketplaces now out there. This session looks at some key verticals with tips and opportunities. Don’t miss out on these new frontiers of paid listings!

For the same reasons that PPC is not interesting to me, this is not interesting to me.

Advertising Track: Pump Up Your Paid Search!
In this session, tips and techniques designed to help pros get even more out of their paid search campaigns.

Again, more for the PPC practitioner. It’s not SEO, and PPC is a profit-managed enterprise, so anything spoken aloud to an audience of hundreds at an industry event like this is not of value to me. If you’re managing your own PPC or learning to be a manager of PPC, or perhaps are already a small PPC manager, fine, but then why is this for the “advanced” search marketer?

Advertising Track: Paid Search: The Giant Focus Group
“I wish….” or “If only they would….” If you’ve thought it, now’s your chance to say it to representatives from the major search ad providers. What should they fix? What new features should they provide? Come lobby for the changes you want, with others to take up your cause, if you’ve got that great idea.

The day I willingly pay over $100 per hour to be in someone else’s focus group, please shoot me in the head.

Advertising Track: Beyond The Majors
Still seeking paid search traffic? Then you might want to look beyond the major players and toward some of the smaller networks out there. This session provides an update on options beyond the majors, as well as tips and strategies.

Of minor importance in my book.

Debate: Is SEO Bull?
Want success in SEO? You don’t need no stinkin’ SEO! Just read the search engine help files, have good content, and the traffic will flow. That’s the bull argument. The no bull side says SEO is indeed a skill set that not everyone has time or aptitude to learn, one that can deliver those targeted visitors. We let the sides go at it through a traditional debate, followed by audience discussion.

I can see this as a great opportunity for the people on the panel, provided the audience is full of potential clients on the fence about SEO (but not for “advanced search marketers” attending in the audience). It also might be very entertaining.. but I can’t tell because I don’t know who is on the panel. I’ll have to pass.

Debate: Is Bid Management Dead?
With it harder to know what the competition is paying plus quality scores that make it difficult to know where you’ll rank, are things shifting away from automated bid management and more toward the human touch? Both sides square off in a traditional debate, followed by audience discussion.

Again, PPC is a managed profit game, so the answer to this is “duh!”. ANY technological approach to optimization will be neutralized eventually if it removes profits from the networks or presents a recognized challenge to the other market players. If you are serious about PPC you pay the best to deploy the best tools today for your cause, and thus manage your risk. If you’re not serious, PPC is a very wasteful spend IMHO, so just manage the budget/risk as you see fit. I can see learning just how bad it is right this moment, or hearing what others are doing, but that is not for me.

To be fair there are two 45 minute sessions still listed as “TBA” which I assume means “to be announced” but also, to be fair, the price increases to the full $1200 on April 2nd which is not far off. I can’t assume those two will be killer sessions that make it all worthwhile, with no such promotion, can I?

So SMX Seattle looks to me like it’s more for clients of SEOs than experienced SEOs, and more for inexperienced search marketers than experienced search marketers, except as an opportunity to present or otherwise impress and recruit clients. Since I am not presenting (and this post is not likely to get me any invitations!), it’s not for me.

Of course I recognize I have not at all considered this for its social networking value (which might be where all of the value is). That said, what does SMX Seattle add that PubCon and SES don’t already provide?

March 28th, 2007 by john andrews

Matt Cutts MFA Splog

I’m not sure why someone thinks that Matt Cutts’ blog content is good splog fodder. SEO and webmastering don’t generally monetize at the top of the earnings charts for PPC, and without attribution to Matt, you don’t get the Matt Cutts name on there for popularity either. Maybe this full-text reproduction of Matt Cutts’ blog, plastered with AdSense ads, is intended to be ironic?

 

 

March 26th, 2007 by john andrews

“Brett Tabke Hates Me”

“I had people come up to me at Pubcon and ask me why I was not a speaker. I told them its cause Brett Tabke hates me.” - Jeremy Shoemoney on his blog.

Wow. I loved that comment! It’s classic for SEO world. Looking past the obvious “come on Brett, can’t we make nice” aspect of the public statement, Jeremy is expressing a common experience around PubCon and webmasterworld. But it might not be hate. I think it’s partly due to geekiness. It’s partly due to ego (on all sides?) but also partly due to the self-conscious independence that Internet entrepreneurialism allows.

When geeks are in charge, they behave differently than regular people in charge. And I don’t mean modern day geeks (those raised in a world where Geek is fashionable). I mean pre- South Park days, when Geek humor was restricted to sysadmin caves, Popular Mechanics was Make and no one laughed genuinely with an unrestrained open-mouth smile except salesmen and Geeks (and the salesmen were smart enough to shower daily and wear clashing plaid outfits for the distracting comedic value). Back in the day of Brett’s rise to tech geek status, Geeks were ugly. Remember “Revenge of the Nerds?” It was actually funny when it came out. Really!

Today I suspect it’s like Brett brought the only baseball and he’ll take it home if you don’t want to play by his rules, even if he wants to change them in the top of the ninth. Does he hate the players who disagree with him? Nah… but maybe he thinks he just doesn’t have to let them play with his ball. Maybe it’s just Brett’s own rendition of Revenge of the Nerds. Maybe PubCon doesn’t need to be as excellent as some say it could be. Maybe PubCon is good enough for Brett.

Why did I write this? In ‘02 I had a less than pleasant experience with Mr. Tabke, and the experience has colored the Pubcon/WMW landscape for me ever since. It might be me realizing my expectations for Brett, but with every new encounter I seem to re-confirm for myself that Mr. Tabke and I simply don’t speak the same language. It’s like the guy at The Whatever Club you simply can’t stand. You’re both members, you share common friends, you have the same interests, you pursue many of the same goals, and it looks like it would be so cool to hang out and do Whatever together but it simply will never work. But that’s not hate.

Now will Shoemoney be speaking at PubCon anytime soon? My bet is yes, eventually, but not because he and Brett make up. It’ll be because Shoemoney continues to do what the market appreciates, the market continues to demand speakers who can cover the important topics with authority and in an entertaining way, and Brett is forced to forgo certain personal preferences for economic reasons. And that’s my personal interest in seeing Jeremy speak at PubCon — as soon as I see that, I’ll know that substance finally matters more than personal connections at PubCon, and then maybe the quality of the sessions will improve!

 

 

March 26th, 2007 by john andrews

SEM Scholarship - Show Me Your Links

The “SEM Scholarship” contest has been announced, and this year the no-cash-value prize is described as “worth more than $10,000”. I’ll refrain from critiquing the assignment of dollar value to things like “One month of industry coaching and training from search marketing expert Andy Beal” and “The winner’s article will be published in the industry print magazine“, but I suppose free registration to a conference is worth something.

This “contest” is open to “the next generation of search engine marketing experts”. Your submission is judged by how much traffic it brings to the host web site, and then the top 5 traffic drivers get judged by a panel of SEM “judges” for things like “content, style, topic and many other factors“. Since no article will draw contest-worthy organic traffic in the short term of this schedule, it’s really about driving traffic the good old-fashioned way — the hard way. Ask for it, beg for it, or do like last year’s winner and buy it.

And so, in anticipation of the usual requests to “link to my article” and “highlight my article” and “mention my article on your blog”, I am calling all wannabe SEM contestants to step forward now, while it’s still early, and identify your own properties from which you are able to provide links in return. That’s right, I am throwing in, for free, apprentice-like training for SEM wannabes. I guess that has a value of…  skip it.

Anyway, that’s how the real world works — you want links? Show me what you’ve got in exchange. But do it now, in advance, because honestly once your article is in the hopper the traffic needs to be flowing if you expect to make the top 5. Tick tock, newbie. Better get crackin’.

What sites do you control, where you could place a back link? Each site has a theme, and a review will reveal the opportunities. I, like most competitive web masters, SEOs and SEM people, have web sites and clients and partners across numerous niche markets and I can use back links from all sorts of web pages. But it’s not willy-nilly, silly.

Good luck to you all, and I sincerely hope you emerge from this contest with at least a realization that yes, yet again you worked hard for someone else, for free, so they could get traffic and attention to their website, at your expense, and you did it for a chance at a prize that has no cash value, but an opportunity value somewhere between zero and $10,000, depending on how much you would already have paid for the rubber throwing stars and other stuff you were going to buy anyway. If that were the case. Which it is not. Anyway. Good luck. Let me know if you want to exchange links ;-)

March 25th, 2007 by john andrews

Where are the Contextual Job Listings?

I write a blog post about PHP, and in the sidebar should be a link roll of PHP jobs.

I write about SEO and in the sidebar should be a link roll of internet marketing jobs.

If I were hiring a web designer, I would target a beautifully rich long tail of attractors for my job listing. I imagine I would appreciate a system that combined these automagically, according to some smart ruleset. I bet, given the vast experience of the contextual advertising engines and the relative uniformity of job offerings, that it would be cake. I bet it would be amenable to optimization, too.

Job click thrus convert as resumes, or at least a conversion lead better than most. And job link click thrus don’t have to go direct to specific jobs… they can go through a lead refinement filter, which, of course, would be like an MFA page, helping to land the job prospect onto the optimal match of a job. “So you like the PHP job, did you see these PHP + MySQL jobs, and these PHP + Perl jobs? Which do you like best (choose one or more…)” Taguchi doesn’t apply, cause each lead is unique, and so why rely on initial page context, trying to match perfectly when job seekers expect to seek anyway? That’s why PPC doesn’t pay for individual jobs. Instead, use contextual ads to draw them in… but not into a monster job site. Draw them in to iterate the contextual job text link (MFA) system recursively… it doesn’t break any rules (if the initial job exists), and lets the seeker navigate the way nature intended.

Job placement recruiters get what, 6-10% of the first year’s salary at least?

So in 2007, where are the contextual ads for jobs?

March 23rd, 2007 by john andrews

I expected Matt to be Smarter than That

Matt over at Wordpress posted “Selling Links” which goes across syndication with this starting paragraph:

“Let’s face it, we’re selling links here. Call it ‘buzz’ all you want, but it boils down to selling links. That skews Google’s index and they’ve come out against that quite publicly. If we’re all given the freedom to disclose in our own manner, we’re a moving target. If we’ve all got disclosure badges everywhere, it’s easy for them to penalize/ban us all.”

Of course if you click thru (and you would click thru), that’s not Matt speaking but a quote from the Pay Per Post blog that Matt subsequently criticizes (and he links to their blogger comments page, as if to avoid passing any link power to their blog. How sad. Hence the nofollow here). Matt Mullenweg, the guy who (desperately?) spammed Google with commercial doorway pages when he was down on his A-list luck (and got caught) is now taking a high-road and saying that link selling is bad for the web. Really?

Sorry, but I thought more of Matt than this. Sure, the conflict of interest is obvious (wordpress.com can’t monetize very well if a middleman like PPP is monetizing out from under him), but to brand it as a benevolent action is below where I expected Matt to be. The web is commercial. Matt is commercial. He’s good at it, and naturally he shouldn’t allow PPP on wordpress.com. (I also noticed a BlueHost advertisement on the PayperPost site that says “Move Off of Wordpress.com”, so clearly these guys are competitors). But really, Matt, to pander to the idealistic half of the audience so blatantly… you alienate the rest, you know? And you need the rest, don’t you? Maybe you don’t think you do. In my book, that’s a bad sign.

It wasn’t long ago we read about Matt Mullenweg spamming Google with 160,000 doorway pages on topics like “debt consolidation” and “asbestos”:

Mullenweg hosted at least 160,000 pieces of “content” on his site wordpress.org which use a cloaking technique to hide keywords such as “asbestos”, “debit consolidation” and “mortgages”. Mullenweg was paid a flat fee by Hot Nacho Inc., which creates software for search engine gamers to use. It’s been dubbed “Adsense bait” - Adsense is Google’s keyword-based classified advertising service…Mullenweg employed “negative positioning”, which uses a CSS directive to place the text offscreen, out of sight of the user, but where search engines can still read it….

This seems way hypocritical. I am reminded of the days when Steve Case and company spammed “the Internet” with AOL commercials. The Internet was non-commercial, and the idealists gathered with pitchforks to “defend the Internet” against commercialization. Who did more for the development of the Internet and web, AOL or those idealist defenders of usenet?  Painful question to answer, but the answer is the truth. Matt is painting the market makers as evil-doers, as Matt quietly protects his share of the market. Puh-lease.

 

March 22nd, 2007 by john andrews

SEO for AJAX

Subtitled: “Meet my 5 sons: all named George” or “Advanced SEO”

George Foreman has five sons, all named George. If you were to visit George and he were to introduce you to his sons, he would present each one in turn, call them by their name “George” and then maybe add a second label like (“my second George” or “the smart one” or whatever). He said he named them George because he wanted them to always know who their father was. They are not identical: they just have the same name. You can’t tell them apart by their names, but with the added second label or their looks, you could easily identify them. But not if you were blind. You’d need more info, right?

So how do you name your web pages?

Most webmasters know not to use frames because the frame set URL remains in place as each framed page is displayed. The result is like naming all of your kids George - they all have the same URL. A search engine will see just one “name” and list the one URL as a single page in the search index. Bad idea.

Most webmasters these days also know not to use a URL which is almost exactly the same for multiple pages, differing only by a long “id” value. If it isn’t sufficiently distinct to the search engine spider, it won’t be any different from The Other Page Also Named George. Most web masters are also keeping page titles distinct, as a second label (“the smart George”). More advanced web masters have learned to recognize overly “templatic” web pages, which can look nearly identical to other pages, and suffer a similar identity crisis to nearly blind search spiders (less severe consequences, but still sub-optimal).

But what about modern web applications deploying asynchronous user interface technologies like AJAX, where on-page content (or in-page content) can vary depending on context, not just URL? If the middle paragraph updates via asynchronous server calls, without a page refresh, then the URL hasn’t changed and the new resource (or “view”) becomes just another Web Page Named George. Sure it has different content, but it doesn’t exist as a separate entity according to the search engine labels (URL and page title). In other words, it won’t get indexed.
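To make the “Page Named George” problem concrete, here is a small sketch that groups a site’s views by the only labels a spider sees, the URL and the page title, and reports the ones that collide. The view records and the `pages_named_george` helper are hypothetical, made up for illustration; a real audit would have to pull these from your own application.

```python
from collections import defaultdict


def pages_named_george(views):
    """Return {(url, title): [view names]} for labels shared by several views."""
    groups = defaultdict(list)
    for view in views:
        # A spider can only tell views apart by URL and title.
        groups[(view["url"], view["title"])].append(view["name"])
    return {label: names for label, names in groups.items() if len(names) > 1}


# Hypothetical views: two AJAX states share one URL and one title.
views = [
    {"name": "php-jobs", "url": "http://example.com/jobs", "title": "Jobs"},
    {"name": "perl-jobs", "url": "http://example.com/jobs", "title": "Jobs"},
    {"name": "about", "url": "http://example.com/about", "title": "About Us"},
]

for (url, title), names in pages_named_george(views).items():
    print(f"{url} ({title}) is shared by: {', '.join(names)}")
```

Every group this reports is content that exists for visitors but, by label, is just one more George to the search engine.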

In the current search world, it is essential that each view full of content that you think defines a valuable, site-defining and user-facing page of content on your site be indexed as a unique web resource. If you collate content to create such views so they are specifically relevant for visitors in a specific context (such as we do when we optimize for SEO… matching views of content to referred search engine visitors), then your hard work collating will be for naught if it doesn’t get labeled and indexed.

So what to do?

Many SEO Advice web sites suggest that you generate a second set of static views (or dynamic, but URL-unique views) to be fed to search engines. I don’t think that is a very creative solution. The Sitemaps protocol is also a way to define views to reflect your pre-defined collections, yet archaic anti-cloaking guidelines (like the one at Google) require that those URLs deliver the same content to both users and search spiders. That makes the sitemap little more than a representation of static content. Again, that’s not very creative. For truly asynchronous, data-based content, it’s also not very practical. There are simply too many possible views.

If you are thinking this issue through analytically, and seeking a practical solution that enables deployment of asynchronous visitor views while simultaneously allowing a statically-labeled set of entry points to exist such that a traditional spidering of that content enables a contextual indexing of the content by labeled URL, go ahead and call yourself an SEO. If you get it done in a way that requires a substantial amount of extra work for a web designer, like maybe creating a second static web site just for search engines, then call yourself a second-tier SEO and try to increase your R&D time or training budget. You’re stuck in old-skool SEO. You’re going to need to update your skills sooner than you might like.

But if you have already mapped out your site’s user experience, and documented the intent and strategy that drives the UI designers to create the asynchronous interfaces deployed, then rest assured you are doing well. You understand the reasons why page sections update asynchronously, the underlying data structures in use behind the scenes, and the link between view and content. You already know how you can use that to define a set of defining views, to be sitemapped and exploited as landing pages. A second set of static pages? You betcha. A lot of extra work? On the contrary. It will probably be defining work for the marketing team, worthy of the effort even without search engines requiring it. It won’t require top-dollar designer and developer hours, but merely DHTML web designer hours. Once you get the basic infrastructure in place to support your “sitemap”, you can endeavor to optimize the site independent of the UI designers, the way nature intended :-)
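Here is a minimal sketch of what that “sitemap” infrastructure might look like, assuming the marketing team has already written down the defining views. Everything named here is hypothetical (the `DEFINING_VIEWS` list, the `/views/<slug>/` URL pattern, the example titles); the only thing taken from the outside world is the public Sitemaps protocol namespace. The point is simply that each view gets its own URL and its own title, so none of them ends up as another Web Page Named George.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical defining views: each one is a named state of the
# asynchronous interface that deserves its own landing page.
DEFINING_VIEWS = [
    {"slug": "red-widgets", "title": "Red Widgets"},
    {"slug": "blue-widgets", "title": "Blue Widgets"},
    {"slug": "widget-repair", "title": "Widget Repair Guide"},
]


def view_url(base, view):
    """Build the entry-point URL for one view (assumed URL pattern)."""
    return f"{base}/views/{view['slug']}/"


def check_unique_labels(views):
    """Raise ValueError if two views share a slug or a page title."""
    for key in ("slug", "title"):
        values = [view[key] for view in views]
        if len(values) != len(set(values)):
            raise ValueError(f"duplicate {key} among defining views")


def build_sitemap(base, views):
    """Return a Sitemaps-protocol XML string listing one URL per view."""
    check_unique_labels(views)
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for view in views:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = view_url(base, view)
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap("http://www.example.com", DEFINING_VIEWS))
```

Each listed URL would still have to serve the same view to visitors and spiders alike; the sketch only covers the labeling half of the job.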

Now if you find yourself working with a JavaScript team that has the patience required to document that interface, or one that actually had a strategy in place for the user experience before they built it, or one that is working with a data structure actually designed to support that strategically-defined user experience, consider yourself extremely lucky.

If not, there’s still hope. If you have a good marketing team, you might end up delivering some clue packages to the web too dot oh development prima donnas, along with a proper specification for how they might try and get it a little more right with the next refactoring, in order to keep the SEO bill less than twice the development bill (or less than the first year’s PPC bill). I’m just sayin’, thaz all.

March 21st, 2007 by john andrews

SEO: Not Ready for Video

A bunch of SEO types got together to demonstrate the poor quality of video blogging in SEO land today. They were right: varying audio levels, sync problems, generally horrible production values all around. One was obviously scripted; the other two obviously not scripted. Not sure what the plan is besides link bait for a slow news day, but I’m guessing from the comments it’s the start of an I-do-better-video-than-you meme. Cool. I look forward to round two.

March 19th, 2007 by john andrews

SEO as International Minutia Dealer

I spend a lot of time on the minutia of web publishing. I also work for International clients. Since most of what I do is minutia for International clients, I recently referred to myself as an “International Minutia Dealer” and the poor lady next to me in the group acted shocked and said “I think we have enough war in the world, thank you very much”. Whatever.

So one of the minutia I deal with is the trailing slash. When dealing with frameworks, front controllers, and other-than-Apache web servers, it can be tough to get absolute control over trailing slash minutia. If you don’t know what I mean, consider this SEO quiz:

Q: How many web resources are represented by the following list of URLs, according to Google?

  1. http://www.example.com/
  2. http://www.example.com
  3. http://example.com/
  4. http://example.com
  5. www.example.com
  6. example.com
  7. www.example.com/
  8. example.com/
  9. www.example.com/index.html (or index.php or index.asp, your web server’s default)
  10. example.com/index.html
  11. http://www.example.com/index.html
  12. http://example.com/index.html
  13. http://www.example.com/index (your web server default is still index.html)
  14. http://www.example.com/index/
  15. http://example.com/index/
  16. http://example.com/index
  17. www.example.com/index/
  18. example.com/index

See what I mean about minutia? That’s 18 versions so far, and I left out some biggies. So what’s the answer? How many of those are unique URLs according to Google? And if we had another search engine today, what would the answer be for that search engine?

How about putting this into a little context. What if you have a page at www.example.com/index.html and it is very, very popular. It has 2 million back links to that exact URL, and 2 million more to www.example.com, half with a trailing slash (www.example.com/) and half without (www.example.com). Your boss implements a new site, and your job is to migrate the site to the new server. Oh, and the new server uses a front controlling system such that all urls will look like www.example.com/index/.  You have been reminded that “you were not hired for your web development or web design skillz, so stay out of that kitchen but because you know SEO make sure they don’t screw up and we don’t lose any rank, ok? It’s your only responsibility, so make sure it’s done right.” Nuff said.
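To make that migration concrete, here is a minimal sketch of the redirect decision, assuming the old home page is meant to end up at http://www.example.com/index/ on the new server. The names `CANONICAL_HOST`, `LEGACY_HOME_PATHS`, `NEW_HOME_PATH` and `redirect_target` are mine, not from any framework, and the list of legacy paths is only the handful of variants mentioned above.

```python
CANONICAL_HOST = "www.example.com"

# Assumed legacy spellings of the home page that carry the backlinks.
LEGACY_HOME_PATHS = {"", "/", "/index.html", "/index"}

# Assumed home of that page under the new front controller.
NEW_HOME_PATH = "/index/"


def redirect_target(host, path):
    """Return the URL to 301 to, or None if the request is already canonical.

    Host and path are corrected together so that any legacy URL reaches
    its final destination in a single hop rather than a redirect chain.
    """
    host = host.lower()
    new_path = NEW_HOME_PATH if path in LEGACY_HOME_PATHS else path
    if host == CANONICAL_HOST and new_path == path:
        return None
    return f"http://{CANONICAL_HOST}{new_path}"


if __name__ == "__main__":
    for host, path in [
        ("www.example.com", "/index.html"),
        ("www.example.com", "/"),
        ("example.com", ""),
        ("www.example.com", "/index/"),
    ]:
        print(host + path, "->", redirect_target(host, path))
```

Fixing host and path in one step is a design choice: the 2 million links to www.example.com/index.html and the 2 million to the bare domain all land on the same final URL with one 301 each. Note the sketch treats every non-canonical host as a variant of this site, which is only safe if the server answers for no other domains.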

That simple example of minutia drives an industry of highly-prized SEO consultants that work diligently while regular SEO “consultants” argue about whether SEO or PPC is the new sliced bread of marketing, or whether or not SEO is a Ron Popeil “set it and forget it” task.

Matt Cutts almost addressed this issue (from a “Google perspective”) last year on his blog. He was trying to clean up some of the mess around Google’s handling of redirects and Google’s use of the word “canonicalization”.

Those of us who suffered through courses on Linear Systems Control Theory in college know that a canonical form is an arrangement of a system such that you represent it with the fewest parts (yet it is fully represented). Examples of canonicalization are everywhere, even if unlabeled. Here’s another one:

The car has wheels and wheels have wheel covers. If you need to draw (represent) a car, you need to draw it with circles for wheels, because if you drew the car with no wheels people would not say it was a car, but would probably say “it looks like a car with no wheels”, or “it looks like a car that has no wheels”, etc. Draw the wheels and everyone will say “it’s a car”. Did you need to draw the wheel covers? No. The canonical form of that representation (in this very specific example) could be the car body plus circles for wheels.

Google is staffed by a bunch o’ engineers who have probably all taken control systems theory or higher math and so when they wanted to label the idea of “how do we identify the web site resource without all the extra redundancy that might be present in default file names, extensions, meaningless subdomains like www, and trailing slashes”, they probably started simplifying with lingo like “what’s the canonical root”. Geek talk. Of course my example is a physical one for the non-engineers. Systems theory is not about physical parts like cars and wheels but mathematical equations and representations, which can be mixed and blended as needed to come up with different forms (such as canonical forms). There are actually many kinds of canonical forms. Go figure.

By the way, “canonical” is also sometimes defined as “according to the rules” (or canon), but since in this case there are no rules to follow, and the Google people were clearly trying to “figure this out” for a best way, I doubt that’s the source of the use.

Anyway Matt said this about the trailing slash:

Q: What is a canonical url? Do you have to use such a weird word, anyway?
A: Sorry that it’s a strange word; that’s what we call it around Google. Canonicalization is the process of picking the best url when there are several choices, and it usually refers to home pages. For example, most people would consider these the same urls:

    * www.example.com
    * example.com/
    * www.example.com/index.html
    * example.com/home.asp

But technically all of these urls are different. A web server could return completely different content for all the urls above. When Google “canonicalizes” a url, we try to pick the url that seems like the best representative from that set.

Q: So how do I make sure that Google picks the url that I want?
A: One thing that helps is to pick the url that you want and use that url consistently across your entire site. For example, don’t make half of your links go to http://example.com/ and the other half go to http://www.example.com/ . Instead, pick the url you prefer and always use that format for your internal links.
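That advice about consistent internal links is easy to check mechanically. What follows is a minimal sketch, assuming www.example.com is the preferred host and example.com is the only other spelling in play; `PREFERRED_HOST`, `SITE_HOSTS`, `LinkCollector` and `inconsistent_links` are hypothetical names for illustration. It pulls the `href` values out of a page and flags any absolute internal link that uses the non-preferred host.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

PREFERRED_HOST = "www.example.com"                # assumed preference
SITE_HOSTS = {"www.example.com", "example.com"}   # assumed host variants


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def inconsistent_links(html):
    """Return internal links that point at a non-preferred host.

    Relative links have no host and are left alone, since they inherit
    whatever host the page itself was served from.
    """
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        host = urlsplit(href).netloc.lower()
        if host in SITE_HOSTS and host != PREFERRED_HOST:
            flagged.append(href)
    return flagged


if __name__ == "__main__":
    page = (
        '<a href="http://example.com/">home</a>'
        '<a href="http://www.example.com/about/">about</a>'
        '<a href="/contact/">contact</a>'
    )
    print(inconsistent_links(page))
```

This only audits the host part of each link; trailing slashes and default file names would need their own checks.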

A commenter apparently is also a Minutia Dealer because he followed up with this good question:

Thanks Matt for the continued explanations and advice about this stuff. I have been reading up on Canonical issues for a while (suffering from one myself due to not knowing about them before hand and not using 301 protection), I have set up a 301 and on server name resolution so that all requests for the main index page go to www.theurl.com/ (the trailing slash is always added anyway).

Google still shows www.theurl.com/ and www.theurl.com/index.php in the serps and is docking my PR due to it. Will the 301 be picked up by the main googlebot and remove the index.php reference from the results in due course?

Also I can’t fathom out why this sort of thing isn’t under the webmaster’s control? If I know that the result www.theurl.com/index.php is WRONG then there should be a system to remove JUST that reference? Is this impossible?

As far as I know that follow-up question remains unanswered, but that’s not surprising to me. Google has incorporated some automated “canonicalization” checkers which are pre-programmed to handle these minutia according to “the Google Algorithm”. Matt suggested that the examples are “technically different” but also says Google tries to pick the right ones to represent the web resource. More recently, Google people (was it Matt again?) have said that Google is “pretty good at that stuff” when discussing this very issue (I have to re-locate that reference; I don’t have it handy ’cause it just doesn’t matter to me). Of course they are. But that’s not the question. The question is, what does Google do?

The commenter followed the rules and got stuck - he’s got duplicate entries in the index for the same web resource, due to the way Google spidered and indexed his site. He can’t remove the bad one. He can’t fix the problem.

As a professional SEO (International Minutia Dealer) I want to exercise 100% absolute control over how Google spiders, indexes, and serves up my content. I don’t want to “try and see”, and I don’t want to “find out” on a live site. And when Google changes The Algo, I want Google to change it correctly, not just from TheOldGoogleWay to TheNewGoogleWay. I am all in favor of Google getting better over time, but very much against Google just getting “different” over time. I will strive to be TheBest, and get all of my minutia in a row, and I want Google to evolve into TheBest and reward my orderly, technically-correct minutia with error-free, predictable spidering, indexing, and serving in the SERPs. Is that too much to ask?

Of course I recognize the advances Google has made with Webmaster Console (cue Vanessa?). Yes I know there is now a “www or non-www” option in Google sitemaps. That’s not the answer, however. The questions are there even when there are only a few, relatively advanced people asking them. Those questions should be answered. It is not enough to answer the most basic ones once the majority of people are encountering them (think www vs. non-www).

So about that SEO Quiz… what’s the right answer? The first twelve are almost always the same resource, although they do not have to be. Does Google assume they are? Today? The next six are a bit odd, but today’s frameworks make them more common and less unusual. Are they unique according to Google today? Will they be tomorrow? Does anybody know? Can anybody know?
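One way to pin down an answer is to write the policy out and count. The sketch below is my own hypothetical policy, not Google’s algorithm: assume `http://` when no scheme is given, ignore a leading `www.`, treat an empty path as `/`, drop an assumed default document name (`DEFAULT_DOCS`), and ignore a trailing slash on deeper paths. The function name `canonical_key` is likewise made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed default documents for the web server.
DEFAULT_DOCS = {"index.html", "index.php", "index.asp"}

QUIZ_URLS = [
    "http://www.example.com/", "http://www.example.com",
    "http://example.com/", "http://example.com",
    "www.example.com", "example.com",
    "www.example.com/", "example.com/",
    "www.example.com/index.html", "example.com/index.html",
    "http://www.example.com/index.html", "http://example.com/index.html",
    "http://www.example.com/index", "http://www.example.com/index/",
    "http://example.com/index/", "http://example.com/index",
    "www.example.com/index/", "example.com/index",
]


def canonical_key(raw):
    """Reduce a URL to a comparison key under the hypothetical policy."""
    if "://" not in raw:
        raw = "http://" + raw           # bare forms: assume http
    parts = urlsplit(raw)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]                 # treat www as redundant
    segments = (parts.path or "/").split("/")
    if segments[-1] in DEFAULT_DOCS:
        segments[-1] = ""               # /index.html becomes /
    path = "/".join(segments)
    if len(path) > 1:
        path = path.rstrip("/")         # /index/ becomes /index
    return host + path


def group_urls(urls):
    """Map each canonical key to the list of URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    for key, members in group_urls(QUIZ_URLS).items():
        print(key, len(members))
```

Under these assumptions the 18 quiz URLs collapse to two resources: the first twelve onto `example.com/` and the last six onto `example.com/index`. Change any one rule (say, keep the trailing slash as significant) and the count changes, which is exactly why the question matters.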

It seems Google figured out the domain fairly well, so issues of http:// or not are non-issues, but https:// and http:// are different as they should be *unless* you mix them yourself and then I suspect https:// is fair game for indexing even if there is a robots exclusion. Google does a fair job of picking www or non-www from the way people link to you, the way you use it yourself for internal linking, and your preferences if you use webmaster console (in that order if you ask me, in reverse order based on my read of Google representatives’ suggestions). They still get it wrong sometimes. I still strongly recommend a hard 301 to your preferred default, and a very careful eye on your in-linking.

What about the harder question of trailing slashes on deep resources? It’s a toss-up. Keep in mind you wield some influence over Google by the way you self link and the way others link to you, but Matt Cutts said he thinks people commonly type in domains with trailing slashes so I still worry. I suspect there are more important issues on Google burners right now, and competitive SEO types will continue to build test sites and learn for themselves how TheGoogleAlgorithm works, for themselves and for their clients.



about


John Andrews is a mobile web professional and competitive search engine optimizer (SEO). He's been quietly earning top rank for websites since 1997. About John
