Crawler Friendliness
I couldn't think of anything new to write about today, so I decided to
rerun the article I wrote last year on making sure your site is
crawler-friendly. I used to call this "search-engine-friendly" but my
friend Mike Grehan convinced me that the more accurate phrase was
"crawler-friendly" because it's the search engine crawlers (or
spiders) that your site needs to buddy-up to, as opposed to the search
engine itself.
So, how do you make sure your site is on good terms with the crawlers?
Well, it always helps to first buy it a few drinks. <grin> But, since
that's not usually possible, your next-best bet is to design your site
with the crawlers in mind. The search engine spiders are primitive
beings, and although they are constantly being improved, for best
results you should always choose simplicity over complexity.
What this means is that cutting-edge designs are generally not the
best way to go. Interestingly enough, your site visitors may agree.
Even though we SEO geeks have cable modems and DSL, our site visitors
probably don't. Slow-loading Flash sites, for example, may stop
visitors on dialup as well as the search engine spiders right in their
tracks. There's nothing of interest on the average Flash site to a
search engine spider anyway, so it's certainly not going to wait for
it to download!
Besides Flash, there are a number of "helpful" features being thrown
into site designs these days that can sadly be the kiss of death to
a site's overall spiderability. For instance, sites that require a session
ID to track visitors may never receive any visitors to begin with --
at least not from the search engines. If your site or shopping cart
requires session IDs, check Google right now to see if your pages are
indexed. (Do an inurl:yourdomainhere.com in Google's search box and
see what shows up.) If you see that Google has only 1 or 2 pages
indexed, your session IDs may be the culprit. There are workarounds
for this, as I have seen many sites that use session IDs get indexed;
however, the average programmer/designer may not even know this is a
problem.
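To make the problem concrete, here is a minimal Python sketch that spots
session-ID parameters in a URL and produces a clean version of it. It is
only an illustration, not the workaround any particular site uses. The
parameter names in SESSION_PARAMS are assumptions; check what your own
cart or CMS actually appends. Session IDs embedded in the path (such as
";jsessionid=...") are not handled here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; adjust to whatever your own software uses.
SESSION_PARAMS = {"sid", "sessionid", "session_id", "phpsessid", "jsessionid"}


def has_session_id(url):
    """Return True if the query string carries a known session parameter."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return any(name.lower() in SESSION_PARAMS for name, _ in pairs)


def strip_session_id(url):
    """Return the URL with session parameters removed from the query string."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Running has_session_id over the URLs in your own navigation is a quick way
to see whether a spider is being handed a different address on every visit.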
Another obstacle to getting your pages thoroughly crawled
is the use of the exact same Title tags on every page of your site.
This sometimes happens because of Webmaster laziness, but often it's
done because a default Title tag is automatically pulled up through a
content management system (CMS). If you have this problem it's well
worth taking the time to fix it.
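If you want to check whether you have this problem, a small script can
do the looking for you. The following Python sketch assumes you have
already saved your pages' HTML into a dictionary keyed by URL (how you
collect them is up to you), and the function name is made up for
illustration. It uses a simple regular expression rather than a full HTML
parser, so treat it as a rough check.

```python
import re
from collections import defaultdict

# Matches the contents of the first <title> element in a page.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def find_duplicate_titles(pages):
    """pages maps URL -> HTML source.

    Returns a dict of {title: [urls]} for every title used on more than
    one page. Pages with no Title tag are grouped under the empty string.
    """
    by_title = defaultdict(list)
    for url, html in pages.items():
        match = TITLE_RE.search(html)
        # Collapse whitespace so " Acme " and "Acme" count as the same title.
        title = " ".join(match.group(1).split()) if match else ""
        by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

An empty result means every page has its own Title tag.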
Most CMSs have workarounds where you can add a unique Title tag as
opposed to pulling up the same one for each page. Usually the
programmers simply never realized it was important, so it was never
done. The cool thing is that with dynamically generated pages you can
often set your templates to pull a particular sentence from each page
and plug it into your Title field. A nice little "trick" is to make
sure each page has a headline at the top that uses your most
important keyword phrases. Once you've got that, you can set your CMS
to retrieve it for your Title tags (with or without some variation).
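How you wire this up depends entirely on your CMS, but the idea can be
sketched in a few lines of Python. This example assumes the headline is
the first <h1> on the page and that you want the site name tacked on after
a " | " separator. Both are assumptions, as are the function and class
names; your templates may mark headlines differently.

```python
from html.parser import HTMLParser


class FirstHeadline(HTMLParser):
    """Collects the text of the first <h1> element on a page."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.done = True  # ignore any later headlines

    def handle_data(self, data):
        if self.in_h1:
            self.parts.append(data)


def title_from_headline(html, site_name="", fallback="Untitled"):
    """Build a Title string from the page's first headline."""
    parser = FirstHeadline()
    parser.feed(html)
    parser.close()
    headline = " ".join("".join(parser.parts).split())
    if not headline:
        return fallback
    return f"{headline} | {site_name}" if site_name else headline
```

The fallback value is there so that a page without a headline gets flagged
instead of silently inheriting the same default Title as every other page.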
Another reason I've seen for pages not being crawled is that they are
set to require a cookie when a visitor gets to the page. Well,
guess what, folks? Spiders don't eat cookies! (Sure, they like beer,
but they hate cookies!) No, you don't have to remove your cookies to
get crawled. Just don't force-feed them to anyone and everyone. As
long as they're not required, your pages should be crawled just fine.
What about the use of JavaScript? We've often heard that JavaScript
is unfriendly to the crawlers. This is partly true, and partly false.
Nearly every site I look at these days uses some sort of JavaScript
within the code. It's certainly not bad in and of itself. As a rule
of thumb, if you're using JavaScript for mouseover effects and that
sort of thing, just check to make sure that the HTML code for the
links also uses the traditional <a href> tag. As long as that's
there, you'll most likely be fine. For extra insurance, you can repeat
any JavaScript links as plain links inside a <noscript> tag, put text
links at the bottom of your pages, and create a visible link to a sitemap page
which contains links to all your other important pages. It's
definitely not overkill to do *all* of those things!
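To check the <a href> rule of thumb on your own pages, something like the
following Python sketch can help. It sorts a page's <a> tags into links
with a real address and links that only work through a script. The rules
it applies (an href that is missing, starts with "#", or starts with
"javascript:" counts as script-only) are my assumptions about what a
simple crawler can't follow, not a description of how any particular
search engine behaves.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Separates followable <a href> links from script-only ones."""

    def __init__(self):
        super().__init__()
        self.crawlable = []   # href values a simple crawler could follow
        self.script_only = 0  # count of <a> tags with no usable href

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        unusable = (
            not href
            or href.startswith("#")
            or href.lower().startswith("javascript:")
        )
        if unusable:
            self.script_only += 1
        else:
            self.crawlable.append(href)


def audit_links(html):
    """Return (list of crawlable hrefs, number of script-only links)."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.crawlable, parser.script_only
```

A mouseover link that still carries a normal href lands in the first list,
which is exactly the situation described above.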
There are plenty more things you can worry about where your site's
crawlability is concerned, but those are the main ones I've been
seeing lately. One day, I'm sure that any type of page under the sun
will be crawler-friendly, but for now, we've still gotta give our
little arachnid friends some help.
One tool I use to help me view any potential crawler problems is the
Lynx browser tool that can be found here on the delorie.com site:
<http://www.delorie.com/web/lynxview.html>.
Generally, if your pages
can be viewed and clicked through in a Lynx browser (which came before
our graphical browsers of today), then a search engine spider should
also be able to make its way around. That isn't written in stone, but
it's one way of discovering potential problems that you may be having.
The delorie.com site has a search engine spider simulator tool, which
is also helpful in discovering potential spidering problems.
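If you'd like a rough feel for what such a text-only view leaves behind,
here is a small Python sketch. It is not how Lynx or the delorie.com tools
work internally; it simply keeps the visible text, shows image alt text in
brackets, and throws away scripts and styles. The class and function names
are made up for this example.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Keeps visible text and image alt text; drops script and style content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(f"[{alt}]")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script> or <style> is never shown to the reader.
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_only_view(html):
    """Return the page's readable text as a single whitespace-normalized string."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

A page that comes back as an empty string, as an all-Flash page with no
text would, has given this kind of reader nothing to work with.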
If you think your site isn't getting spidered completely, be sure to
rule out the usual suspects (session IDs, identical Title tags, required
cookies, and JavaScript-only links) before jumping to any conclusions.
Jill
Jill Whalen of High Rankings is an internationally recognized
search engine optimization consultant and host of the free weekly High Rankings Advisor
search engine marketing newsletter.
She specializes in search engine optimization, SEO consultations and seminars. Jill's handbook,
"The Nitty-gritty of Writing for the Search Engines"
teaches business owners how and where to place relevant keyword phrases on their Web sites so that they make
sense to users and gain high rankings in the major search engines.