Google Guideline - How spiders view your site

In its "Guidelines for Webmasters" document Google notes that "search engine spiders see your site much as Lynx would".

A web spider is a program that crawls the internet looking for content (see here for more definitions).

Lynx is a web browser that was used in the good old days of the internet, before we had fancy things like mice, graphics, or sliced bread. Put very simply, Lynx is a bare-bones, text-only web browser that supports a minimal set of features. You can download a free copy from this website. Lynx has uses beyond SEO (such as pinging a webpage from a crontab), but for SEO it is mainly used for usability and visibility testing.

If you don't feel like installing new software, there are a number of online spider emulators that will try to show you how a spider views your website. One that I found is available here.

Now that we have the means to see how Google spiders view our website we can have a look at what implications the guideline has for our site.

Firstly we need to realize that search spiders "crawl" through your site by following links. Obviously if a spider is unable to read a link then it won't find the page the link points to.
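To make this concrete, here is a minimal sketch in Python of how a very simple spider might collect links. It is only an illustration and not how Google's crawler actually works: the class name, the function name and the sample page are all made up for this example. It reads plain `<a href="...">` tags and nothing else, so a "link" that only exists inside a script never gets found.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from plain <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only ordinary anchor tags count; everything else is ignored.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return the list of link targets a simple spider could follow."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Made-up page: one ordinary link, and one "link" that only works via script.
sample_html = """
<a href="/about.html">About</a>
<span onclick="window.location='/hidden.html'">Hidden</span>
"""

print(extract_links(sample_html))
```

The script-only link to /hidden.html is never collected, so a spider like this one would not discover that page unless something else links to it.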

Certain technologies can make links invisible to spiders. Google can now index text from Flash files and supports common JavaScript methods. It doesn't currently support Microsoft Silverlight, so you should avoid using it (it's probably a good idea to steer away from Microsoft proprietary formats anyway, no matter how much crazy monkey man screams "developers!" and sweats in his blue shirt).

Google maintains an easy-to-read list of technologies that it supports. You can find it online here.

View your site in a spider emulator or Lynx and make sure that you can navigate through the links. If you can't then there is a good chance that Google can't either.

One way to nudge spiders along is to provide a sitemap. This also helps your human readers. Remember that Google does not like you to have more than 100 links on a page, so if you have a large site, try to identify key pages rather than providing an exhaustive list.
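If you want a quick way to check a sitemap page against that figure, a rough sketch like the following would do. The names `LinkCounter`, `count_links` and `too_many_links` are my own for this example, and the limit of 100 is simply the number quoted above, passed in as a default you can change.

```python
from html.parser import HTMLParser

MAX_LINKS = 100  # the per-page figure mentioned in the text (assumed default)


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry an href attribute (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html):
    """Return how many links the page contains."""
    counter = LinkCounter()
    counter.feed(html)
    return counter.count


def too_many_links(html, limit=MAX_LINKS):
    """True if the page has more links than the limit."""
    return count_links(html) > limit


# Made-up sitemap page with 120 links.
big_sitemap = "".join(
    '<a href="/page-%d.html">Page %d</a>' % (i, i) for i in range(120)
)
print(count_links(big_sitemap), too_many_links(big_sitemap))
```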

Some people argue that if you need a sitemap then your navigation system is flawed. Think about it - if your user can't get to content quickly through your navigation system then how good is your site at providing meaningful content? Personally I like to balance this out and provide sitemaps as an additional "bonus" while still ensuring that all my content is within 2 clicks of the landing page.
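The "within 2 clicks" rule is easy to test if you have a map of which pages link to which. Below is a small sketch using a breadth-first search; the `site_links` dictionary is an invented example site and `click_depths` is a name I made up for the illustration. In practice you would have to build the link map yourself by crawling your own pages.

```python
from collections import deque


def click_depths(site_links, start):
    """Return the minimum number of clicks from start to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site_links.get(page, []):
            if target not in depths:
                # First time we reach this page, so this is the shortest route.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Invented site structure: page -> pages it links to.
site_links = {
    "/": ["/blog", "/about"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-1/comments"],
    "/about": [],
}

depths = click_depths(site_links, "/")
too_deep = sorted(page for page, depth in depths.items() if depth > 2)
print(too_deep)
```

Pages that cannot be reached from the landing page at all will not appear in the result, which is a useful warning sign in itself.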
