
By Mark Wilson
I am the creator of TopXML. I am available for international and local (Australia) contracts. I am a Solution Architect/Business Analyst. I have worked in IT in several countries (NZ, Australia, South Africa, UK) building and training teams for government and very large non-governmental organizations. I am ex-Microsoft Consulting Services. I wrote the first book on Microsoft XML published in 2000 called XML Programming with VB and ASP. Most recently I have been building tools for the SEO industry. Ask me for a 37 point SEO health-checkup for your website.
First Posted 04/10/2008

SEO - How does a search engine view your website page?


This post covers the way in which a search engine views your page, how they may evaluate it and how you can assist them in correctly understanding your content and its value.

NOTE: Read the dos and don'ts of SEO according to Google.

Ask yourself two questions. First, are the various search engines fully indexing all your available content? Second, if they are, and your excellent content is still not ranking well, is the engine able to comprehend your content and evaluate its value properly?

There are some simple things that can stop a full index or stop the engine from understanding your content. The goal of this article is to explain some of those factors in plain English.

These are just a few of the factors covered in my free "36 SEO design secrets" PDF:
36_seo_design_secrets.pdf 

Is your entire site indexed?

If your content can't be found, it won't be indexed. If it's not indexed, it can't be read or understood by the engine, and it won't rank. You most likely have a good idea of how much content you have. To find out how much of it is indexed, use this tool: SEO Chat - Indexed Pages. If the engines have fully indexed you, then you can pat yourself on the back: your website is clearly navigable by an engine spider. But don't stop here, read on...

Does Google understand your content?

Do you have excellent content which doesn't rank well? Then the problem may be that the engines don't understand your content.

Things that block a full index

Some customers design for humans only; they don't design for search engines. They use CSS, Ajax, DHTML, JavaScript and images extensively, and they get a fabulous-looking website which is impossible for Google or other search engines to understand.

Have a look at how your website appears to a search engine. Use this tool: SEO Chat - Spider Simulator. Let's look at the results for the topxml.com default front page:

There are two very important things to understand:

  1. Engines see every page as text
  2. and as links (internal, external/outbound and inbound)

This means that if your content is not visible as text (or as title/alt attributes) then it won't be understood. If your links are inside a JavaScript file, then they can't be followed and your content won't be discovered.
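To make the "text and links" idea concrete, here is a minimal sketch in Python of what a text-only spider might pull out of a page. It is only an approximation of what a tool like the Spider Simulator reports, not how any real engine works; the class name SpiderView, the function spider_view and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects roughly what a text-only spider could read from a page."""

    def __init__(self):
        super().__init__()
        self.text = []    # visible text fragments, plus alt/title attribute text
        self.links = []   # href values found on <a> tags in the markup
        self._skip = 0    # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        # alt and title attributes count as readable text
        for name in ("alt", "title"):
            if attrs.get(name):
                self.text.append(attrs[name])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only runs
        if not self._skip and data.strip():
            self.text.append(data.strip())


def spider_view(html):
    """Return (text_fragments, links) for an HTML string."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return parser.text, parser.links


# Hypothetical page: one link in the markup, one written out by a script
page = """
<html><head><title>Acme Widgets</title>
<script>document.write('<a href="/hidden.html">Hidden</a>');</script></head>
<body><img src="logo.png" alt="Acme logo">
<p>We sell widgets.</p>
<a href="/products.html">Products</a></body></html>
"""
text, links = spider_view(page)
print(text)
print(links)
```

In this sketch the link that only exists inside the script never shows up in the link list, which is the same problem described above for navigation built in JavaScript.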

I have seen one local company use a single image as their default front page. No text. And of course the result was that I couldn't find them in Google. I just checked: they have removed the single image and replaced it with a SWF file. In the view source of their page they still have absolutely no words that give Google a hint as to what this page and the website are all about.

But it's not all bad news. JS and CSS are text files and can be read by engines (SWF is a binary format, which makes it harder still). But the problem remains: if the engine cannot figure out how to navigate the information that is presented, then it can't, and it won't.

Menus can be OK for both humans and engines. I have seen menus that do not rely on DHTML/JS files and are highly suitable for engines to read. The best example was one I saw last month: the menu was in the page as plain text, and the animation and sexiness were provided by the CSS file. This is clever because, if you remember the results of our spider simulation, the page will be seen as text and as links, and this design is ideal for that. The engine sees the text and the URL right there in the page, and the animation stuff is kept externally in CSS.

Shortcut to get a full index

If you don't have the time and money to re-engineer your entire website (organising it around folders/categories, removing JS files and making navigation suitable for the engines) then there is one quick way to ensure a full index: build a sitemap. (Yahoo countered with a URLList at the time but it didn't scale. Yahoo now says it supports sitemaps.) What is a sitemap? It's a single XML file (child files are encouraged for massive sites) built according to the sitemap protocol. You simply output a complete list of all the pages on your website, and the engine reads and indexes each one.
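As a rough illustration, a basic sitemap can be generated with a few lines of Python. This is a minimal sketch that assumes you already have a list of your page URLs; the function name build_sitemap and the example.com addresses are placeholders. It emits only the required loc element for each page, using the namespace defined by the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org)
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs; in practice this list would come from your CMS or database
pages = [
    "http://www.example.com/",
    "http://www.example.com/php-xml/intro.html",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The protocol limits a single sitemap file to 50,000 URLs, which is where the child files come in: a massive site splits its list across several files and lists them in a sitemap index file.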

Ensuring an engine correctly understands your content

The key to good design is to learn to think like an engine. Many people design so that users will enjoy the experience of visiting their site, but do you design so that a search engine will easily comprehend what it is seeing?

With the spider simulator (above) we saw that an engine will see the text (including alt/title attributes and anchor text) and it will see the URLs. When it visits a new page it has no idea what is on that page. It will use some criteria to determine the category of the content:

  • text on the page
  • text on pages linking to this page
  • text on pages this page links to
  • alt and title text on the images of this page
  • anchor tags/text on the links on this page
  • anchor tags/text on the links on other pages that point to this page
  • text in the html head section of this page

Are you getting the feeling it is all about text and links? If so, then you have grasped the most important issue. The engine will use the available information to try to determine what your page is all about. If it is confused about the topic of the page, then the page will rank poorly. If it can determine with a high degree of confidence that the page contains exact and high-quality information, then your page will rank highly.

One small note: if other highly trusted pages link to your page, then it is assumed that you are high quality too. This is based on trust logic: a friend of a friend is my friend too. In other words, if I trust you, then I will extend trust to anyone you like and respect. I automatically give them some of my trust, because you do. (This criterion used to be publicly called PageRank and it was endlessly gamed by people, but now it's internal to Google and it's rumoured to be called TrustRank.)
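The "friend of a friend" logic can be sketched as a simplified PageRank-style calculation, in which each page repeatedly passes a share of its own score to the pages it links to. This is an illustration of the publicly described idea only, not what Google runs internally; the function name trust_scores, the damping value of 0.85 and the page names are assumptions chosen for the example.

```python
def trust_scores(links, damping=0.85, rounds=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {p: 1.0 / n for p in pages}  # everyone starts with equal trust
    for _ in range(rounds):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's trust evenly among the pages it links to
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outbound links spreads its trust over all pages
                for t in pages:
                    new[t] += damping * score[page] / n
        score = new
    return score


# Hypothetical link graph: three pages vouch for "trusted-hub",
# which links to "your-page"; "other-page" has one obscure page linking to it.
links = {
    "a": ["trusted-hub"],
    "b": ["trusted-hub"],
    "c": ["trusted-hub"],
    "trusted-hub": ["your-page"],
    "obscure": ["other-page"],
}
scores = trust_scores(links)
for page, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(page, round(value, 3))
```

In this toy graph "your-page" ends up scoring higher than "other-page" even though each has exactly one inbound link, because its single link comes from a page that is itself well linked.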

Themes (or topics)

So it's important that the topic of your website is clearly indicated by the content. To test this for your website, use this tool: SEO Chat - Keyword Cloud. The larger the word, the more the page "looks" like that word. Of course, this keyword cloud tool only evaluates the words; it ignores many other signals that the engines will look for.
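A keyword cloud is essentially a word-frequency count. The following minimal sketch shows the idea; it is not how the SEO Chat tool is implemented, and the function name keyword_weights, the tiny stop-word list and the sample text are assumptions for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; a real tool would use a much longer one
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def keyword_weights(text, top=10):
    """Return the most frequent words in text as (word, count) pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top)


# Hypothetical page text
sample = (
    "BizTalk adapters and BizTalk tools. "
    "XML tutorials, XML schemas and XML news for BizTalk."
)
weights = keyword_weights(sample)
print(weights)
```

The words with the highest counts are the ones a cloud would draw largest. Running it over the text of a single content page gives a quick check on whether that page "looks" like the topic you intended.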

Look at the keyword cloud for the default topxml.com page

You can see that it's a mishmash of related topics. The two big ones that stand out are BizTalk and XML. That's pretty good news for us, because we sell a BizTalk product and our main focus is on all things XML. So that's a good job!

One of the main reasons this page is so unfocussed is that it points to every content area (or topic) on the website. Therefore every topic shows up in this cloud.

The cloud for a content page about PHP XML should look like it's all about PHP XML. The mistake that some people make is that all their pages point to all their other pages. In this way all their pages look the same; they are not distinct, and no page looks better than another. The engine gets confused... is this page about XML or about PHP XML? And is that page about .Net XML or about BizTalk?

In the same way that Microsoft Windows Vista tries to group all your files into separate folders, you should try to ensure that a PHP XML page links to other PHP XML pages. Grouping them means that the text and links on the page are consistent and easily make sense to a search engine.
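One rough way to check this on your own pages is to measure what fraction of a page's links stay within its own topic. The sketch below assumes, purely for illustration, that the first folder in the URL path identifies the topic (for example /php-xml/...); the function names and URLs are hypothetical, and a site organised differently would need a different grouping rule.

```python
from urllib.parse import urlparse


def topic_folder(url):
    """First path segment, used here as a stand-in for the page's topic."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # A page sitting at the site root (e.g. /index.html) has no topic folder
    return parts[0] if len(parts) > 1 else ""


def same_topic_share(page_url, link_urls):
    """Fraction of a page's links that stay inside its own topic folder."""
    if not link_urls:
        return 0.0
    topic = topic_folder(page_url)
    same = sum(1 for link in link_urls if topic_folder(link) == topic)
    return same / len(link_urls)


# Hypothetical PHP XML page and the links found on it
page_url = "http://www.example.com/php-xml/sax.html"
page_links = [
    "http://www.example.com/php-xml/dom.html",
    "http://www.example.com/php-xml/xpath.html",
    "http://www.example.com/biztalk/adapters.html",
    "http://www.example.com/dotnet-xml/linq.html",
]
share = same_topic_share(page_url, page_links)
print(share)
```

A low share suggests the page is pointing at every topic on the site, which is the situation described above where all pages start to look the same.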

Group your content

How does Google (in particular) determine a topic or theme? It expects to find similar words grouped on the same page. What words are those? You can use Google Sets to find out, and Google Suggest for additional information. Many people also install Google AdSense and then use the keyword finder to discover what keywords Google expects to see.

What's the value of this information? If you know which information Google expects to see together, you can group suitable content together. By doing this you ensure that Google sees and understands your content.

The bottom line: If the engine can find your content and can understand it, you have a much higher likelihood of it indexing you and ranking you highly. It can only use text and links to figure that out. Help it by having good design!

In a future post I will explain more about how navigation can dramatically help bring a page to the attention of a search engine by evenly distributing the "voting power" (as I like to call it) of the pages on your site.

