Correct use of Robots.txt
Firstly, what is the robots.txt file?
This is a small plain-text file called robots.txt that sits in the root directory of your website. It is designed to give instructions to the search engine bots (also called spiders; "bot" is short for robot) that visit your site. The file tells the bots which files and pages they may spider and which ones they may not.
When a search engine bot visits your site, it will FIRST look in your root directory for a robots.txt file, for example at http://yourdomain.com/robots.txt. If it doesn't find one, it will go about your site freely. If it does find one, it will read it for instructions on which files and pages of your website it may spider (download) and index. This system is called the Robots Exclusion Standard.
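As a rough illustration of where a bot looks, the sketch below works out the robots.txt address for any page on a site. It uses only Python's standard library; the function name robots_url and the sample page address are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host: robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_url("http://yourdomain.com/articles/page.html?id=3"))
# prints http://yourdomain.com/robots.txt
```

Whatever page the bot starts from, the file it asks for is the one in the root directory.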
The format of the robots.txt file is simple but strict: it is made up of records, separated by blank lines.
Each record consists of a User-agent line followed by one or more Disallow lines. Every line takes the form:
<Field> ":" <value>
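To make the record format concrete, here is a minimal sketch that feeds a one-record file to Python's standard urllib.robotparser module and asks what a bot may fetch. The paths /private/ and /tmp/ and the bot name ExampleBot are assumptions chosen for the example, not rules you need on your own site.

```python
from urllib.robotparser import RobotFileParser

# One record: a User-agent line followed by two Disallow lines.
# "*" means the record applies to every bot.
SAMPLE = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# "ExampleBot" is a hypothetical bot name.
print(parser.can_fetch("ExampleBot", "http://yourdomain.com/index.html"))          # True
print(parser.can_fetch("ExampleBot", "http://yourdomain.com/private/notes.html"))  # False
```

Anything not matched by a Disallow line stays open to the bot, which is why the home page is allowed while the page under /private/ is not.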
The robots.txt file should be saved with Unix line endings! Most good text editors have a Unix mode, or your FTP client *should* do the conversion for you. Do not attempt to create a robots.txt file in an HTML editor unless it specifically has a plain-text mode.
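If you would rather generate the file with a script than rely on an editor setting, the sketch below shows one way to force Unix line endings in Python. The helper write_robots and the rule lines are hypothetical, and the file is written to a temporary folder purely so the example is safe to run; on a real site it would go in the web root.

```python
import tempfile
from pathlib import Path

# Hypothetical rules, used only for illustration.
RULES = ["User-agent: *", "Disallow: /private/"]


def write_robots(path, rules):
    """Write the rule lines to path using Unix (LF) line endings."""
    # newline="\n" stops Python turning "\n" into "\r\n" on Windows.
    with open(path, "w", encoding="ascii", newline="\n") as fh:
        fh.write("\n".join(rules) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "robots.txt"
    write_robots(target, RULES)
    raw = target.read_bytes()

# A carriage return in the bytes would mean Windows line endings crept in.
print(b"\r" in raw)  # False
```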
Some website owners do not want their sites spidered and indexed, so it is possible to instruct the bots to ignore individual pages or your site altogether.
It is of course vitally important that you know what you're doing when you're using a robots.txt file. If you accidentally block the spiders from some or all of your pages, those pages will not be spidered by the major search engines for as long as the block remains in place. If in doubt, don't use one at all, seek professional help or contact us. Alternatively, you can visit www.robotstxt.org for further information.
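One way to guard against an accidental block is to test a draft file before uploading it. The sketch below, again using Python's standard urllib.robotparser, lists any pages from a "must be crawlable" list that the draft would shut out. The helper blocked_pages, the draft rules and the page addresses are all assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def blocked_pages(robots_txt, urls, agent="*"):
    """Return the URLs from urls that robots_txt would stop agent fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# A draft with a classic mistake: "Disallow: /" blocks the whole site.
DRAFT = """\
User-agent: *
Disallow: /
"""

# Hypothetical pages that should always stay open to the spiders.
MUST_BE_CRAWLABLE = [
    "http://yourdomain.com/",
    "http://yourdomain.com/contact.html",
]

for url in blocked_pages(DRAFT, MUST_BE_CRAWLABLE):
    print("Blocked:", url)
```

With this draft both pages are reported as blocked, which is the warning sign to fix the file before it goes live.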
This article is copyright protected ©
White Hat Web Design have created the above article to help our web readers better understand the Internet and web applications. Whilst we are happy for anyone to link back to this page, we will not accept any copying of its content or publication in any medium without written permission from White Hat Web Design.
Please link back using the URL http://www.white-hat-web-design.co.uk/articles/design-tips/robots-txt.php or contact us




