A robots.txt file tells search engines whether they can access and therefore crawl parts of your site. This file, which must be named robots.txt, is placed in the root directory of your site.
The address of our robots.txt file
All compliant search engine bots (denoted by the wildcard * symbol) shouldn't access or crawl the content under /images/ or any URL whose path begins with /search.
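To make the rule described above concrete, here is a minimal sketch that writes it out as robots.txt directives and checks a few URLs against it with Python's standard urllib.robotparser module. The host www.example.com, the bot name ExampleBot, and the sample paths are placeholders chosen for illustration; they are not from the original example.

```python
from urllib.robotparser import RobotFileParser

# Directives matching the description above: every bot (*) is kept out of
# /images/ and out of any path that begins with /search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder URLs: one ordinary page, one image, one search results page.
SAMPLE_URLS = (
    "http://www.example.com/about.html",
    "http://www.example.com/images/logo.png",
    "http://www.example.com/search?q=robots",
)

for url in SAMPLE_URLS:
    verdict = "allowed" if parser.can_fetch("ExampleBot", url) else "blocked"
    print(f"{verdict}: {url}")
```

With these rules, the ordinary page is reported as allowed, while the image and the search results page are reported as blocked. Keep in mind that this only models what a compliant crawler would do.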
You may not want certain pages of your site crawled because they might not be useful to users if found in a search engine's search results. If you do want to prevent search engines from crawling your pages, Google Webmaster Tools has a friendly robots.txt generator to help you create this file.
Note that if your site uses subdomains and you wish to have certain pages not crawled on a particular subdomain, you'll have to create a separate robots.txt file for that subdomain. For more information on robots.txt, we suggest this Webmaster Help Center guide on using robots.txt files.
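Because each subdomain needs its own file, it can help to see where a crawler would look for the robots.txt that applies to a given page. The sketch below derives that address from a page URL using the standard library; the function name robots_txt_url and the example.com hosts are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the host that serves page_url.

    The file always sits at the root of the host, so only the scheme and
    host are kept and the path is replaced with /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Two placeholder pages on different subdomains of the same site.
print(robots_txt_url("http://www.example.com/products/item.html"))
print(robots_txt_url("http://blog.example.com/2009/01/post.html"))
```

The two calls return different addresses, one per subdomain, which is why rules placed in the www file have no effect on pages served from the blog subdomain.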
There are a handful of other ways to prevent content appearing in search results, such as adding "NOINDEX" to your robots meta tag, using .htaccess to password protect directories, and using Google Webmaster Tools to remove content that has already been crawled. Google engineer Matt Cutts walks through the caveats of each URL blocking method in a helpful video.
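As an illustration of the robots meta tag approach, the following sketch checks whether a page's HTML carries a "NOINDEX" directive, which can be handy for confirming that the tag actually made it into a published page. It relies only on Python's standard html.parser module; the class name RobotsMetaParser, the helper has_noindex, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None if omitted.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Placeholder page for illustration only.
PAGE = (
    '<html><head><meta name="robots" content="NOINDEX"></head>'
    "<body>Draft page</body></html>"
)
print(has_noindex(PAGE))
```

Note that a crawler has to be able to fetch the page in order to see this tag, so a page blocked in robots.txt cannot also communicate "NOINDEX" this way.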
Good practices for robots.txt
Use more secure methods for sensitive content - You shouldn't feel comfortable using robots.txt to block sensitive or confidential material. One reason is that search engines could still reference the URLs you block (showing just the URL, no title or snippet) if there happen to be links to those URLs somewhere on the Internet (like referrer logs). Also, non-compliant or rogue search engines that don't acknowledge the Robots Exclusion Standard could disobey the instructions of your robots.txt. Finally, a curious user could examine the directories or subdirectories in your robots.txt file and guess the URL of the content that you don't want seen. Encrypting the content or password-protecting it with .htaccess are more secure alternatives.
Avoid:
Tuesday, January 6, 2009
Make effective use of robots.txt