Check out the great post on the role of robots.txt in SEO at webconfs:
This file is basically used to disallow crawling of unnecessary pages.
No-indexed pages do not end up in the index, but disallowed pages can, so the latter may still appear in Google search results (albeit with less detail, since the robots cannot see the page content). Robots.txt is by no means mandatory for Google, but Google generally obeys what it is asked not to do.
The robots.txt file should be uploaded to your website's root directory, and it should reference a valid sitemap.xml. It tells search engines which pages they are allowed to crawl and which they are not. Though it looks like a very small file, it is very valuable for SEO.
Robots.txt contains two important pieces of information for search engines (Google etc.):
1) Which pages/areas of the site to crawl and index, and which pages they should not crawl or download into their cache. It can be used for SEO by disallowing duplicate or dynamically generated URLs (with parameters etc.) and private areas of the site that you don't want to appear in search result pages.
2) Location of the sitemap: robots.txt can contain a line that tells search engines where the sitemap XML or gz files are, which is helpful if the site is large and uses multiple sitemaps.
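The two points above can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the domain, paths and sitemap file names are made up, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the domain and paths are made up.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /search

Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-products.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Point 2: site_maps() returns the listed Sitemap URLs (None if there are none).
sitemaps = parser.site_maps()
print(sitemaps)

# Point 1: can_fetch() reports whether a user agent may crawl a URL.
private_ok = parser.can_fetch("*", "https://www.example.com/private/report.html")
product_ok = parser.can_fetch("*", "https://www.example.com/products/shoes.html")
print(private_ok, product_ok)
```

A large site can list several Sitemap lines, as in the made-up file above, and each one is returned separately.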
Robots.txt is an on-page SEO technique, and it is basically used to give instructions to web robots, also known as web wanderers, crawlers or spiders. A robot is a program that traverses websites automatically, which helps popular search engines like Google index a website and its content.
Robots.txt is a text file you place on your site to tell search robots which webpages you would like them not to visit. It helps keep duplicate links on a website from being crawled. In it we explicitly list the paths we do or do not want robots to visit.
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall, or a kind of password protection) and the fact that you put a robots.
When a search engine crawler comes to your site, it will look for a special file on your site. That file is called robots.txt, and it tells the search engine spider which Web pages of your site should be indexed and which should be ignored.
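As a rough sketch of that check from the crawler's side, the snippet below filters a list of URLs against a robots.txt body before visiting them. The function name crawlable, the bot name ExampleBot and all URLs are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def crawlable(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return the URLs that robots_txt allows user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if parser.can_fetch(user_agent, u)]


# Hypothetical rules and URLs.
robots_txt = "User-agent: *\nDisallow: /admin/\n"
urls = [
    "https://www.example.com/",
    "https://www.example.com/admin/login",
    "https://www.example.com/blog/post-1",
]

to_visit = crawlable(robots_txt, "ExampleBot", urls)
print(to_visit)
```

A real crawler would download the file first, for example with the parser's set_url() and read() methods, instead of using a hard-coded string.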
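The simplest robots.txt file uses two rules: User-agent, which names the robot the following rule applies to, and Disallow, which gives the URL path you want to block.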
These two lines are considered a single entry in the file. You can include as many entries as you want. You can include multiple Disallow lines and multiple user-agents in one entry.
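Here is a small sketch of what such entries can look like and how a parser reads them, using Python's standard urllib.robotparser. The bot names other than Googlebot and Bingbot, and all the paths, are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with two entries; all paths are made up.
ROBOTS_TXT = """\
# Entry 1: two user-agents share two Disallow lines.
User-agent: Googlebot
User-agent: Bingbot
Disallow: /tmp/
Disallow: /drafts/

# Entry 2: every other robot.
User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://www.example.com"
# Googlebot and Bingbot are both covered by entry 1.
google_tmp = parser.can_fetch("Googlebot", base + "/tmp/file.html")
bing_drafts = parser.can_fetch("Bingbot", base + "/drafts/new.html")
# Entry 1 says nothing about /cgi-bin/, so Googlebot may fetch it.
google_cgi = parser.can_fetch("Googlebot", base + "/cgi-bin/run")
# A robot not named anywhere falls back to the "*" entry.
other_cgi = parser.can_fetch("OtherBot", base + "/cgi-bin/run")
other_tmp = parser.can_fetch("OtherBot", base + "/tmp/file.html")

print(google_tmp, bing_drafts, google_cgi, other_cgi, other_tmp)
```

Note that a robot matched by a specific entry follows only that entry, not the rules under the * entry as well.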
The main use of robots.txt is for pages that you don't want to show to users in search results, like the admin panel or any 404 page of your website.
If there are files or directories you do not want indexed by search engines, you can use a robots.txt file to tell robots where they should not go. It is a simple text file placed on the web server. It must be placed in the root folder, like this: www.site.com/robots.txt
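Because the file always sits at the root, its address can be worked out from any page URL on the site. A minimal sketch, where the helper name robots_url and the sample page URL are made up for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_url("https://www.site.com/blog/2020/post.html?ref=home")
print(location)  # https://www.site.com/robots.txt
```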
And this is how you write it:
User-agent: *
Robots.txt lets you specify which pages shouldn't be crawled. Pages that do not get crawled can still rank for keywords and show up in search results. Robots.txt has been with us for over fourteen years, but how many people knew that in addition to the disallow directive there was a noindex directive that Googlebot obeyed (Google stopped honoring it in September 2019)? No-indexed pages do not end up in the index, but disallowed pages can, so the latter may show up in the search results (albeit with less info, since the spiders cannot see the page content). Robots.txt is by no means mandatory for search engines, but typically search engines obey what they are asked not to do.
Robots.txt contains instructions for search engine bots. Its directives tell search engine bots or crawlers how to crawl the site's pages, that is, which URLs of the site they may follow and which they may not.