robots.txt

Postby Therese » Thu Sep 17, 2009 12:50 am

How does robots.txt work?
Therese
 
Posts: 31
Joined: Thu Sep 17, 2009 12:12 am

Re: robots.txt

Postby dcz » Fri Sep 25, 2009 4:06 pm

robots.txt is just a plain text file with instructions that search engine crawlers may choose to follow.
The typical use is to disallow some URLs, for example:
Code:
User-agent: *
Disallow: /file.php


This asks search engines not to crawl example.com/file.php. User-agent: * means the rule applies to all user agents.
A few other things can be done with robots.txt, but since search engines follow these directives only on a voluntary basis (the major ones do, or tend to, but some others do not at all), they should be treated as hints, not as rules you can actually rely on.
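
To see how a well-behaved crawler might read the two lines above, here is a minimal sketch using Python's standard urllib.robotparser module. The crawler name "ExampleBot" and the /index.php URL are made up for illustration, and this only shows what a polite crawler would decide, not what any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the example above.
RULES = """\
User-agent: *
Disallow: /file.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "ExampleBot" is a hypothetical crawler name; the "*" group applies to it.
blocked = parser.can_fetch("ExampleBot", "http://example.com/file.php")
allowed = parser.can_fetch("ExampleBot", "http://example.com/index.php")

print("file.php may be fetched:", blocked)
print("index.php may be fetched:", allowed)
```

Nothing here enforces anything: the check runs inside the crawler, which is why a crawler that ignores robots.txt is not stopped by it.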

++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
dcz
Admin
 
Posts: 21379
Joined: Fri Apr 28, 2006 9:03 pm

Re: robots.txt

Postby airforce1 » Wed Nov 18, 2009 6:24 am

robots.txt is fine for guiding the behavior of search engine crawlers, but I think it is a file that could also be used by hackers. :)
airforce1
 
Posts: 15
Joined: Wed Oct 14, 2009 2:25 pm

Re: robots.txt

Postby dcz » Sun Nov 22, 2009 3:18 pm

Used for what?

I mean, of course it is a bit stupid to disallow, for example, your back-office directory when it is already password protected, because that tells the world its name, but it is not really going to help a hacker much. At least, a hacker should not need to wait for this kind of hint to sneak in.
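
To make the point concrete: robots.txt is public, so anyone can read the paths it lists. A rough sketch, assuming a hypothetical directory named /backoffice/ and a helper function disallowed_paths that I made up for this post:

```python
def disallowed_paths(robots_txt):
    """Return every non-empty Disallow path found in a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# "/backoffice/" is a hypothetical directory name used for illustration.
SAMPLE = """\
User-agent: *
Disallow: /backoffice/   # password protected anyway
Disallow: /file.php
"""

listed = disallowed_paths(SAMPLE)
print(listed)
```

So the file only reveals names. Whether those names matter depends on how well the listed paths are protected on the server.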

++
dcz
Admin
 
Posts: 21379
Joined: Fri Apr 28, 2006 9:03 pm

Re: robots.txt

Postby saurabh1985 » Wed Jan 20, 2010 12:21 pm

It's to maintain for 301 and 404 error.
saurabh1985
 
Posts: 4
Joined: Wed Jan 20, 2010 12:13 pm

Re: robots.txt

Postby seoserviceproviders » Thu Mar 11, 2010 8:34 pm

The robots.txt file is mainly used to let crawlers know what to crawl and what not to. When a search crawler comes to a site, it looks for robots.txt and crawls the site accordingly.
seoserviceproviders
 
Posts: 2
Joined: Thu Mar 11, 2010 7:23 pm

Re: robots.txt

Postby Athlon101 » Fri Mar 12, 2010 3:25 pm

airforce1 wrote:robots.txt is fine for guiding the behavior of search engine crawlers, but I think it is a file that could also be used by hackers. :)


Its main use now is to tell search engine spiders not to crawl certain pages or files on your site. It is only a text file, not an executable, so I cannot see why it would be of any interest to hackers.
Athlon101
 
Posts: 1
Joined: Wed Nov 28, 2007 10:56 pm

Re: robots.txt

Postby website-design-seo » Wed May 19, 2010 10:11 am

robots.txt is a text file used specifically for websites. Google's spider crawls a site for only a short time, so it may not be able to crawl all of the site's pages within that period, which is why an XML sitemap is useful. If we do not want the spider to crawl a particular page (one that does not need to appear in search results), we can ask it not to crawl that page by using the robots.txt file.
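
The two ideas can live in the same file, since robots.txt may also carry a Sitemap line. A small sketch with Python's urllib.robotparser (site_maps() needs Python 3.8 or later); the page name /private-page.html, the sitemap URL and the crawler name are assumptions for illustration only:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one page kept out of crawling, plus a sitemap hint.
RULES = """\
User-agent: *
Disallow: /private-page.html
Sitemap: http://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Sitemap URLs declared in the file (None when there are none).
sitemaps = parser.site_maps()
# "ExampleBot" is a made-up crawler name.
page_allowed = parser.can_fetch("ExampleBot", "http://example.com/private-page.html")

print(sitemaps)
print(page_allowed)
```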
website-design-seo
 
Posts: 4
Joined: Tue Apr 13, 2010 1:09 pm

Re: robots.txt

Postby offsitenoc » Thu Aug 12, 2010 10:22 am

The concept of robots.txt is this: a robot wants to visit the site http://www.musicmanias.0fees.net/. Before it does anything, it first looks for http://www.musicmanias.0fees.net/robots.txt to find out which pages it may crawl. If it cannot find that file, it will go ahead and crawl everything on the site.
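
Those two steps can be sketched in Python, as a rough illustration and not a description of how any real search engine is built. The helper robots_url, the path /some/page.html and the crawler name "ExampleBot" are made up, and a missing file is modelled here simply as an empty rule set:

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# "/some/page.html" is a hypothetical path used only for illustration.
page = "http://www.musicmanias.0fees.net/some/page.html"
location = robots_url(page)
print(location)

# Model a missing robots.txt as an empty rule set: nothing is disallowed.
parser = RobotFileParser()
parser.parse([])
everything_allowed = parser.can_fetch("ExampleBot", page)
print(everything_allowed)
```

Note that the file is always looked up at the root of the host, whatever page the robot was heading for.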
offsitenoc
 
Posts: 23
Joined: Thu Aug 12, 2010 8:54 am

Re: robots.txt

Postby zohall » Sat Aug 14, 2010 4:48 pm

dcz wrote:robots.txt is just a plain text file with instructions that search engine crawlers may choose to follow.
The typical use is to disallow some URLs, for example:
Code:
User-agent: *
Disallow: /file.php



Thank you, dcz. It was useful for me.
zohall
 
Posts: 21
Joined: Sat Aug 14, 2010 4:43 pm
Location: www.zohall.com/

Re: robots.txt

Postby seoajay » Wed Sep 15, 2010 12:30 pm

With robots.txt we tell Google which pages it is allowed to crawl and which it is not.
seoajay
 
Posts: 6
Joined: Wed Sep 15, 2010 12:19 pm

Re: robots.txt

Postby thezodiac » Mon Sep 20, 2010 4:00 pm

In a nutshell: Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
http://www.robotstxt.org/robotstxt.html
thezodiac
 
Posts: 34
Joined: Mon Sep 20, 2010 10:43 am

