I need Robots.txt to block spiders from root directory


Post by daddyo » Thu Oct 25, 2007 6:11 pm

Currently, my robots file is this:

User-agent: *
Disallow: /phpBB/posting.php
Disallow: /phpBB/groupcp.php
Disallow: /phpBB/search.php
Disallow: /phpBB/login.php
Disallow: /phpBB/privmsg.php
Disallow: /phpBB/post
Disallow: /phpBB/member
Disallow: /phpBB/profile.php
Disallow: /phpBB/memberlist.php
Disallow: /phpBB/faq.php

I have this in my root web folder. I would also like to block spiders from crawling my root folder, but since I'm unfamiliar with the syntax, I don't want to create something that will block my forum URLs as well.

How do I use Disallow to disallow access to my root folder, but still keep the same rules as above?
daddyo
 
Posts: 47
Joined: Wed Sep 12, 2007 6:05 pm


Post by dcz » Sun Oct 28, 2007 12:42 pm

The problem is that if you really disallow the root folder, you'll disallow everything, forum URLs included.

A Disallow rule on a folder applies to every page inside that folder, and the same goes for the root folder.

If you want to disallow a specific page in the root folder, you can use:
User-agent: *
Disallow: /page_disallowed.html
Disallow: /phpBB/posting.php
...


Disallowed pages won't be crawled, so they generally won't show up in SERPs. Take care: we only want to disallow the "bad" pages.
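To see the difference before uploading anything, the rules can be checked locally. This is only a minimal sketch using Python's standard urllib.robotparser; the test paths (/index.html, /phpBB/viewtopic.php) are assumptions picked for illustration, not pages known to exist on the site.

```python
from urllib.robotparser import RobotFileParser


def make_parser(rules: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


# Disallowing the root folder: every path starts with "/", so all is blocked.
BLOCK_ROOT = """\
User-agent: *
Disallow: /
"""

# Disallowing specific pages only, as suggested above.
BLOCK_PAGES = """\
User-agent: *
Disallow: /page_disallowed.html
Disallow: /phpBB/posting.php
"""

root = make_parser(BLOCK_ROOT)
pages = make_parser(BLOCK_PAGES)

# Hypothetical paths, used only to compare the two rule sets.
TEST_PATHS = [
    "/index.html",
    "/page_disallowed.html",
    "/phpBB/viewtopic.php",
    "/phpBB/posting.php",
]

for path in TEST_PATHS:
    print(path, root.can_fetch("*", path), pages.can_fetch("*", path))
```

With "Disallow: /" every path comes back as not fetchable, while the per-page rules only block the paths that start with a listed prefix.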

dcz
Admin
 
Posts: 19911
Joined: Fri Apr 28, 2006 9:03 pm

Post by daddyo » Sun Oct 28, 2007 2:10 pm

Thanks.

The only problem is that to keep individual pages from being indexed, I have to list them one by one in robots.txt, which exposes my file names to anyone who looks at my robots.txt file.

I guess I need to put them in their own subdirectory.

Post by dcz » Sun Oct 28, 2007 2:33 pm

The subdirectory is a good idea.
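As a rough sketch of how that could look: a single Disallow line on the folder covers every file inside it, so no individual file name has to appear in robots.txt. The folder name /private/ and the script names below are made up for the example, and the check again relies on Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Assumed layout: the scripts to hide were moved into a hypothetical /private/ folder.
RULES = """\
User-agent: *
Disallow: /private/
Disallow: /phpBB/posting.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """Return True if a generic crawler may fetch the given path."""
    return parser.can_fetch("*", path)


# Hypothetical file names; none of them is listed in RULES.
for path in ("/private/report.php", "/private/tools/export.php", "/index.php"):
    print(path, allowed(path))
```

The trailing slash in "Disallow: /private/" matters, because rules are matched as prefixes: without it, a root page such as /private_notes.html would be blocked too.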

You can also add a noindex directive in a robots meta tag. It has much the same effect as robots.txt, but it is applied on individual pages.

Something like:
<meta name="robots" content="noindex,nofollow">


This makes sure the pages will not show up in SERPs.

More about the robots meta tag: http://www.robotstxt.org/wc/meta-user.html
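If many scripts need the tag, it may help to verify that a generated page really carries it. The following is a small sketch using Python's standard html.parser; the class name RobotsMetaParser, the helper robots_directives, and the sample PAGE are all invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex,nofollow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Sample page for illustration only.
PAGE = (
    "<html><head>"
    '<meta name="robots" content="noindex,nofollow">'
    "</head><body></body></html>"
)

directives = robots_directives(PAGE)
print("noindex" in directives, "nofollow" in directives)
```

Keep in mind that a spider can only read the meta tag if it is allowed to fetch the page, so a page carrying the tag should not also be disallowed in robots.txt.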


Post by daddyo » Sun Oct 28, 2007 2:37 pm

Thanks.

I think for now I'll try the meta tag approach, since I have lots of scripts that depend on other scripts, and it might take too long to track down the dependencies if I moved them all to a subfolder.

