Naughty google bot please advise

Postby Jimcanswim » Mon Jan 21, 2008 9:36 pm

Hi all, hope you can help. Google is still looking at areas of my forum that I do not want it to waste its time on; for example, it still views private messages and profiles.

Below is my robots.txt file. Any advice appreciated.

Thanks

User-agent: *
Disallow: /forum/private/
Disallow: /forum/admin/
Disallow: /forum/db/
Disallow: /forum/docs/
Disallow: /forum/cache/
Disallow: /forum/cgi-bin/
Disallow: /forum/files/
Disallow: /forum/images/
Disallow: /forum/includes/
Disallow: /forum/language/
Disallow: /forum/tmp/
Disallow: /forum/templates/
Disallow: /forum/common.php
Disallow: /forum/memberlist.php
Disallow: /forum/memberlist.php?
Disallow: /forum/config.php
Disallow: /forum/faq.php
Disallow: /forum/groupcp.php
Disallow: /forum/login.php
Disallow: /forum/modcp.php
Disallow: /forum/profile.php?mode=viewprofile&u=
Disallow: /forum/profile.php?mode=editprofile
Disallow: /forum/profile.php?mode=email&u=
Disallow: /forum/printview.php
Disallow: /forum/privmsg.php
Disallow: /forum/privmsg.php?
Disallow: /forum/ranks.php
Disallow: /forum/search.php
Disallow: /forum/viewonline.php
Disallow: /forum/profile.php
Disallow: /forum/profile.php?
Disallow: /forum/posting.php?
Disallow: /forum/posting.php
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby TomaS » Mon Jan 21, 2008 10:56 pm

You don't have to disallow everything.
Just put this in it:
Code:
User-agent: *
Disallow: /forum/viewtopic.php?
Disallow: /forum/viewforum.php?
Disallow: /forum/index.php?
Disallow: /forum/posting.php?
Disallow: /forum/groupcp.php
Disallow: /forum/profile.php?
Disallow: /forum/memberlist.php
Disallow: /forum/search.php?
Disallow: /forum/login.php
Disallow: /forum/faq.php
phpBB support - Slovak phpBB support
Slovak translation for phpBB3
Domain sales and purchases, marketing and internet business - Slovak domain center
TomaS
PR2
 
Posts: 229
Joined: Fri Jun 08, 2007 1:22 am

Postby Jimcanswim » Tue Jan 22, 2008 12:58 am

Thanks for the reply. Will disallowing viewtopic, search or viewforum stop Google from indexing posts, though?
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby SeO » Tue Jan 22, 2008 8:33 am

If you are not rewriting URLs, yes: this robots.txt will prevent indexing of all the native phpBB links (topics and forums).
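
To see that effect for yourself, here is a minimal Python sketch that feeds the topic and forum rules from the list above to the standard library's urllib.robotparser. The two sample URLs are assumptions (the topic123.html form only stands in for whatever a URL rewriting mod would produce), and Python's parser only approximates how Googlebot matches rules, so treat it as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the suggestion above (only the lines that touch topics and forums).
RULES = [
    "User-agent: *",
    "Disallow: /forum/viewtopic.php?",
    "Disallow: /forum/viewforum.php?",
    "Disallow: /forum/index.php?",
]


def is_allowed(rules, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)


# Hypothetical sample URLs, not taken from any real forum.
native_topic = "/forum/viewtopic.php?t=123"   # default phpBB topic link
rewritten_topic = "/forum/topic123.html"      # assumed rewritten form

print(native_topic, is_allowed(RULES, native_topic))
print(rewritten_topic, is_allowed(RULES, rewritten_topic))
```

Without URL rewriting every topic is only reachable through viewtopic.php?..., so those rules shut the posts out together with the pages you actually wanted to hide.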
SeO
Admin
 
Posts: 6333
Joined: Wed Mar 15, 2006 9:41 pm

Postby Jimcanswim » Tue Jan 22, 2008 11:28 am

Thanks guys. Can anyone help me write a simple robots.txt that just stops Google from looking at private messages, profiles, etc.? I want it to index all my posts.

Thanks
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby TomaS » Tue Jan 22, 2008 10:21 pm

I think a bot can't see PMs by default. How could it? It can't post anything either. And the profile page is /forum/profile.php
TomaS
PR2
 
Posts: 229
Joined: Fri Jun 08, 2007 1:22 am

Postby dcz » Thu Jan 24, 2008 1:33 pm

Jimcanswim wrote: Thanks guys. Can anyone help me write a simple robots.txt that just stops Google from looking at private messages, profiles, etc.? I want it to index all my posts.

Thanks


Code:
User-agent: *
Disallow: /phpbb/posting.php
Disallow: /phpbb/groupcp.php
Disallow: /phpbb/profile.php
Disallow: /phpbb/memberlist.php
Disallow: /phpbb/search.php?
Disallow: /phpbb/login.php
Disallow: /phpbb/faq.php


This will disallow the posting, group, profile, memberlist, search result, login and FAQ pages.

Replace "phpbb/" with your actual phpBB path if any, or with nothing in case phpBB is installed in the domain's root.
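
If you would rather not edit the paths by hand, the substitution can be sketched in a few lines of Python. This is only an illustration: build_robots_txt and is_allowed are made-up helper names, the script list is simply the one from the rules above, and the check relies on Python's urllib.robotparser, which approximates but does not exactly reproduce how Googlebot reads rules.

```python
from urllib.robotparser import RobotFileParser

# Script names taken from the rules above; "search.php?" keeps its trailing "?".
BLOCKED_SCRIPTS = [
    "posting.php",
    "groupcp.php",
    "profile.php",
    "memberlist.php",
    "search.php?",
    "login.php",
    "faq.php",
]


def build_robots_txt(phpbb_path=""):
    """Build the rules for a phpBB installed at `phpbb_path` ("" means the domain root)."""
    prefix = "/" + phpbb_path.strip("/")
    if not prefix.endswith("/"):
        prefix += "/"
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}{script}" for script in BLOCKED_SCRIPTS]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


forum_rules = build_robots_txt("forum")  # phpBB in /forum/
root_rules = build_robots_txt("")        # phpBB in the domain's root
print(forum_rules)
```

With the "forum" path, is_allowed(forum_rules, "/forum/viewtopic.php?t=1") stays True while the profile URLs come back False, which is the behaviour asked for.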


++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
dcz
Admin
 
Posts: 21219
Joined: Fri Apr 28, 2006 9:03 pm

Postby Jimcanswim » Fri Jan 25, 2008 5:10 am

Thanks dcz, I'll give it a go.
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby Jimcanswim » Fri Jan 25, 2008 8:55 am

DCZ, I implemented your change but the bot is still looking at profiles etc. What's going on?

Any ideas?

Cheers
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby TomaS » Fri Jan 25, 2008 10:33 am

And how do you know? Do you see any newly indexed profiles?
I don't think so; you only see it in Who is online :o
TomaS
PR2
 
Posts: 229
Joined: Fri Jun 08, 2007 1:22 am

Postby dcz » Sun Jan 27, 2008 9:41 am

Jimcanswim wrote: DCZ, I implemented your change but the bot is still looking at profiles etc. What's going on?

Any ideas?

Cheers


Well, a new disallow in robots.txt could need years to be fully taken into account by bots. Google should be faster than the others, but it depends; it can take a long time, especially if the pages were not previously disallowed and are already indexed.

They should not be cached again, though, and new profiles should not be cached either.

++
dcz
Admin
 
Posts: 21219
Joined: Fri Apr 28, 2006 9:03 pm

Postby Jimcanswim » Sun Jan 27, 2008 9:44 am

Thanks for the replies, guys. When I go to my admin area I see Googlebot's IP address still looking at profiles etc. That's how I know it is still fetching those pages, despite the robots.txt file.

Cheers
Jimcanswim
 
Posts: 7
Joined: Mon Jan 21, 2008 9:33 pm

Postby TomaS » Sun Jan 27, 2008 1:31 pm

Don't trust Who is online and where it says the bot is. Wait a while and you will see the results.
TomaS
PR2
 
Posts: 229
Joined: Fri Jun 08, 2007 1:22 am

Postby HB » Thu Jan 31, 2008 3:05 am

dcz wrote: Well, a new disallow in robots.txt could need years to be fully taken into account by bots. Google should be faster than the others, but it depends; it can take a long time, especially if the pages were not previously disallowed and are already indexed.

If bots traversing old profile links really bothers you, remap the profile pages to a different URL and redirect the "old" URLs to your forum's index. I've noticed that search engines take an ETERNITY to forget about a URL once discovered, even if you return a "410", but they'll take a 301 "permanently moved" return code into account in a couple of weeks.
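
As a rough illustration of the redirect idea, here is a small Python sketch using only the standard library. phpBB itself is PHP, so this is not code to drop into the forum; OLD_PATHS, FORUM_INDEX, redirect_target and RedirectHandler are hypothetical names, and it assumes the real profile pages have already been moved to some other URL.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlsplit

# Assumed layout: phpBB lives in /forum/ and profiles were remapped elsewhere.
OLD_PATHS = {"/forum/profile.php"}
FORUM_INDEX = "/forum/index.php"


def redirect_target(request_path):
    """Return the redirect location for an old URL, or None if it is not an old URL."""
    path = urlsplit(request_path).path  # drop the query string before comparing
    return FORUM_INDEX if path in OLD_PATHS else None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer old profile URLs with a 301 pointing at the forum index."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)  # this sketch serves nothing else
            return
        self.send_response(301)  # "moved permanently"
        self.send_header("Location", target)
        self.end_headers()
```

To try it locally you could run http.server.HTTPServer(("127.0.0.1", 8000), RedirectHandler).serve_forever() and request /forum/profile.php?mode=viewprofile&u=2; on a live forum the same mapping would more likely go in the web server's rewrite rules.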
Dan Kehn
HB
phpBB SEO Team
 
Posts: 1211
Joined: Mon Oct 16, 2006 2:25 am

Postby Pigeon » Thu Jan 29, 2009 9:18 pm

I've added an auth check to memberlist.php, groupcp.php and usercp_viewprofile.php on my forum so that these pages can only be viewed by users who are registered and logged in. I did this mainly because I don't want humans looking at those pages unless they're members, but partly also because I too had noticed that Google seems very slow to adjust its behaviour to correspond to new modifications to robots.txt. It might be a bit of a sledgehammer solution but you could do the same thing only with an IP check set to exclude googlebot's IPs instead of an auth check.
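
For what the IP check might look like, here is a hedged Python sketch with the standard ipaddress module. The networks listed are documentation placeholder ranges, not Googlebot's real addresses (which are not given in this thread and would have to be looked up and kept current), and may_view_profile is a made-up name; in phpBB the equivalent test would sit in the PHP files mentioned above.

```python
import ipaddress

# Placeholder ranges reserved for documentation. Replace them with the crawler
# ranges you actually want to exclude; these are NOT Googlebot's addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def may_view_profile(remote_ip, blocked=BLOCKED_NETWORKS):
    """Return False when the visitor's IP falls inside one of the blocked networks."""
    address = ipaddress.ip_address(remote_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa), so mixing is safe.
    return not any(address in network for network in blocked)


for ip in ("192.0.2.10", "198.51.100.7", "2001:db8::1"):
    print(ip, "allowed" if may_view_profile(ip) else "blocked")
```

An auth check is simpler to maintain, since crawler IP ranges change over time while "registered and logged in" does not.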
Pigeon
 
Posts: 17
Joined: Sun Jan 18, 2009 10:36 pm
