URLs restricted by robots.txt

 
Silverado05
PR0


Joined: 01 Jul 2006
Posts: 51
Location: Texas

Posted: Sat Aug 26, 2006 9:32 am    Post subject: URLs restricted by robots.txt

OK, I was going through my Google Sitemaps account to check out the latest data. I have over 11,000 pages indexed, and they seem to be valuable content, but it notified me that I have 1,107 URLs restricted by robots.txt. I scanned the restricted URLs, and some are legitimate, like member profiles. But I also noticed it is restricting the URLs of actual posts, e.g.:

http://www.texascampingforum.com/forum/post358.html

So wouldn't that need to be unrestricted, since that is valuable content? There are several hundred URLs like that which point to posts.

Here is the robots.txt:

Code:
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/

Disallow: /forum/viewtopic.php
Disallow: /forum/viewforum.php
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/groupcp.php
Disallow: /forum/search.php
Disallow: /forum/login.php
Disallow: /forum/post
Disallow: /forum/member
Disallow: /forum/profile.php
Disallow: /forum/memberlist.php
Disallow: /forum/faq.php



So would you recommend removing Disallow: /forum/post and/or Disallow: /forum/viewtopic.php, or will that just cause Google to index junk?

I might also add that it said "We can't currently access your home page because of a robots.txt restriction," yet I am getting indexed, and the line before that says "Googlebot has successfully accessed your home page. Last crawl date: Dec 31, 1969". Any ideas as to why? FYI, my home page is a Joomla CMS that links to the phpBB forum. To be clear, the two are not integrated; Joomla just links to the forum.

-Thanks
dcz
Administrateur - Site Admin


Joined: 28 Apr 2006
Posts: 15242

Posted: Sat Aug 26, 2006 10:26 am    Post subject: Re: URLs restricted by robots.txt

Working on a new design? Nice :wink:

So everything seems normal here.

We have to disallow /forum/post because those are the message URLs (viewtopic.php?p=), the number one source of duplicate content in a phpBB forum.

The same goes for the unrewritten viewtopic.php URLs.
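
To see why URLs like /forum/post358.html end up restricted: Disallow rules are prefix matches, so Disallow: /forum/post covers every path that starts with /forum/post. Here is a minimal sketch using Python's standard urllib.robotparser with a subset of the rules quoted in this thread. The topic URL /forum/topic42.html is a hypothetical example of a rewritten topic URL, not one taken from the site.

```python
from urllib.robotparser import RobotFileParser

# A subset of the robots.txt rules quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/viewtopic.php
Disallow: /forum/posting.php
Disallow: /forum/post
Disallow: /forum/member
Disallow: /forum/profile.php
"""


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text forbids for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


BASE = "http://www.texascampingforum.com"
urls = [
    BASE + "/forum/post358.html",         # message URL from the first post
    BASE + "/forum/viewtopic.php?p=358",  # unrewritten message URL
    BASE + "/forum/topic42.html",         # hypothetical topic URL
]

blocked = blocked_urls(ROBOTS_TXT, urls)
for url in urls:
    print("blocked" if url in blocked else "allowed", url)
```

Python's parser treats a blank line as the end of a record, so the rules are kept in one contiguous block here. Real crawlers may be more lenient about blank lines, so treat this as a quick check rather than an exact model of Googlebot.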


As for the message about accessing your home page, it is nothing to worry about, because your home page is cached. Maybe the bot failed to load the page once (network failure or server reboot) and the stats have not been updated yet (they are far from live).

I would not worry about any of this, especially your robots.txt. Allowing post URLs may look like a way to speed up indexing, and that is somewhat true, at least without mx Google sitemaps, since the message URLs show up on the forum index (one level of depth before the topic URLs).

But this is not a good thing to do, because all of those are going to be full or partial duplicates of the topic URLs (same title at least).
The result would be a lot more cached pages but a lot less PageRank (because of content dilution), and thus worse rankings in search engine results.
One URL, one page, one title: the simpler the better (bots are not as smart as we are, so we'd better not confuse them).
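
To make the duplicate problem concrete, here is a small sketch that groups crawled pages by title and reports titles served under more than one URL. The URLs and titles below are made up for illustration; they are not real data from the forum.

```python
from collections import defaultdict

# Hypothetical crawl results as (URL, page title) pairs.
pages = [
    ("/forum/topic42.html", "Best campsites near Austin"),
    ("/forum/post358.html", "Best campsites near Austin"),
    ("/forum/post359.html", "Best campsites near Austin"),
    ("/forum/topic43.html", "Tent recommendations"),
]


def duplicate_titles(pages):
    """Map each title that appears under several URLs to those URLs."""
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


dupes = duplicate_titles(pages)
for title, urls in dupes.items():
    print(title, "->", urls)
```

With the post URLs disallowed, only the topic URL would remain for each title, which is the "one URL, one page, one title" situation described above.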

++

_________________
Useful links :
SEO Forum || SEO Directory || SEO phpBB || SEO phpBB3 || Search
Silverado05
PR0


Joined: 01 Jul 2006
Posts: 51
Location: Texas

Posted: Sat Aug 26, 2006 9:48 pm    Post subject: Re: URLs restricted by robots.txt

So to sum it up, it would be best to leave things as they are and keep Disallow: /forum/post and Disallow: /forum/viewtopic.php in robots.txt?

Another interesting thing I saw: Googlebot had successfully accessed my home page on Dec 31, 1969? 1969??? LoL, Google wasn't even around then, much less the internet.
dcz
Administrateur - Site Admin


Joined: 28 Apr 2006
Posts: 15242

Posted: Sun Aug 27, 2006 12:33 am    Post subject: Re: URLs restricted by robots.txt

1969?
You know what, I always knew Google was a rock star :lol:


This reminds us that Google Sitemaps is still in beta. It works nicely, but things like this can happen.
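
A likely explanation for the 1969 date (an assumption, nothing in the tool confirms it): the crawl date is stored as a Unix timestamp, and an empty value of 0 is being displayed in a US time zone. Timestamp 0 is midnight on Jan 1, 1970 UTC, which is still Dec 31, 1969 a few hours to the west. A minimal sketch, using a fixed UTC-8 offset as a stand-in for US Pacific time:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the "last crawl" field holds a Unix timestamp, with 0 meaning "not set".
never_crawled = 0

# Fixed UTC-8 offset as a stand-in for US Pacific time (an assumption).
pacific = timezone(timedelta(hours=-8))

shown = datetime.fromtimestamp(never_crawled, tz=pacific).strftime("%b %d, %Y")
print(shown)
```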

++

lavinya
PR1


Joined: 24 Jul 2006
Posts: 161
Location: Turkey

Posted: Thu Sep 07, 2006 3:24 pm    Post subject: Re: URLs restricted by robots.txt

Wow

1969 :shock: