| Author |
Message |
Silverado05 PR0

Joined: 01 Jul 2006 Posts: 51 Location: Texas
|
Posted: Sat Aug 26, 2006 9:32 am Post subject: URLs restricted by robots.txt |
|
|
OK, I was going through my Google Sitemaps account, checking out the latest data. While I have over 11,000 pages indexed and they seem to be valuable content, it notified me that I have (1107) URLs restricted by robots.txt. I was scanning the URLs it restricted, and some are legit, like member profiles, etc. But I also noticed it is restricting the URLs of actual posts, e.g.
http://www.texascampingforum.com/forum/post358.html
So wouldn't that need to be unrestricted, as that is valuable content? There are several hundred URLs like that, which go to posts.
Here is the robot text:
Code:
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/
Disallow: /forum/viewtopic.php
Disallow: /forum/viewforum.php
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/groupcp.php
Disallow: /forum/search.php
Disallow: /forum/login.php
Disallow: /forum/post
Disallow: /forum/member
Disallow: /forum/profile.php
Disallow: /forum/memberlist.php
Disallow: /forum/faq.php
So would you recommend removing Disallow: /forum/post and/or Disallow: /forum/viewtopic.php, or will that just cause Google to index junk?
Also, I might add that it said "We can't currently access your home page because of a robots.txt restriction." and yet I am getting indexed, and the line before that says "Googlebot has successfully accessed your home page. Last crawl date: Dec 31, 1969". Any ideas as to why? FYI, my home page is a Joomla CMS site which links to the phpBB forum. To clear that up: they are NOT integrated, the forum is just linked to from Joomla.
-Thanks
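For anyone wondering why post358.html ends up in the restricted list: robots.txt Disallow rules are matched as path prefixes, so Disallow: /forum/post covers every URL whose path starts with /forum/post. Here is a minimal sketch that checks this with Python's standard urllib.robotparser, using a few of the rules quoted above. The helper name is_allowed is made up for this example, and Googlebot's own matcher may differ in details, so treat it as an approximation rather than a statement of how Google evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules from the robots.txt quoted in this post.
RULES = """\
User-agent: *
Disallow: /forum/viewtopic.php
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/post
Disallow: /forum/member
"""


def is_allowed(path, rules=RULES, agent="Googlebot"):
    """Hypothetical helper: True if `agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/forum/post358.html", "/forum/posting.php", "/forum/", "/"):
        verdict = "allowed" if is_allowed(path) else "blocked"
        print(path, "->", verdict)
```

Under these rules the post URL and posting.php are both blocked by the /forum/post prefix, while the forum index and the home page are not matched by any Disallow line.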
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 15242
|
Posted: Sat Aug 26, 2006 10:26 am Post subject: Re: URLs restricted by robots.txt |
|
|
Working on a new design, nice
So everything seems normal here.
We have to disallow post because those are the message URLs (viewtopic.php?p=), the number one source of duplicates in a phpBB forum.
The same goes for the natural viewtopic.php URLs ...
Then, the message about accessing your home page is nothing to worry about. Maybe the bot could not load the page once (network failure or server reboot) and the stats have not been updated yet (they are far from being live).
Your home page is cached, after all.
I would not worry about any of this, especially your robots.txt. Allowing post URLs can look like a way to get indexed faster, and that is somewhat true, at least without mx Google sitemaps, as the message URLs do show up on the forum index (one level closer to the index than the topic URLs).
But this is not a good thing to do, because all of those would be full or partial duplicates of the topic URLs (at least the same title).
The result would be a lot more cached pages, but a lot less PageRank (because of the content dilution), and thus worse rankings in search engine results.
One URL, one page, one title: the simpler the better (bots are not as smart as we are, so we'd better not confuse them).
++
Silverado05 PR0

Joined: 01 Jul 2006 Posts: 51 Location: Texas
|
Posted: Sat Aug 26, 2006 9:48 pm Post subject: Re: URLs restricted by robots.txt |
|
|
So to sum it up, it would be best to leave things as they are and keep Disallow: /forum/post and Disallow: /forum/viewtopic.php in the robots.txt?
Another interesting thing I saw was that Googlebot had successfully accessed my home page on Dec 31, 1969. 1969??? LoL, Google wasn't even around then, much less the internet.
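A likely explanation for the 1969 date, though nothing in this thread confirms it: a "never crawled" or missing value stored as a Unix timestamp of 0 means midnight UTC on Jan 1, 1970, and any time zone west of UTC renders that instant as Dec 31, 1969. The sketch below only demonstrates that arithmetic. The function name render_epoch and the UTC-8 offset are assumptions picked for illustration, not anything taken from Google's tools.

```python
from datetime import datetime, timedelta, timezone


def render_epoch(seconds, utc_offset_hours):
    """Format a Unix timestamp as 'Mon DD, YYYY' in a fixed-offset zone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(seconds, tz).strftime("%b %d, %Y")


if __name__ == "__main__":
    # Timestamp 0 shown in UTC, then in an illustrative UTC-8 zone.
    print(render_epoch(0, 0))
    print(render_epoch(0, -8))
```

The first call gives a Jan 1970 date and the second a Dec 1969 date, which matches the "Last crawl date" shown in the Sitemaps report.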
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 15242
|
lavinya PR1


Joined: 24 Jul 2006 Posts: 161 Location: Turkey
|
Posted: Thu Sep 07, 2006 3:24 pm Post subject: Re: URLs restricted by robots.txt |
|
|
vavvv
1969