URLs restricted by robots.txt


Post by Silverado05 » Sat Aug 26, 2006 9:32 am

OK, I was going through my Google Sitemaps account, checking out the latest data. I have over 11,000 pages indexed and they seem to be valuable content, but it notified me that I have 1107 URLs restricted by robots.txt. I scanned the restricted URLs and some are legitimate, like member profiles. But I also noticed it is restricting the URLs of actual posts, e.g.:

http://www.texascampingforum.com/forum/post358.html

So wouldn't that need to be unrestricted, since it is valuable content? There are several hundred URLs like that, all pointing to posts.

Here is the robots.txt:

Code:
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /editor/
Disallow: /help/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /mambots/
Disallow: /media/
Disallow: /modules/
Disallow: /templates/
Disallow: /installation/

Disallow: /forum/viewtopic.php
Disallow: /forum/viewforum.php
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/groupcp.php
Disallow: /forum/search.php
Disallow: /forum/login.php
Disallow: /forum/post
Disallow: /forum/member
Disallow: /forum/profile.php
Disallow: /forum/memberlist.php
Disallow: /forum/faq.php



So would you recommend removing Disallow: /forum/post and/or Disallow: /forum/viewtopic.php, or will that just cause Google to index junk?

Also, I might add that it said "We can't currently access your home page because of a robots.txt restriction." Yet I am getting indexed, and the line before that says "Googlebot has successfully accessed your home page. Last crawl date: Dec 31, 1969". Any ideas as to why? FYI, my home page is a Joomla CMS site which links to a phpBB forum. They are NOT integrated; the Joomla site just links to the forum, to clear that up.

-Thanks
Silverado05
PR0
Posts: 51
Joined: Sat Jul 01, 2006 8:38 pm
Location: Texas

Post by dcz » Sat Aug 26, 2006 10:26 am

Working on a new design, nice ;)

So everything seems normal here.

We have to disallow /forum/post because those are the message URLs (viewtopic.php?p=), the number one source of duplicate content in a phpBB forum.

The same goes for the native viewtopic.php URLs.


As for the message about accessing your home page, it is nothing to worry about. Maybe the bot could not load the page once (network failure or server reboot) and the stats have not been updated yet (they are far from live).
Your home page is cached, so it is being crawled.

I would not worry about any of this, especially your robots.txt. Allowing post URLs may look like it speeds up indexing, and that is somewhat true, at least without mx Google sitemaps, since the message URLs show up on the forum index (one level higher than the topic URLs in link depth).

But this is not a good thing to do, because all of those URLs are going to be full or partial duplicates of the topic URLs (at the very least, they share the same title).
The result would be a lot more cached pages but a lot less PageRank per page (because the content is diluted), and thus worse rankings in search engine results.
One URL, one page, one title: the simpler the better (bots are not as smart as we are, so we'd better not confuse them).
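
To see why Disallow: /forum/post catches post358.html, here is a minimal sketch using Python's standard urllib.robotparser. It is only an illustration and not part of the original posts: the rule subset is copied from the robots.txt quoted above, /forum/topic12.html is a hypothetical topic URL, and Python's plain prefix matching may differ in details from what Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Subset of the forum rules quoted earlier in this thread.
# No blank line between User-agent and the rules: Python's parser
# treats a blank line as the end of a record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/viewtopic.php
Disallow: /forum/post
Disallow: /forum/member
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


paths = [
    "/forum/post358.html",         # rewritten message URL from the question
    "/forum/posting.php",          # also starts with /forum/post
    "/forum/viewtopic.php?p=358",  # native message URL
    "/forum/topic12.html",         # hypothetical topic URL, matches no rule
]
for path in paths:
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Each Disallow value is matched as a prefix of the URL path, so the single rule /forum/post covers both the rewritten post URLs and posting.php, while a topic URL that starts differently stays crawlable.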

++
dcz
Admin
Posts: 21428
Joined: Fri Apr 28, 2006 9:03 pm

Post by Silverado05 » Sat Aug 26, 2006 9:48 pm

So to sum it up, it would be best to leave things as they are and keep both Disallow: /forum/post and Disallow: /forum/viewtopic.php in the robots.txt?

Also another interesting thing I saw was Googlebot had successfully accessed my home page on Dec 31, 1969? 1969??? LoL Google wasn't even around then, much less the internet.

Post by dcz » Sun Aug 27, 2006 12:33 am

1969?
You know what, I always knew Google was a rock star :lol:


This reminds us that Google Sitemaps is still in beta. It works nicely, but things like this can happen.
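
A likely explanation for the 1969 date, though nothing in this thread confirms it: an unset crawl time stored as Unix timestamp 0 (midnight UTC on Jan 1, 1970) shows up as Dec 31, 1969 once it is displayed in a timezone west of UTC. A small Python sketch, where the -8 hour offset (US Pacific standard time) and the helper name render are assumptions chosen for illustration:

```python
from datetime import datetime, timedelta, timezone

# Timestamp 0 is the Unix epoch: 1970-01-01 00:00:00 UTC.
UNSET_TIMESTAMP = 0
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def render(ts, utc_offset_hours):
    """Format a Unix timestamp as a date in a fixed-offset timezone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    moment = EPOCH + timedelta(seconds=ts)
    return moment.astimezone(tz).strftime("%b %d, %Y")


print(render(UNSET_TIMESTAMP, 0))   # Jan 01, 1970
print(render(UNSET_TIMESTAMP, -8))  # Dec 31, 1969 (assumed -8h offset)
```

If that is what happened, the date means "no crawl time recorded yet" rather than an actual visit.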

++

Post by lavinya » Thu Sep 07, 2006 3:24 pm

Wow

1969 :shock:
lavinya
PR1
Posts: 167
Joined: Mon Jul 24, 2006 9:05 am
Location: Turkey

