Google caching of phpbb forum with advanced mod rewrite

Discussions about SEO Techniques implemented in your sites. Tests, Studies and results analysis.

Post by arch_stanton » Wed Dec 13, 2006 11:01 pm

Hi, dcz, I originally posted this in the Google Sitemaps thread and you asked me to start a new thread in here.

How long does it typically take for Google to index a site once a proper sitemap has been uploaded?

In the past it has taken a couple of months from moving to a new domain for the site to be fully indexed and cached.

At the moment the new URLs aren't showing up and the old ones are of course blocked by robots.txt.

I have seen GoogleBot spidering my site since uploading the sitemap (and it is regularly checking the sitemap as it should), but it only stays on the main index page at the moment; it doesn't appear to be accessing the other forums yet.

I'm just getting a bit nervous, as all my old URLs are dropping off Google's index.


You replied:

Did you implement HTTP 301 redirects when you changed domain?

If not, it could still be worth adding some in order to keep the old backlinks.

Anyway, with PR3, it usually takes about one month to see all the pages listed in the sitemaps in Google's cache.
The pattern is nothing for a month and then, all of a sudden, all the pages show up.

If your old URLs are still listed in Google's cache, then it is urgent to HTTP 301 redirect them.
I was able to change both of the phpBB SEO forums' URLs without losing a single visit from the search engines, so there may still be time to save some.
If it's only phpBB related, please start a new thread in the phpBB forum; if it's about another script, please post in the SEO techniques forum instead.

And we'll see what's doable.


OK, it's now two weeks since the friendly URLs and the sitemap were implemented. So far, the only new URLs showing are forums and threads that happen to be linked to from the home page. I know you said it takes a month, I hope you're right that they will all suddenly appear... :shock:

You asked if I had a 301 redirect from the old domain. I'm not sure if it was a 301 but yes, the old domain had some form of redirect and the old URLs are still indexed.

Google also still has the old URLs from the current domain (viewtopic.php?t=xxx type) indexed as well, but they are no longer cached because GoogleBot is blocked from seeing them by robots.txt.

The problem is that there are a few pages that are still linked to from other sites, so obviously I would like Google to be able to "see" these so I can restore the PageRank for these pages. Is there a way of telling Google that these pages have changed?

So if it follows a link from another site to -http://www.example.co.uk/forum/viewtopic.php?t=285 - is there a way to tell it that the page is now at this link? Some of these older pages have far more external links to them because I have only recently switched to friendly URLs.

Thanks.
Last edited by arch_stanton on Sun Jan 21, 2007 1:13 am, edited 1 time in total.
arch_stanton
PR1
PR1
 
Posts: 163
Joined: Wed Oct 04, 2006 9:48 am

Post by dcz » Thu Dec 14, 2006 12:32 am

arch_stanton wrote: I'm not sure if it was a 301 but yes, the old domain had some form of redirect and the old URLs are still indexed.


You should make sure of this. Use http://web-sniffer.net/ to check the header returned. Can you change the DNS for the old domain? Is it linked to some hosting?

Then, only use this as a robots.txt for phpbb:

Code:
User-agent: *
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewforum.php?mark=
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/groupcp.php
Disallow: /forum/profile.php
Disallow: /forum/memberlist.php
Disallow: /forum/search.php
Disallow: /forum/login.php
Disallow: /forum/faq.php
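
A minimal Python sketch of how Disallow lines like the ones above behave, assuming plain prefix matching against the path plus query string (no wildcards, no Allow lines, a single `User-agent: *` section). The helper name `is_blocked` is made up for illustration, and real crawlers may apply extra rules, so treat this as a rough model only.

```python
# Simplified model of robots.txt matching: a URL is blocked when its
# path-plus-query starts with any Disallow value. Wildcards, Allow lines
# and per-crawler sections are deliberately left out of this sketch.
DISALLOW_RULES = [
    "/forum/viewtopic.php?p=",
    "/forum/viewforum.php?mark=",
    "/forum/index.php?",
    "/forum/posting.php",
    "/forum/groupcp.php",
    "/forum/profile.php",
    "/forum/memberlist.php",
    "/forum/search.php",
    "/forum/login.php",
    "/forum/faq.php",
]


def is_blocked(path_and_query: str, rules=DISALLOW_RULES) -> bool:
    """Return True when any Disallow prefix matches the start of the URL path."""
    return any(path_and_query.startswith(rule) for rule in rules)


if __name__ == "__main__":
    for candidate in (
        "/forum/viewtopic.php?p=1234",  # single-post view, matches the first rule
        "/forum/viewtopic.php?t=285",   # topic view, matches no rule
        "/forum/index.php",             # no query string, so "index.php?" does not match
    ):
        print(candidate, "blocked" if is_blocked(candidate) else "crawlable")
```

Under this model, topic URLs of the `viewtopic.php?t=` form stay crawlable while the `?p=` form is blocked.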


And I'll PM you the zero duplicate script right now ;)

++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
____________________

Useful links (in French):
SEO forum || SEO directory || phpBB SEO || Search
dcz
Admin
Admin
 
Posts: 21424
Joined: Fri Apr 28, 2006 9:03 pm

Post by arch_stanton » Thu Dec 14, 2006 5:35 pm

dcz, as I still own the old domain, I am just using web forwarding from the registry until it expires. I typed the old domain (gladetalk.org.uk) into web-sniffer and got the following response:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="http://www.gladtalk.co.uk/">here</A>.<P>
</BODY></HTML>
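
For reference, a 302 is a temporary redirect, whereas a 301 is the permanent kind. Below is a minimal Python sketch for checking which one a URL returns without a web tool, using only the standard library. The function names `classify_redirect` and `fetch_status` are made up for illustration, and `fetch_status` assumes the server answers HEAD requests.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit

PERMANENT = {301, 308}       # "moved for good"
TEMPORARY = {302, 303, 307}  # "moved for now"


def classify_redirect(status: int) -> str:
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request (redirects are not followed) and return
    the status code together with the Location header, if any."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    for code in (301, 302, 200):
        print(code, classify_redirect(code))
```

Calling `fetch_status` on the old domain's home page and passing the returned status to `classify_redirect` would show whether the forwarding is permanent or temporary.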


Thank you for the script. What exactly does it do?

And what do I do about old-style gladtalk.co.uk URLs (i.e. ones on the current domain) like the one I mentioned in the first post? These have several inbound links - but of course Googlebot can't see them in this form because viewtopic URLs are blocked in robots.txt. Is there a way of fooling Googlebot into "seeing" them in friendly URL format?

Post by dcz » Thu Dec 14, 2006 6:13 pm

Well, if your old domain had backlinks, it could be worth keeping it a while longer and redirecting it properly.

An easy way to do this: since it seems you're paying for it, you should be able to change its DNS.
The idea would be to point it at your new hosting, if possible.

Then, the www prefix redirection (version 2) would do the rest where your new domain is installed.
The www redirection also makes sure that only the correct, unique domain is used here, so with it your old one will be redirected.
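
A rough Python sketch of the idea behind such a host redirection, under the assumption that every request arriving under a non-canonical host name is answered with a 301 to the same path on the canonical host. This is not the www prefix redirection code itself; the function name `canonical_redirect` and the host names are hypothetical.

```python
from typing import Optional, Tuple


def canonical_redirect(host: str, path: str,
                       canonical_host: str) -> Optional[Tuple[int, str]]:
    """Return (301, target URL) when the request host is not the canonical
    one, or None when the request can be served as-is."""
    if host.lower() == canonical_host.lower():
        return None
    # Same path and query, but on the single host that should be indexed.
    return 301, f"http://{canonical_host}{path}"


if __name__ == "__main__":
    canonical = "www.example.co.uk"  # hypothetical new domain
    for request_host in ("www.example.co.uk", "example.co.uk", "old-domain.example"):
        print(request_host, canonical_redirect(request_host, "/forum/", canonical))
```

With the old domain pointed at the same hosting, it falls into the "not canonical" branch like any other alias and is redirected permanently.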

As for how long you should maintain the old domain, well, it depends, mostly on the backlinks. If it's not too expensive, I'd keep it at least a year, or even longer.

Then, about your first issue: yes, the zero dupe will take care of it.
Check the examples in the zero dupe thread ;)

++

Post by arch_stanton » Thu Dec 14, 2006 6:33 pm

dcz, just to clarify re the backlinks to threads on the existing domain:

For example, -http://www.gladtalk.co.uk/forum/viewtopic.php?t=xxx
which Google can't see at the moment.

If I change this line in robots.txt from "Disallow: /forum/viewtopic.php" to "Disallow: /forum/viewtopic.php?p=", will this allow GoogleBot to see the backlinked thread again? And if I run the zero dupes script, this will prevent any dupe penalties?

Post by dcz » Thu Dec 14, 2006 6:40 pm

Exactly, this kind of URL will be HTTP 301 redirected, and bots know this means they should forget about the old one and use the new one from then on.
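
To illustrate the principle only (this is not the zero dupe script), here is a small Python sketch that maps an old-style `viewtopic.php?t=` URL to a friendly URL and answers with a 301. The lookup table, the friendly path format and the function name `redirect_for` are all hypothetical.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit

# Hypothetical lookup table: topic id -> friendly path. A real forum would
# build the path from the topic title stored in its database.
FRIENDLY_PATHS: Dict[int, str] = {285: "/forum/example-topic-t285.html"}


def redirect_for(old_url: str,
                 friendly_paths: Dict[int, str] = FRIENDLY_PATHS
                 ) -> Optional[Tuple[int, str]]:
    """Return (301, new URL) for an old-style topic URL, else None."""
    parts = urlsplit(old_url)
    if not parts.path.endswith("/viewtopic.php"):
        return None
    topic_ids = parse_qs(parts.query).get("t")
    if not topic_ids or not topic_ids[0].isdecimal():
        return None
    new_path = friendly_paths.get(int(topic_ids[0]))
    if new_path is None:
        return None
    # 301 tells crawlers the move is permanent.
    return 301, f"{parts.scheme}://{parts.netloc}{new_path}"


if __name__ == "__main__":
    print(redirect_for("http://www.example.co.uk/forum/viewtopic.php?t=285"))
    print(redirect_for("http://www.example.co.uk/forum/viewtopic.php?p=12"))
```

For the redirect to be seen at all, the old-style URL has to be crawlable, which is why the `viewtopic.php?t=` form must not be blocked in robots.txt.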

++

Post by arch_stanton » Thu Dec 14, 2006 6:49 pm

dcz wrote: Exactly, this kind of URL will be HTTP 301 redirected, and bots know this means they should forget about the old one and use the new one from then on.

++

Yeah, but if it forgets the backlinked ones, I lose the page ranking for those pages surely?

The thread I mentioned in the opening post has lots of backlinks in the viewtopic.php?t=xxx form but none in the friendly URL form, so it's getting no ranking at all.

Post by dcz » Thu Dec 14, 2006 9:32 pm

Actually, PR and backlinks are passed on through HTTP 301 redirects. That's the whole point of them.

++

Post by arch_stanton » Sun Dec 17, 2006 11:54 am

OK, dcz, have finally implemented zero dupe and changed the robots.txt. Will let you know how I get on. :)

Post by dcz » Sun Dec 17, 2006 12:08 pm

To accurately track your results, you can post a small report ;)

++

Post by arch_stanton » Mon Dec 18, 2006 10:56 pm

dcz wrote: To accurately track your results, you can post a small report ;)

++

Will do, but in the meantime I'm a bit concerned that GoogleBot is still only looking at the forum index after two weeks. Shouldn't it be looking at the other forums and subforums by now?

Inktomi and MSN have got used to the friendly URLs and are spidering all the forums, so why is Google ignoring the sitemap?

Post by dcz » Mon Dec 18, 2006 10:58 pm

Well, it can take longer than that with Google, especially if you experienced some trouble while migrating domains, but it should index your site. Usually, the effect of Google Sitemaps shows up about a month after you first submit a working sitemap.

++

Post by arch_stanton » Tue Dec 19, 2006 1:25 pm

dcz wrote: Then, only use this as a robots.txt for phpbb:

Code:
User-agent: *
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewforum.php?mark=
Disallow: /forum/index.php?
Disallow: /forum/posting.php
Disallow: /forum/groupcp.php
Disallow: /forum/profile.php
Disallow: /forum/memberlist.php
Disallow: /forum/search.php
Disallow: /forum/login.php
Disallow: /forum/faq.php

dcz, shouldn't I also have forum/member and forum/post disallowed in the robots.txt?

Post by dcz » Tue Dec 19, 2006 3:44 pm

Well, this is indeed suggested in the release thread of the mod rewrite you're using ;)

Post by arch_stanton » Tue Dec 19, 2006 10:58 pm

dcz wrote: Well, this is indeed suggested in the release thread of the mod rewrite you're using ;)

Oh.

When you said "only use this as a robots.txt for phpbb", I took that literally to mean that only those lines should be in the robots.txt and nothing else.

I have re-inserted the forum/member and forum/post lines. Are there any other ones that I should put back in?
