HB phpBB SEO Team

Joined: 16 Oct 2006 Posts: 795
Posted: Mon Feb 05, 2007 2:12 am Post subject: Migrating old style url references to rewritten URLs
For those who have sites with lots of internal links, have you considered migrating the viewtopic.php-style [url] tags and clickable links to the rewritten format to save redirections? I point site visitors to existing content all the time, so there are a lot of these references. Visitors will see slower response times if they are forced through an extra round trip and, more importantly, those pesky bots that mindlessly follow every link will burn more server cycles only to discover that nothing has changed.
Seriously, I've looked at Google's webmaster crawl stats for my site. Even with sitemaps and a gaggle of robots.txt rules, Googlebot burns a hole a mile wide in my server's bandwidth. And Yahoo's bot? Sheesh, it blindly re-reads threads that haven't changed in over a year with irritating regularity. I'm a little worried all this redirection is going to hurt the site's overall performance.
EDIT: Wrong forum, please move to 'zero dupe'... sorry about that.
_________________ Dan Kehn |
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 13354
Posted: Mon Feb 05, 2007 9:13 am Post subject: Re: Migrating old style url references to rewritten URLs
Well, yes, it can be a good thing, and we are thinking about something along those lines, but development has not started yet.
But do not worry too much: HTTP 301 really means "moved permanently", so once you start redirecting bots, they'll figure out quite quickly that they need to forget about the older URLs.
On phpBB SEO, when I moved both forums back to the www domain, I saw Google and Yahoo crawling like mad the first day, still excited the second, and mostly back to normal the third.
The storm can be a bit violent the first day, but it does not last.
The mod has been installed on pretty big forums, bigger than this one, and it worked nicely.
To me, if your server can cope with loading your pages, it should cope with redirecting them all, since the bigger the forum, the bigger the server.
++
_________________ Useful links :
SEO Forum || SEO Directory || SEO phpBB || SEO phpBB3 || Search
HB phpBB SEO Team

Joined: 16 Oct 2006 Posts: 795
Posted: Mon Feb 05, 2007 12:11 pm Post subject: Re: Migrating old style url references to rewritten URLs
dcz wrote: "The storm can be a bit violent the first day, but it does not last."
Oh, are you saying that if a bot sees -www.example.com/forums/viewtopic.php?t=1234 and gets 301-redirected to -www.example.com/forums/blah-blah-blah-t1234.html, it will update its index and also record "don't bother following it" for the next time it sees the same URL? It makes sense that a smart bot would do both. If that's true, then eventually the incoming traffic on "old" format URLs would only be user clicks, never bots. Interesting.
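For readers who want to see the mapping spelled out, here is a minimal sketch of the old-to-new translation behind such a 301, written in Python purely for illustration. It is not the mod's code: the `TOPIC_TITLES` lookup, the `slugify` rule and the `redirect_for` helper are all assumptions made up for this example, and a real board would read titles from its database and apply its own title-to-URL rules.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical topic titles keyed by topic id (stand-in for a database lookup).
TOPIC_TITLES = {1234: "Blah blah blah"}


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def redirect_for(url):
    """Return (301, new_url) for an old-style topic URL, or None if it does not apply."""
    parts = urlsplit(url)
    if not parts.path.endswith("/viewtopic.php"):
        return None
    topic = parse_qs(parts.query).get("t")
    if not topic or not re.fullmatch(r"[0-9]+", topic[0]):
        return None
    topic_id = int(topic[0])
    title = TOPIC_TITLES.get(topic_id)
    if title is None:
        return None  # unknown topic: leave the request alone
    directory = parts.path.rsplit("/", 1)[0]
    new_url = f"{parts.scheme}://{parts.netloc}{directory}/{slugify(title)}-t{topic_id}.html"
    return 301, new_url
```

Whether a given bot then remembers the redirect or simply follows it again on the next visit is up to the bot; the sketch only covers the server side.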
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 13354
HB phpBB SEO Team

Joined: 16 Oct 2006 Posts: 795
Posted: Tue Feb 06, 2007 9:08 pm Post subject: Re: Migrating old style url references to rewritten URLs
dcz wrote: "Exactly, it does not mean the older link will stop being taken into account by bots, but they'll acknowledge the fact that the older one was updated and will quite quickly use the new one only."
Understood. Bots don't seem to take webmasters at their word when they say "moved permanently"... they'll update their index, but if they see an "old" link, I would expect the bot to check it again, if only because the webmaster could have changed their mind.
I'm not worried about the performance hit of incoming old format links, since they're not that plentiful. But within the site there are lots and lots of old format links, so the bots are going to burn more time chasing zero dup'd links. The choices I see are (a) ignore it and hope the performance cost is minimal, or (b) write code to scan and translate existing intra-site references to the new format so Mr. Googlebot and his buddy Mr. Slurp are more efficient.
Imponderable thought of the day: Why do they need to traverse practically 1/3 of all threads EVERY SINGLE DAY... c'mon guys, look at the sitemap timestamps, pretty please? What's even more irritating is that Googlebot reads them and then takes weeks to update the actual index content; Slurp at least does both, according to its Site Explorer.
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 13354
Posted: Tue Feb 06, 2007 10:56 pm Post subject: Re: Migrating old style url references to rewritten URLs
Well, many criteria play a role here.
First, you can slow down Googlebot's crawl rate a bit in your sitemaps account, even though I think it's best to let Google decide for itself.
The older links will continue to be followed by bots, but the bots will acknowledge that they need to index the target instead, so they should not add the old URL to their planned crawl again.
At this stage there are two possibilities, given that bots follow 301 redirections as they should (and they do): they can either keep a translation table and remember redirected links when they crawl a page, or just continue to crawl the older URLs occasionally, when they crawl a page and start following the links in it.
I think the second is the more likely. It just means they won't add these URLs to their planned crawl and won't list them in their SERPs.
To them, the link only exists where it is actually online; there is no persistence, because a translation table built on top of standard URLs could end up being impossible to manage.
And in both cases, they should never hit an old link as a first try, nor bring visitors to it.
So yes, there will always be some redirecting as long as the old links are still online. But if you consider that all the dynamic links have changed, that all new posts will use the new ones, and that bots will never start crawling from the old links but will most likely only reach them by following a link found in a page, it's not that bad.
A script to edit the db could be really handy, and I do plan to work on such script, just a matter of time
HB phpBB SEO Team

Joined: 16 Oct 2006 Posts: 795
Posted: Tue Feb 06, 2007 11:57 pm Post subject: Re: Migrating old style url references to rewritten URLs
dcz wrote: "A script to edit the db could be really handy, and I do plan to work on such script, just a matter of time"
Better yet, do a quick scan of the post text in viewtopic.php. If it contains an old style URL (http:// + server + script path + viewtopic | viewforum), correct it on the fly and write the post text back to the database. The topics would then be migrated automatically as they are browsed by users / bots.
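A rough sketch of that on-the-fly correction, in Python for illustration only (the board itself is PHP). `BOARD_URL`, `TOPIC_TITLES`, `slugify` and `migrate_post_text` are hypothetical names invented for this example, only plain `viewtopic.php?t=N` links are handled (viewforum links and URLs with extra parameters are left untouched), and the caller is assumed to write the text back to the database only when `changed` is true.

```python
import re

# Assumed board location (http:// + server + script path) and a stand-in title lookup.
BOARD_URL = "http://www.example.com/forums/"
TOPIC_TITLES = {1234: "Blah blah blah"}

# Old-style topic link with a single t= parameter; the lookahead skips URLs
# that carry further digits, extra parameters or a fragment.
OLD_TOPIC_URL = re.compile(
    re.escape(BOARD_URL) + r"viewtopic\.php\?t=([0-9]+)(?![0-9&;#])"
)


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def migrate_post_text(text):
    """Return (new_text, changed) with old-style topic URLs rewritten."""
    if "viewtopic.php" not in text:
        return text, False  # cheap path for posts with nothing to migrate

    def replace(match):
        title = TOPIC_TITLES.get(int(match.group(1)))
        if title is None:
            return match.group(0)  # unknown topic: keep the old link
        return f"{BOARD_URL}{slugify(title)}-t{match.group(1)}.html"

    new_text = OLD_TOPIC_URL.sub(replace, text)
    return new_text, new_text != text
```

The substring test at the top is what keeps misses cheap: posts without any old-style link never reach the regular expression or the database write.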
If you want extra credit, add an ACL option to check URLs for consistency from time to time. That is, run the same check as before, but look for http:// + server + script path + blah-blah-blah-t1234.html when the post is less than a few days old (assuming a title is more likely to change early on, because moderators correct poor title choices). If the title has changed, write the updated post text back to the database.
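The consistency check could look something like the following Python sketch. Everything named here is an assumption for illustration: `MAX_AGE` stands in for "a few days", `TOPIC_TITLES` holds the current (possibly corrected) titles, and the URL pattern assumes rewritten links end in `-t<id>.html` as in the example above.

```python
import re
from datetime import datetime, timedelta

# Assumed board location, current titles and age limit.
BOARD_URL = "http://www.example.com/forums/"
TOPIC_TITLES = {1234: "A better title"}
MAX_AGE = timedelta(days=3)

# Rewritten topic link: <slug>-t<id>.html under the board URL.
NEW_TOPIC_URL = re.compile(re.escape(BOARD_URL) + r"([a-z0-9-]+)-t([0-9]+)\.html")


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def refresh_recent_links(text, posted_at, now):
    """Return (new_text, changed); only posts younger than MAX_AGE are checked."""
    if now - posted_at > MAX_AGE:
        return text, False  # old post: assume its links have settled

    def replace(match):
        title = TOPIC_TITLES.get(int(match.group(2)))
        if title is None:
            return match.group(0)
        # Rebuild the link from the current title; identical if nothing changed.
        return f"{BOARD_URL}{slugify(title)}-t{match.group(2)}.html"

    new_text = NEW_TOPIC_URL.sub(replace, text)
    return new_text, new_text != text
```

As with the migration step, the post text only needs to be written back when `changed` is true.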
That would isolate the "migration" code to one place and make it self-correcting.
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 13354
HB phpBB SEO Team

Joined: 16 Oct 2006 Posts: 795
Posted: Wed Feb 07, 2007 1:27 am Post subject: Re: Migrating old style url references to rewritten URLs
dcz wrote: "What is another thing is dealing with big db, and time as always"
That's my point: doing it as you go, with a quick scan of the current text for viewtopic.php / viewforum.php, costs next to nothing for misses and benefits big boards the most, because you never touch the database for "old" threads that aren't read. A mass migration of a big board could take a long, long time. The recent-timestamp check would cover the case where titles are updated early in the thread's life and never thereafter, again benefiting big boards the most.
On the other hand, if you want a good model for a step-wise migration, check out Rebuild Search. It is a slick implementation that could be reused by those like me who rebuild the search tables from time to time (I do it to update the stopwords and add synonyms so search / related topics work optimally).
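For the step-wise alternative, here is a minimal Python sketch of one way to migrate in resumable batches. It is not how Rebuild Search is implemented; it only assumes that "step-wise" means handling a fixed number of posts per run and remembering the last post id processed. The `posts` table, its `post_id` / `post_text` columns, `BATCH_SIZE` and the `rewrite` callback are hypothetical.

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny for the example; a real run would use far more


def migrate_batch(conn, last_post_id, rewrite):
    """Rewrite up to BATCH_SIZE posts with post_id > last_post_id.

    Returns the highest post_id examined, or None once no posts remain.
    """
    rows = conn.execute(
        "SELECT post_id, post_text FROM posts "
        "WHERE post_id > ? ORDER BY post_id LIMIT ?",
        (last_post_id, BATCH_SIZE),
    ).fetchall()
    for post_id, text in rows:
        new_text = rewrite(text)
        if new_text != text:  # only touch rows that actually change
            conn.execute(
                "UPDATE posts SET post_text = ? WHERE post_id = ?",
                (new_text, post_id),
            )
    conn.commit()
    return rows[-1][0] if rows else None
```

Calling `migrate_batch` repeatedly, feeding each returned id into the next call until it returns `None`, walks the whole table without holding it in memory, and the last returned id can be stored between runs so the job resumes where it stopped.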
dcz Administrateur - Site Admin

Joined: 28 Apr 2006 Posts: 13354