Locking the www in urls

Postby dcz » Fri May 19, 2006 9:49 pm

I thought it would be wise to start talking a bit about the www in URLs.

The WWW :
    First, http://www.example.com is actually a sub domain of example.com, even if most of the time it is set up by default to behave exactly like the main domain.

    The problem is that the bare domain will always work as well, so anyone can load both example.com and http://www.example.com if nothing is done.

    Here comes the Search Engine Optimisation matter: those two are duplicates and, worse, by default every URL of a site will have a duplicate, because you can always keep or drop the www.
    The www sub domain is linked to the same hosting as the domain.

    Some bots do try dropping the www to load pages, even if they followed a link with the www. And people can post such links too, so in the end, if you do nothing, you are very likely to find duplicate indexing in search engine results and to end up with a lower PageRank.

    The solution is to redirect with a 301 HTTP header so that only one of the two is used.

What to choose ?
    The Internet is all about standards. Search Engine Bots do follow standards, or at least are built according to them. So the obvious choice here is to always keep the www.

    And it's not only about Bots understanding that this is a main domain and not a sub domain (thus more important ?).
    Did I say standards ? http://www.example.com vs example.com . Which one do you think will have more weight ?

    phpBB is not the only place to prefer the www: almost every form able to auto-create links does it using a RegEx based on the www. No www, no automatically active link ;)
    One would have to post http://example.com , which you will admit is less likely.

    Some sites even add the www in front of their sub domains, but it can become a bit long in the end, which is something to take care of as well in URLs.

The solutions :
    Apache mod Rewrite :

      As usual, Apache mod_rewrite is the perfect solution.
      The idea here is to first check whether the www is present in the requested URL and, if not, to add it through a 301 HTTP redirection. Users won't even notice the difference, and Bots will "know" that the www URL is the only one to use.

      The problem here is that several approaches are possible, and the working solution depends on some server settings.

      First, we can test whether the www is missing either by checking that the requested host is different from http://www.example.com or by checking that it is equal to example.com.
      Then, since the "." is a special character in mod_rewrite patterns, we should escape it with a "\", but I did see some cases in which escaping prevented the rule from working for all of the web site's sub folders.
      These differences mostly come from hosting companies building custom releases of Apache and mod_rewrite.

      In the end, one has to perform some quick testing with the following examples.

      The proposed RewriteCond and RewriteRule should of course be placed in your root's .htaccess, right after :
      Code: Select all
      RewriteEngine on


      First method : URL = example.com.

      Code: Select all
      RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
      RewriteRule ^(.*)$ http://www.example.com/$1 [QSA,L,R=301]


      Note that if you experience problems with this code on your server, you can try getting rid of the "\", the $ and the [NC] in the RewriteCond.
      The advantage of this method is that you won't interfere with sub domains that could be physically hosted within the main domain's FTP account.

      Second method : URL != www.example.com.

      Code: Select all
      RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
      RewriteRule ^(.*)$ http://www.example.com/$1 [QSA,L,R=301]


      The same thing applies here with the "\", the $ and the [NC] in the RewriteCond.
      If for some reason this is the only working solution for your site and you have to deal with sub domains, then you can just tweak the RewriteCond as follows :
      Code: Select all
      RewriteCond %{HTTP_HOST} !^(www|sub1|sub2|sub3)\.example\.com$ [NC]

      And keep the RewriteRule. Any host not showing up in the list will be redirected to http://www.example.com/uri with a 301 HTTP header.
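
Put together, the second method with the sub domain list would look roughly like this. It is only a sketch assembled from the two pieces above; sub1, sub2 and sub3 are placeholder names, and the same caveats about the "\", the $ and the [NC] apply.

```apache
RewriteEngine on
# Hosts listed here are left alone (sub1, sub2 and sub3 are placeholders)
RewriteCond %{HTTP_HOST} !^(www|sub1|sub2|sub3)\.example\.com$ [NC]
# Any other host is sent to the www host with a permanent redirect
RewriteRule ^(.*)$ http://www.example.com/$1 [QSA,L,R=301]
```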

    PHP redirection :

      For those not lucky enough to run Apache with mod_rewrite, there are two solutions :
      1) Change hosting :)
      2) Try using the following PHP script ;)

      Code: Select all
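      // Host and URI of the current request
      $req_uri = $_SERVER['REQUEST_URI'];   // starts with "/", includes the query string
      $req_domain = $_SERVER['HTTP_HOST'];
      // First label of the host name, e.g. "www" for www.example.com
      $sub = substr( $req_domain, 0, strpos($req_domain, ".") );
      if ($sub != 'www')
      {
         // Permanent redirect to the www host, keeping the requested URI
         header("Status: 301 Moved Permanently", false, 301);
         header("Location: http://www.example.com$req_uri");
         exit();
      }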


      But this could interfere with some processes if not applied early enough. For phpBB, a good place should be in common.php, before the "?>".

    You can also apply the same principles to force the URL not to use the www, even if it is certainly less interesting as far as Search Engine Optimization (SEO) is concerned.
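
The reverse direction could be sketched as follows, assuming the same example.com placeholder and the same root .htaccess. It simply mirrors the first method with the two host names swapped, and has to be tested on each server like the other examples.

```apache
RewriteEngine on
# Match only requests that arrived on the www host
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
# Send them to the bare domain with a permanent redirect
RewriteRule ^(.*)$ http://example.com/$1 [QSA,L,R=301]
```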


Last edited by dcz on Fri Mar 16, 2007 12:16 pm, edited 3 times in total.
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
____________________

Liens Utiles :
Forum référencement || Annuaire référencement || Référencement phpBB || Recherche
dcz
Admin
Posts: 19930
Joined: Fri Apr 28, 2006 9:03 pm

Postby Peter77 » Sat May 20, 2006 1:20 am

dcz wrote:
Hope you'll find a working turnaround to lock the www, since in the end it really can make the difference in Page Ranking and thus search engines results.


Well, now I'm worried. Every type of RewriteCond works fine, but once I try a different RewriteRule, my site goes crazy... like I've mentioned before. I tried all combinations of the RewriteRule, without the $, the [NC] and the \, etc.

And I tried the fix for common.php, but that one redirects every single URL of phpBB back to my portal. Obviously my server supports mod_rewrite, otherwise I probably wouldn't be getting SEO friendly URLs, I think. Any more suggestions?
Peter77
phpBB SEO Team
Posts: 520
Joined: Wed May 10, 2006 9:46 am
Location: Michigan

Postby dcz » Sat May 20, 2006 1:37 am

Well, the www thing applies to every type of URL; the duplicate issue concerns all cases.

By the way, I noticed a small typo in the PHP script, even though I am sure we can make it work with mod_rewrite. It's just that we have to find out how :D

So did you try with :

Code: Select all
Options +FollowSymlinks
RewriteEngine on
RewriteBase /



instead of just :
Code: Select all
RewriteEngine on


And, another thing, try to only use one single .htaccess file at the root level for all RewriteRules.

Keep faith ;) , we'll make it :D

Postby Peter77 » Sat May 20, 2006 1:51 am

dcz wrote:Well, the www thing applies to every type of URL; the duplicate issue concerns all cases.

By the way, I noticed a small typo in the PHP script, even though I am sure we can make it work with mod_rewrite. It's just that we have to find out how :D

So did you try with :

Code: Select all
Options +FollowSymlinks
RewriteEngine on
RewriteBase /



instead of just :
Code: Select all
RewriteEngine on


And, another thing, try to only use one single .htaccess file at the root level for all RewriteRules.

Keep faith ;) , we'll make it :D


Yep, that's what I have in my .htaccess file.

Code: Select all
Options +FollowSymlinks
RewriteEngine on
RewriteBase /


I tried with just RewriteEngine on and with all three lines. Still the same. I did have other .htaccess files in both the mxBB and phpBB includes folders (deny from all), and I even deleted those and it's still the same thing.
Where is the typo?

Postby dcz » Sat May 20, 2006 1:52 am

It was on this line :

Code: Select all
   header("Location: http://www.example.com$req_uri");


The deny rules should not cause any trouble.

Postby sceltic » Sat May 20, 2006 2:04 pm

Code: Select all
<Files .htaccess>
Order allow,deny
deny from all
</Files>
Options +FollowSymlinks
RewriteEngine On
RewriteBase /


How about that to start?
sceltic
PR0
Posts: 55
Joined: Thu May 04, 2006 4:07 pm

Postby dcz » Sat May 20, 2006 2:12 pm

The important thing here is that the code used does the job we want.

Code: Select all
<Files .htaccess>
Order allow,deny
deny from all
</Files>



is only needed if your server settings allow viewing the .htaccess content. .htaccess files should be neither viewable nor editable except through FTP access.
There could be some useful info for hackers in there, such as the path to .htpasswd, and it's worse if one can edit it.

Just make sure you cannot load www.example.com/.htaccess and that the .htaccess is not chmoded to 777.
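
On Apache 2.4 and later, the Order / deny from all directives shown above are only available through mod_access_compat, and the native equivalent uses Require. A minimal sketch, assuming Apache 2.4 with mod_authz_core (which version a given host runs is not stated in this thread):

```apache
# Apache 2.4+ equivalent of the Order allow,deny / deny from all block
<Files ".htaccess">
    Require all denied
</Files>
```

Many default server configurations already block files whose names start with .ht, in which case this block is redundant but harmless.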

Then, for the rest of the code you posted, if it works like this, then it's OK.

Postby sceltic » Sat May 20, 2006 6:23 pm

Sorry for posting in two places

This might be working! but do you think it looks ok?

Code: Select all
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www [NC]
RewriteRule (.*) http://www.abc.com/$1 [R=301,L]


followed of course by the specific forum rules


what does the extra ^ before the (.*) do?

Postby dcz » Sat May 20, 2006 6:31 pm

sceltic wrote:Sorry for posting in two places

This might be working! but do you think it looks ok?

Code: Select all
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www [NC]
RewriteRule (.*) http://www.abc.com/$1 [R=301,L]


followed of course by the specific forum rules


what does the extra ^ before the (.*) do?


Well, you need more than just www in the RewriteCond, but besides that it can work as well. The ^ only anchors the pattern to the start of the requested path, and since (.*) already matches everything, the code without the ^ works on my servers too, but it is stricter to include it.

If it ends up only working without it in your case, then you've got it ;)

Postby Peter77 » Sat May 20, 2006 7:13 pm

dcz, what could be the problem with my site not accepting the RewriteRule? Is it bad hosting? Maybe I can write to my hosting's support... maybe something needs to be done in order for it to work on my site?

Postby dcz » Sat May 20, 2006 7:18 pm

Well, before that, try all combinations :D Have you tried without the ^ ?

I know this can look confusing, but this particular matter is hard to deal with, especially with so many different server settings. Anyway, since mod_rewrite is on, there must be a way to achieve this goal.

Don't worry, we have time. It is not the worst thing that can happen if, for some time, you see some URLs without the www being used on your site.

It's just something to take care of, because it will be better once done.

Postby Peter77 » Sat May 20, 2006 10:06 pm

I think this works!

Code: Select all
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_HOST} ^site.com/  [NC]
RewriteRule ^(.*) http://www.site.com/$1 [QSA,R=301,L]


With no .css, login.php or image errors. I just added a "/" next to .com in the RewriteCond. Can this be used?


So in the sitemap.php URLs eventually point to http://example.com/phpbb/viewforum.php?f=63 do these get indexed even though we have viewforum.php and viewtopic.php blocked in robots.txt?

Postby dcz » Sun May 21, 2006 10:10 am

Peter77 wrote:I think this works!


Hurray :D

I told you there must be a working solution for your case. My problem in this particular case is that my hosting is just too good and accepts everything that is supposed to work ;) No problem at all with the "/".

Peter77 wrote:So in the sitemap.php URLs eventually point to http://example.com/phpbb/viewforum.php?f=63 do these get indexed even though we have viewforum.php and viewtopic.php blocked in robots.txt?


No link should be missing the www prefix in your case, so I don't get it.

The sitemap.php is supposed to output rewritten URLs following your standard for Google. (Look in the contrib folder of the release and use the correct files for your case; re-download it first, since there once were some errors in the premodded files.)

With the www trick, a URL like http://example.com/phpbb/viewforum.php?f=63 will just be redirected to http://www.example.com/phpbb/viewforum.php?f=63 and users won't even notice. And it should be the same with rewritten URLs.

Postby Peter77 » Sun May 21, 2006 11:01 am

Sorry, what I meant was that I happened to look at my sitemap.php and I see http://www.example.com/forum-sitemap-63.xml

I then copy and paste that into my browser and it takes me to another set of URLs, but this time with viewforum.php at the end.
http://www.example.com/sucasa/viewforum.php?f=63

And since I have viewforum.php blocked in robots.txt, I wondered if Google will be able to view or index it. But then you say that there were some errors in the release of the sitemap mod? So OK... I will re-download it and find the changes.


Thanks!

Postby dcz » Sun May 21, 2006 11:29 am

I am a bit confused here, because when I try to load your web site's URL without the www, it works without adding the www.
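
A possible explanation, offered here as an assumption rather than something confirmed in this thread: %{HTTP_HOST} only holds the host name sent by the browser (optionally followed by a port), never a path, so a pattern ending in "/" such as ^site.com/ cannot match and the RewriteRule is simply skipped. That would account for both the absence of errors and the absence of a redirect. A sketch of the same rules without the "/", reusing the site.com placeholder from the post above:

```apache
Options +FollowSymlinks
RewriteEngine on
RewriteBase /
# HTTP_HOST never contains a "/", so the pattern stops at the end of the host name
RewriteCond %{HTTP_HOST} ^site\.com$ [NC]
# Permanent redirect to the www host, keeping the requested path
RewriteRule ^(.*)$ http://www.site.com/$1 [QSA,R=301,L]
```

Whether this variant runs without the side effects reported earlier still has to be tested on the server in question.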

Then, when I load your Google sitemap URL, the output is not exactly the same as what I experience on other sites. The code is correct, but the output is not shown the same way for the sitemapIndex as for the sitemaps.

Have you changed something in the files ? Sorry, I did not open the mx Google sitemap support thread, but if you experience further problems with this mod, just start a new thread, since this is about its mod rewrite adaptation.

Then, you are not using the premodded files for phpBB SEO mod rewrite.
Re-download the pack and look in contrib/moded_4_mod_rewrites/phpBB_SEO_mod_Rewrites ; you should use the phpBB advanced mod rewrite files instead of the regular ones. This will change all your URLs from viewtopic.php?t=3381&start=15 to topic-title-vt3381-15.html, which are the only URLs to consider in your case ;)