phpbb seo why not support unicode?

Postby h4ck3r » Mon Jun 08, 2009 4:50 am

Joomla and WordPress support Unicode in SEO URLs. Why doesn't phpBB SEO?
h4ck3r
 
Posts: 9
Joined: Sat Jun 06, 2009 4:19 am

Re: phpbb seo why not support unicode?

Postby dcz » Mon Jun 08, 2009 8:58 am

Because in our experience, not all browsers handle that properly, and UTF-8 URL-encoded URLs can end up being very long.

You can, however, tweak the phpbb_seo::format_url method (in phpbb_seo_class.php) to use the title without formatting it (or with very little formatting, such as removing spaces). That way all characters are included in URLs, but I would not recommend it.
It would additionally require tweaking all RewriteRules, replacing [a-z0-9_-] with . (a dot).
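
To make the length concern concrete, here is a minimal sketch of the "very little formatting" approach. The helper minimal_slug() and the sample title are hypothetical and are not part of phpBB SEO; the sketch only shows how much a Cyrillic slug grows once it is percent-encoded.

```php
<?php
// Hypothetical helper, not part of phpBB SEO: keeps every character of the
// title and only turns runs of whitespace into a single hyphen.
function minimal_slug($title)
{
    return preg_replace('`\s+`u', '-', trim($title));
}

// Illustrative title, chosen for this example only.
$slug    = minimal_slug('Привет мир');
$encoded = rawurlencode($slug); // the form that is actually sent in the request

echo $slug, "\n";                  // Привет-мир
echo $encoded, "\n";
echo strlen($encoded), " bytes\n"; // 55 bytes for a 10-character slug
```

Each Cyrillic letter takes two bytes in UTF-8 and each byte becomes a three-character %XX escape, so one letter costs six characters in the encoded URL.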

dcz
Admin
 
Posts: 21376
Joined: Fri Apr 28, 2006 9:03 pm

Re: phpbb seo why not support unicode?

Postby whocarez » Mon Jul 19, 2010 9:52 am

I tried to use Cyrillic letters with SEO mod 0.6.4 and phpBB 3.0.7-PL1 on nginx 0.7.64.

In phpbb_seo_class.php I changed this line:
$this->RegEx['url_find'] = array('`&([a-z]+)(acute|grave|circ|cedil|tilde|uml|lig|ring|caron|slash);`i', '`&(amp;)?[^;]+;`i', '`[^a-z0-9]`i');

to this:
$this->RegEx['url_find'] = array('`&([a-zа-я]+)(acute|grave|circ|cedil|tilde|uml|lig|ring|caron|slash);`i', '`&(amp;)?[^;]+;`i', '`[^a-zа-я0-9]`i');

I also added "а-я" to the rewrite rules in .htaccess.
In the end I get URLs like this:
"%D0-%D1-%D0%BD%D0%BE%D0%BA"
and like this:
"�-�-нок-�-ел�-�"
So I presumably have to change this part of phpbb_seo_class.php:
   function seo_url_encode( $url ) {
      // can be faster to return $url directly if you do not allow more chars than
      // [a-zA-Z0-9_\.-] in your usernames
      // return $url;
      // Here we handle the "&", "/", "+" and "#" case proper ( http://www.php.net/urlencode => http://issues.apache.org/bugzilla/show_bug.cgi?id=34602 )
      static $find = array('&', '/', '#', '+');
      static $replace = array('%26', '%2F', '%23', '%2b');
      return rawurlencode(str_replace( $find, $replace, utf8_normalize_nfc(htmlspecialchars_decode(str_replace('&', '%26', rawurldecode($url))))));
   }


If that is right: where do I get the proper codes for Cyrillic letters?
If not: what else can I do to get proper Cyrillic links? :-)
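
A likely explanation for the broken output, sketched under stated assumptions: the pattern has no /u modifier, so PCRE matches single bytes. In UTF-8, "а-я" is the byte sequence D0 B0 - D1 8F, which the character class reads as the byte D0, the byte range B0-D1, and the byte 8F. Any other high byte is replaced, which cuts multi-byte letters in half. The sample title and the "-" replacement below are assumptions for illustration; the post does not state them.

```php
<?php
// Replays the modified pattern from the post. There is no /u modifier, so
// PCRE matches single bytes and "а-я" is read as bytes D0, B0-D1 and 8F.
function strip_bytewise($title)
{
    // Assumption: rejected bytes are replaced with a hyphen.
    return preg_replace('`[^a-zа-я0-9]`i', '-', $title);
}

// Illustrative title (an assumption; the post does not say which one was used).
$result = strip_bytewise('Сынок');

echo rawurlencode($result), "\n"; // %D0-%D1-%D0%BD%D0%BE%D0%BA
```

Here "С" (D0 A1) and "ы" (D1 8B) lose their second byte because A1 and 8B fall outside the accepted bytes, while "нок" survives intact.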
whocarez
 
Posts: 16
Joined: Mon Mar 29, 2010 6:36 pm

Re: phpbb seo why not support unicode?

Postby whocarez » Mon Jul 19, 2010 12:01 pm

I found a solution:
Instead of "а-я", I had to use "\x7f-\xff":

    $this->RegEx['url_find'] = array('`&([a-z\x7f-\xff]+)(acute|grave|circ|cedil|tilde|uml|lig|ring|caron|slash);`i', '`&(amp;)?[^;]+;`i', '`[^a-z\x7f-\xff0-9]`i');

I made the same change in .htaccess.

It works here with Chrome, IE, Firefox, Safari and Opera under Linux and Windows.
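
The byte range works because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so "\x7f-\xff" lets all of them through untouched. A minimal sketch of that idea follows, next to an alternative that is not from this thread: matching on code points with the /u modifier and Unicode property classes. Both function names are hypothetical, and the added "+" quantifier and trim() are simplifications, not phpBB SEO behavior.

```php
<?php
// Byte-range approach from the post: keep every byte >= 0x7f, so multi-byte
// UTF-8 sequences pass through untouched (no /u modifier, byte matching).
function slug_keep_high_bytes($title)
{
    return trim(preg_replace('`[^a-z\x7f-\xff0-9]+`i', '-', $title), '-');
}

// Alternative (an assumption, not from the thread): match on code points with
// the /u modifier and keep only Unicode letters and digits.
function slug_unicode_aware($title)
{
    return trim(preg_replace('`[^\p{L}\p{N}]+`u', '-', $title), '-');
}

echo slug_keep_high_bytes('Сынок и отец'), "\n"; // Сынок-и-отец
echo slug_unicode_aware('Сынок и отец'), "\n";   // Сынок-и-отец

// The two differ on non-ASCII punctuation: the byte range keeps it,
// the Unicode-aware class strips it.
echo slug_keep_high_bytes('«Сынок»'), "\n"; // «Сынок»
echo slug_unicode_aware('«Сынок»'), "\n";   // Сынок
```

The byte-range version is the smaller change to the existing pattern, but it accepts any non-ASCII byte, including punctuation and invalid sequences. The /u version is stricter, at the cost of requiring valid UTF-8 input.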
whocarez
 
Posts: 16
Joined: Mon Mar 29, 2010 6:36 pm

