About special characters replacement

phpBB3 SEO Advanced mod Rewrite support forum.
This mod performs URL rewriting for phpBB, injecting forum and topic titles into their URLs.

Moderator: Moderators

Postby Lesiu » Mon Mar 03, 2008 11:47 am

I wrote some information and instructions for Polish users. You can read them here: przyjazne adresy w phpBB3.
Lesiu
 
Posts: 2
Joined: Mon Jan 14, 2008 4:17 pm


Postby dcz » Tue Mar 11, 2008 10:04 am

Great job :D

I'm pretty sure, though, that the approach I started implementing in this thread, with all custom replacements done in a single str_replace() call, is faster than yours.

Yours works, of course, but you may want to consider this approach for an even more efficient solution.
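
To make the comparison concrete, here is a minimal sketch of the two styles being discussed. Both function names are hypothetical and not part of the phpBB SEO mod, and the sketch only shows that the two forms give the same result; it does not measure which one is faster.

```php
<?php
// Illustrative only: two ways to strip a few diacritics from a title.
// replace_chained() and replace_single() are hypothetical helpers.

// One str_replace() call per letter pair.
function replace_chained(string $url): string
{
    $url = str_replace(array('č', 'Č'), 'c', $url);
    $url = str_replace(array('š', 'Š'), 's', $url);
    $url = str_replace(array('ž', 'Ž'), 'z', $url);
    return $url;
}

// A single str_replace() call with parallel find/replace arrays.
function replace_single(string $url): string
{
    $find    = array('č', 'Č', 'š', 'Š', 'ž', 'Ž');
    $replace = array('c', 'c', 's', 's', 'z', 'z');
    return str_replace($find, $replace, $url);
}

echo replace_single('Žalias šuo'), "\n"; // prints "zalias suo"
```

The single-call form also keeps every mapping in one place, which is the shape the url_find / url_replace arrays take later in this thread.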

++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
____________________

Liens Utiles :
Forum référencement || Annuaire référencement || Référencement phpBB || Recherche
dcz
Admin
Admin
 
Posts: 21325
Joined: Fri Apr 28, 2006 9:03 pm

Postby Bembis » Sun May 25, 2008 2:23 pm

Code: Select all
   function format_url( $url, $type = 'topic' ) {
      $url = preg_replace('`\[.*\]`U','',$url);
      $url = htmlentities($url, ENT_COMPAT, $this->encoding);
      $url = str_replace( array('č', 'Č'),'c', $url );
      $url = str_replace( array('ą', 'Ą'),'a', $url );
      $url = str_replace( array('ė', 'ė'),'e', $url );
      $url = str_replace( array('ę', 'Ę'),'e', $url );
      $url = str_replace( array('į', 'Į'),'i', $url );
      $url = str_replace( array('š', 'Š'),'s', $url );
      $url = str_replace( array('ų', 'Ų'),'u', $url );
      $url = str_replace( array('ū', 'Ū'),'u', $url );
      $url = str_replace( array('ž', 'Ž'),'z', $url );
      $url = preg_replace( '`&([a-z]+)(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig);`i', "\\1", $url );
      $url = preg_replace( $this->seo_opt['url_pattern'] , '-', $url);
      $url = strtolower(trim($url, '-'));
      return empty($url) ? $type : $url;
   }
It doesn't work :|
Bembis
 
Posts: 13
Joined: Sun May 25, 2008 9:37 am

Postby SeO » Sun May 25, 2008 2:35 pm

You need to apply the method described in this post: http://www.phpbb-seo.com/boards/advance ... .html#9596

The thread isn't that long; you'll find code for more chars in the following posts.
SeO
Admin
Admin
 
Posts: 6334
Joined: Wed Mar 15, 2006 9:41 pm

Postby Bembis » Mon May 26, 2008 8:37 am

SeO wrote:http://phpbb3.phpbb-seo.net/another-test-forum-f4/cc-tt-dd-ll-nn-zz-ss-t42.html

žŽ needed an extra replace, šŠ is handled by htmlentities().

So just add :

Code: Select all
      // --> Custom str_Replace arrays, to handle special cases properly
      $this->seo_opt['url_find'] = array(utf8_chr(268),utf8_chr(269),  // c
         utf8_chr(356),utf8_chr(357), // t
         utf8_chr(270),utf8_chr(271), // d
         utf8_chr(317),utf8_chr(318), // l
         utf8_chr(327),utf8_chr(328), // n
         utf8_chr(381),utf8_chr(382), // z
      );
      $this->seo_opt['url_replace'] = array('c', 'c', 't', 't', 'd', 'd', 'l', 'l','n', 'n', 'z', 'z');


Instead of the code you mentioned.

;)

I need ąĄ to a, ęĘ to e, ėĖ to e, įĮ to i, šŠ to s, ųŲ to u and ūŪ to u :roll: :roll:
Bembis
 
Posts: 13
Joined: Sun May 25, 2008 9:37 am

Postby Bembis » Tue May 27, 2008 12:05 pm

Can somebody help me? :roll:
Bembis
 
Posts: 13
Joined: Sun May 25, 2008 9:37 am

Postby SeO » Tue May 27, 2008 12:21 pm

Bembis wrote: I need ąĄ to a, ęĘ to e, ėĖ to e, įĮ to i, šŠ to s, ųŲ to u and ūŪ to u :roll: :roll:


Are these the only replacements you need?

I ask because I know little about these chars and the language they are used in.

From there, I could give you a hand with the coding ;)
SeO
Admin
Admin
 
Posts: 6334
Joined: Wed Mar 15, 2006 9:41 pm

Postby Bembis » Tue May 27, 2008 1:10 pm

Yes, only these: Ąą-a, čČ-c, ęĘ-e, ėĖ-e, Įį-i, Šš-s, Ųų-u, Ūū-u, Žž-z. These are Lithuanian characters :)
Bembis
 
Posts: 13
Joined: Sun May 25, 2008 9:37 am

Postby Bembis » Wed May 28, 2008 3:58 pm

so ?
Bembis
 
Posts: 13
Joined: Sun May 25, 2008 9:37 am

Postby SeO » Thu May 29, 2008 9:09 pm

Bembis wrote:so ?


Well, this is not the message that gave me the most urge to answer ever, but oh well :roll:

Using http://www.tony-franks.co.uk/UTF-8.htm, I'd get:

Code: Select all
      // --> Custom str_Replace arrays, to handle special cases properly
      $this->seo_opt['url_find'] = array(utf8_chr(260),utf8_chr(261),  // Ąą-a
         utf8_chr(268),utf8_chr(269), // Čč-c
         utf8_chr(280),utf8_chr(281), // Ęę-e
         utf8_chr(278),utf8_chr(279), // Ėė-e
         utf8_chr(302),utf8_chr(303), // Įį-i
         utf8_chr(138),utf8_chr(154), utf8_chr(352),utf8_chr(353), // Šš-s (both sets from the table)
         utf8_chr(370),utf8_chr(371), // Ųų-u
         utf8_chr(362),utf8_chr(363), // Ūū-u
         utf8_chr(142),utf8_chr(158), utf8_chr(381),utf8_chr(382), // Žž-z (both sets from the table)
      );
      $this->seo_opt['url_replace'] = array('a', 'a', 'c', 'c', 'e', 'e', 'e', 'e','i', 'i', 's', 's', 's', 's', 'u', 'u', 'u', 'u', 'z', 'z', 'z', 'z');


Note that for š and ž, both upper and lower case, there was more than one set of entries in the table. I implemented them all, since I'm not really sure what would make someone use one set or another; this way, it should work in all cases.
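
As a side note on the two sets: 138, 154, 142 and 158 are the positions of Š, š, Ž and ž in the Windows-1252 code page, whereas 352, 353, 381 and 382 are their Unicode code points, so for UTF-8 titles the second set is most likely the one doing the work. The sketch below shows one way to keep each code point next to its replacement so the two arrays cannot drift out of step. It is an illustration under assumptions: cp_to_utf8() is a hypothetical stand-in for phpBB's utf8_chr() (limited to code points up to U+FFFF), and $map is not a structure the mod itself uses.

```php
<?php
// Hypothetical stand-in for phpBB's utf8_chr(): encode a Unicode code point
// (up to U+FFFF, which covers every character in this thread) as UTF-8.
function cp_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Code point => ASCII replacement, kept side by side.
$map = array(
    260 => 'a', 261 => 'a', // Ą ą
    268 => 'c', 269 => 'c', // Č č
    280 => 'e', 281 => 'e', // Ę ę
    278 => 'e', 279 => 'e', // Ė ė
    302 => 'i', 303 => 'i', // Į į
    352 => 's', 353 => 's', // Š š
    370 => 'u', 371 => 'u', // Ų ų
    362 => 'u', 363 => 'u', // Ū ū
    381 => 'z', 382 => 'z', // Ž ž
);

// Derive the two parallel arrays that str_replace() expects.
$find    = array_map('cp_to_utf8', array_keys($map));
$replace = array_values($map);

echo str_replace($find, $replace, 'Žąsų šūkis'), "\n"; // prints "zasu sukis"
```

In the mod itself, the two derived arrays would presumably be assigned to $this->seo_opt['url_find'] and $this->seo_opt['url_replace'] as in the code above.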
SeO
Admin
Admin
 
Posts: 6334
Joined: Wed Mar 15, 2006 9:41 pm

Postby IPB_Refugee » Sat Jul 26, 2008 5:42 pm

Hello SeO & dcz,

I want to thank you very much for your wonderful work. I just installed your Advanced SEO URL MOD and so far (I haven't tested it much yet) it works very well. :)

Here is my version, chiefly to handle German umlauts:

Code: Select all
      // --> Custom str_Replace arrays, to handle special cases properly
      $this->seo_opt['url_find'] = array(
         utf8_chr(196),utf8_chr(228), // ä
         utf8_chr(214),utf8_chr(246), // ö
         utf8_chr(220),utf8_chr(252), // ü
         utf8_chr(223), // ß
         utf8_chr(39), utf8_chr(180), // '´
      );
      $this->seo_opt['url_replace'] = array('ae', 'ae', 'oe', 'oe', 'ue', 'ue', 'ss', '', '');


I think you really should mention this thread in your install file or, maybe even better, provide a guide on how to handle special chars directly in the installation file.

Thanks again & greetings from Austria!
Wolfgang
IPB_Refugee
PR0
PR0
 
Posts: 82
Joined: Thu Jul 24, 2008 2:18 pm

Postby dcz » Sun Jul 27, 2008 7:10 am

Great :D

Providing good documentation about mod_rewrite internationalization is close to the top of our to-do list, but we need to convert phpBB SEO to phpBB3 first, which should happen this summer ;)

++
dcz
Admin
Admin
 
Posts: 21325
Joined: Fri Apr 28, 2006 9:03 pm

Postby MGLorencin » Tue Sep 23, 2008 1:57 pm

I also had to apply some changes to suit my needs. The following code works for these characters:

šŠđĐžŽčČćĆ

Code: Select all
      // --> Custom str_Replace arrays, to handle special cases properly
      $this->seo_opt['url_find'] = array(
         utf8_chr(353),utf8_chr(352), // š
         utf8_chr(273),utf8_chr(272), // đ
         utf8_chr(382),utf8_chr(381), // ž
         utf8_chr(269),utf8_chr(268), // č
         utf8_chr(263),utf8_chr(262), // ć
      );
      $this->seo_opt['url_replace'] = array('s', 's', 'd', 'd', 'z', 'z', 'c', 'c','c', 'c');
      // Array of the filenames that may require the use of a base href tag.


I hope it will help someone in need. It's tested and working. I probably didn't need a replacement for the character š, but just in case, I did it for all the characters I needed.
MGLorencin
 
Posts: 17
Joined: Sat Sep 20, 2008 11:37 am

[help] Vietnamese characters

Postby NguyTieuNhan » Mon Nov 24, 2008 10:16 am

How can I replace letters like these:

ă,ắ,ằ,ẳ,ặ,ã,á, à, ạ, ả, â, ấ,ầ, ẩ, ậ, ẫ => a
ú, ù, ủ, ũ, ụ => u
ì, ị, í, ỉ, ĩ => i
ế, ề, ể, ễ, ệ => e
ò, ó, ỏ, õ, ọ, ô, ồ, ố, ỗ, ộ, ổ => o


Please help me, I cannot get them converted.

Thank you for your attention!
NguyTieuNhan
 
Posts: 13
Joined: Mon Nov 24, 2008 2:42 am

Re: [help] Vietnamese characters

Postby SeO » Thu Nov 27, 2008 7:30 pm

If you preview your char list, you'll see this (without the spaces between the &# and the digits):

Code: Select all
&# 259;,&# 7855;,&# 7857;,&# 7859;,&# 7863;,ã,á, à, &# 7841;, &# 7843;, â, &# 7845;,&# 7847;, &# 7849;, &# 7853;, &# 7851; => a
ú, ù, &# 7911;, &# 361;, &# 7909; => u
ì, &# 7883;, í, &# 7881;, &# 297; => i
&# 7871;, &# 7873;, &# 7875;, &# 7877;, &# 7879; => e
ò, ó, &# 7887;, õ, &# 7885;, ô, &# 7891;, &# 7889;, &# 7895;, &# 7897;, &# 7893; => o


This is handy for finding the numerical value (the Unicode code point) of multi-byte characters.
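
If previewing a post is not convenient, the same numbers can be computed in a few lines of PHP. This is only a sketch: char_to_code_point() is a hypothetical helper, not part of phpBB or the mod, and it assumes the input is valid, non-empty UTF-8.

```php
<?php
// Hypothetical helper: return the Unicode code point of the first character
// of a UTF-8 string. Assumes valid, non-empty UTF-8; no error handling.
function char_to_code_point(string $char): int
{
    $bytes = array_values(unpack('C*', $char)); // re-index from 0
    $lead  = $bytes[0];

    if ($lead < 0x80) {                  // 1-byte sequence (ASCII)
        return $lead;
    }
    if (($lead & 0xE0) === 0xC0) {       // 2-byte sequence
        $cp = $lead & 0x1F;
        $extra = 1;
    } elseif (($lead & 0xF0) === 0xE0) { // 3-byte sequence
        $cp = $lead & 0x0F;
        $extra = 2;
    } else {                             // 4-byte sequence
        $cp = $lead & 0x07;
        $extra = 3;
    }

    // Each continuation byte contributes its low 6 bits.
    for ($i = 1; $i <= $extra; $i++) {
        $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
    }
    return $cp;
}

// Print ready-to-paste utf8_chr() calls for a few characters.
foreach (array('ă', 'ắ', 'ạ', 'ủ', 'ị') as $char) {
    echo $char, ' => utf8_chr(', char_to_code_point($char), ")\n";
}
```

The file has to be saved as UTF-8 for the literal characters to be read correctly.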

So for the first three letters (a, u and i), you'd need:
Code: Select all
      // --> Custom str_Replace arrays, to handle special cases properly
      $this->seo_opt['url_find'] = array(
         utf8_chr(259), utf8_chr(7855), utf8_chr(7857), utf8_chr(7859), utf8_chr(7863), utf8_chr(7841), // ă, ắ, ằ, ẳ, ặ, ạ
         utf8_chr(7843), utf8_chr(7845), utf8_chr(7847), utf8_chr(7849), utf8_chr(7853), utf8_chr(7851), // ả, ấ, ầ, ẩ, ậ, ẫ
         utf8_chr(7911), utf8_chr(361), utf8_chr(7909), // ủ, ũ, ụ
         utf8_chr(236), utf8_chr(7883), utf8_chr(237), utf8_chr(7881), utf8_chr(297), // ì, ị, í, ỉ, ĩ
         // ++ the e and o case
      );
      $this->seo_opt['url_replace'] = array('a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'u', 'u', 'u', 'i', 'i', 'i', 'i', 'i');
      // Array of the filenames that may require the use of a base href tag.


The principle is simple: one utf8_chr() call per character that needs it, and then as many replacement letters in the $this->seo_opt['url_replace'] array.
For letters that are not shown as numeric codes, i.e. those that can be displayed in latin1 but are still not treated the way you wish, you can use this table, http://www.tony-franks.co.uk/UTF-8.htm, to find the UTF-8 numerical code to use.
Now, usually, you'd have to add upper cases as well, but since you're replacing many chars, a call to utf8_strtolower() would be easier, so just replace :
Code: Select all
      $url = preg_replace('`\[.*\]`U','',$url);

with :
Code: Select all
      $url = preg_replace('`\[.*\]`U','',utf8_strtolower($url));

and :
Code: Select all
      $url = strtolower(trim($url, '-'));

with :
Code: Select all
      $url = trim($url, '-');


in phpbb_seo/phpbb_seo_class.php, and only consider lower-case characters in the replacement code.
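
Outside phpBB, the effect of lowercasing first can be sketched as below. Everything here is an assumption made for illustration: mb_strtolower() (from the mbstring extension) stands in for phpBB's utf8_strtolower(), simple_format_url() is a hypothetical, cut-down version of format_url() that skips the htmlentities() and url_pattern steps, and the replacement table is deliberately incomplete.

```php
<?php
// Sketch only; requires the mbstring extension.
// simple_format_url() is a hypothetical, simplified format_url().
function simple_format_url(string $title): string
{
    // Lowercase first, so the table below needs lower-case entries only.
    $title = mb_strtolower($title, 'UTF-8');

    // Lower-case characters only; upper-case forms were folded above.
    $find    = array('ă', 'ạ', 'ả', 'ủ', 'ụ', 'ị', 'ỉ');
    $replace = array('a', 'a', 'a', 'u', 'u', 'i', 'i');
    $title   = str_replace($find, $replace, $title);

    // Collapse every run of other characters into a single hyphen.
    $title = preg_replace('`[^a-z0-9]+`', '-', $title);

    // No strtolower() needed here any more, only the trim.
    return trim($title, '-');
}
```

With this sketch, simple_format_url('Ạn Ủ') should return an-u even though the table contains no upper-case entries.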

Tell me if you need more help; I was too lazy to search for all the numerical values tonight ;)
SeO
Admin
Admin
 
Posts: 6334
Joined: Wed Mar 15, 2006 9:41 pm
