windows-1250 and advanced mod rewrite

Discussions and support about the different URL Rewriting techniques for phpBB2.

Postby kubikjuice » Wed Nov 08, 2006 5:35 pm

Hi,
I'm using your mixed mod rewrite and everything is working fine except for accented characters. I have a forum in Slovak and I get URLs like this:
Code:
http://www.eminem.sk/slovensk-a-esk-rapperi-vf86.html

This doesn't make much sense to me, because "slovensk-" doesn't mean anything. It should be "slovenski": the accent can go, but the letter itself should still be there...
Thanks if someone can help me :D
kubikjuice
Posts: 20
Joined: Wed Nov 08, 2006 5:13 pm
Location: Bratislava, Slovakia

Postby dcz » Wed Nov 08, 2006 9:46 pm

And welcome :D

This is because of the zillion char-sets out there.

You seem to be referring to accented characters used in topic titles.

Usually those special characters get converted to HTML entities by the phpBB posting process, so we need to implement an extra filter for them in format_url.

Could you copy/paste this character here (lowercase and capital) and describe it a bit? It should be output fine in the post.
Tell me as well whether it would be correct to just change it to a regular "i" in the URL, and we'll fix this soon ;)

As for possible duplicates with the previously mishandled URLs, I'll PM you the zero duplicate :D

++
dcz
Admin
Posts: 21376
Joined: Fri Apr 28, 2006 9:03 pm

Postby kubikjuice » Thu Nov 09, 2006 11:52 am

Thanks for trying to help me. For example, www.pcforum.sk gets the accents right (not a "-" in place of an accented letter, but e.g. an "a" instead of "á"). There are more accented letters; here's the list:

č (I'd like a "c" instead)
ť (t)
á (a)
é (e)
í (i)
ó (o)
ú (u)
ý (y)
ô (o)
ž (z)
ň (n)
ľ (l)

There are some more, but I don't remember them all.
Again, thanks for all the help :D

Postby dcz » Thu Nov 09, 2006 10:01 pm

For pcforum.sk, the difference comes from the char-set:

Code:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />


while yours is:
Code:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" />


So I guess that's why.

Anyway, we'll have to run a few tests to find out how to deal with this particular setup.

The thing is, most of the characters you're listing here should be filtered by these lines of code in includes/functions.php (format_url()):
Code:
   $find = "ÀÁÂÃÅàáâãåÒÓÔÕØòóôõøÈÉÊËèéêëÇçÌÍÎÏìíîïÙÚÛùúûÿÑñ";
   $replace = "AAAAAaaaaaOOOOOoooooEEEEeeeeCcIIIIiiiiUUUuuuyNn";


The logic is simple: each letter in the $find line is replaced by the one at the same position in the $replace line.
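
A minimal sketch of that positional mapping, assuming it is done with PHP's strtr() (the actual call inside format_url() isn't shown in this thread). The map_chars() name is hypothetical, and the accented letters are written as byte escapes so the example doesn't depend on how the file is saved: in windows-1250, \xE1 is á, \xE9 is é and \xED is í.

```php
<?php
// Hypothetical helper: map each byte in $find to the byte at the same
// position in $replace. strtr() works byte by byte, so this only behaves
// as intended when every character is a single byte (e.g. windows-1250).
function map_chars($text, $find, $replace)
{
    return strtr($text, $find, $replace);
}

// windows-1250 bytes: \xE1 = á, \xE9 = é, \xED = í
$find    = "\xE1\xE9\xED";
$replace = "aei";

// "americkí" as windows-1250 bytes
echo map_chars("americk\xED", $find, $replace), "\n"; // prints "americki"
```

Bytes that don't appear in $find pass through unchanged, so any accented letter missing from the table reaches the later URL clean-up as-is.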

As you can see, "à" should already be turned into "a". So there are several things we can try to fix this.

First, make sure your functions.php file is itself saved as windows-1250; your favorite text editor should be able to tell you this and convert the file if needed.
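
To see why the file's own encoding matters, here is a small sketch using raw bytes. It assumes the filter compares bytes one at a time, as PHP's strtr() does with string arguments; whether format_url() works that way isn't confirmed in this thread. The variable names are made up for the example.

```php
<?php
// The same letter "č" as raw bytes in two encodings.
$c_cp1250 = "\xE8";     // windows-1250: one byte
$c_utf8   = "\xC4\x8D"; // UTF-8: two bytes

echo strlen($c_cp1250), "\n"; // prints 1
echo strlen($c_utf8), "\n";   // prints 2

// Page data in windows-1250, but a filter typed into a UTF-8 file:
$title = "\xE8eski";                   // title starting with č (windows-1250)
$out   = strtr($title, $c_utf8, "cc"); // looks for bytes C4 and 8D
var_dump($out === $title);             // bool(true): nothing was replaced

// With the filter stored in the page's own encoding, the byte matches:
echo strtr($title, $c_cp1250, "c"), "\n"; // prints "ceski"
```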

Then, if that's not enough, we would need to check whether some HTML entities are involved (it does not seem like it) and filter them properly.
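
If entities did turn out to be involved, one option would be a small lookup pass before the accent filter. This is only a sketch, assuming the entities arrive as decimal numeric references such as &#269;. The strip_accent_entities() function and its four-entry table are made up for illustration and are not part of the mod.

```php
<?php
// Hypothetical lookup: numeric HTML entities mapped to plain ASCII letters.
// Only four code points are listed; a real table would need more.
function strip_accent_entities($text)
{
    $map = array(
        '&#268;' => 'C', // Č (U+010C)
        '&#269;' => 'c', // č (U+010D)
        '&#317;' => 'L', // Ľ (U+013D)
        '&#318;' => 'l', // ľ (U+013E)
    );
    // With an array argument, strtr() replaces whole keys, not single bytes.
    return strtr($text, $map);
}

echo strip_accent_entities('&#269;eski rapperi'), "\n"; // prints "ceski rapperi"
```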

Ultimately, if that's still not enough, this would require using mb_internal_encoding() and trying to set the char-set at the server level too, as that could also be the cause.

Because for sure, this is all about char-set.

++

Postby kubikjuice » Fri Nov 10, 2006 2:27 pm

I edited the functions file a little bit, but there are still problems...
This is OK:

Code:
http://www.eminem.sk/americki-rapperi-vf85.html

It was this before:

Code:
http://www.eminem.sk/americk-rapperi-vf85.html


But this is wrong:

Code:
http://www.eminem.sk/slovenski-a-eeski-rapperi-vf86.html

It was this before (also wrong):

Code:
http://www.eminem.sk/slovensk-a-eski-rapperi-vf86.html


There should be a "c" (č) instead of "e" in "eeski".
Also, "ľ" isn't working; it shows just "-".

I added this code before <?php:
Code:
header("Content-type: text/html; charset=windows-1250");


and everything was the same...

Lines with the $find and $replace variables:
Code:
$find = "ÀÁÂÃÅàáâãåÒÓÔÕØòóôõøÈÉÊËèéêëÇçÌÍÎÏìíîïÙÚÛùúûÿÑñČčĽľ";
$replace = "AAAAAaaaaaOOOOOoooooEEEEeeeeCcIIIIiiiiUUUuuuyNnCcLl";


As you can see, I added "Čč" and "Ľľ", but nothing happened.

Thanks for the help :D

Postby dcz » Sat Nov 11, 2006 6:19 pm

All right, this is a bit weird, because these two are part of the windows-1250 char-set table, so they should be handled properly, like the other ones.

Maybe you should try using only lowercase letters in the $replace line.
I also noticed something close to a space at the end of the $find line. You can press backspace once before the last " without deleting the special character ľ; this could also be the cause of our troubles here.

Something like:
Code:
$find = "ÀÁÂÃÅàáâãåÒÓÔÕØòóôõøÈÉÊËèéêëČčÇçÌÍÎÏìíîïÙÚÛùúûÿÑñĽľ";
$replace = "aaaaaaaaaaooooooooooeeeeeeeecccciiiiiiiiuuuuuuynnll";


If that's not enough, we'll try to filter those last two differently ;)
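
A quick way to rule out a stray byte is to compare the lengths of the two strings. The check_pair() helper below is hypothetical. It relies on strlen() counting bytes, so it is only meaningful when the file is saved in a single-byte encoding such as windows-1250, and it assumes the pair is later passed to strtr(), which this thread doesn't confirm.

```php
<?php
// Hypothetical sanity check for a positional find/replace pair.
// strtr() ignores the extra bytes of the longer string, so a stray
// byte (for example an invisible space) can go unnoticed.
function check_pair($find, $replace)
{
    return strlen($find) === strlen($replace);
}

// windows-1250 bytes: \xC8 = Č, \xE8 = č, \xBC = Ľ, \xBE = ľ
var_dump(check_pair("\xC8\xE8\xBC\xBE", "ccll"));  // bool(true)
var_dump(check_pair("\xC8\xE8\xBC\xBE ", "ccll")); // bool(false): trailing space
```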

++
Last edited by dcz on Sun Nov 12, 2006 11:29 am, edited 1 time in total.

Postby kubikjuice » Sun Nov 12, 2006 11:15 am

Doesn't work :(

Postby dcz » Sun Nov 12, 2006 11:27 am

So it's only Čč and Ľľ that cause problems?

We can try adding these two lines of code before the $find and $replace lines:

Code:
   $url = str_replace(array('č','Č'), 'c', $url);
   $url = str_replace(array('ľ','Ľ'), 'l', $url);


This means we do not need to repeat those in the $find and $replace.
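
The order of operations can be sketched like this. The filter_title() wrapper and the shortened three-letter table are stand-ins for illustration, not the mod's real code, and the bytes are written as windows-1250 escapes so the example doesn't depend on how the file is saved.

```php
<?php
// Hypothetical wrapper showing the order of operations: the two special
// cases run first, then the positional table handles the rest.
// Bytes are windows-1250: \xE8 = č, \xC8 = Č, \xBE = ľ, \xBC = Ľ, \xED = í.
function filter_title($url)
{
    $url = str_replace(array("\xE8", "\xC8"), 'c', $url);
    $url = str_replace(array("\xBE", "\xBC"), 'l', $url);

    // Shortened stand-in for the $find / $replace lines.
    return strtr($url, "\xE1\xE9\xED", "aei");
}

echo filter_title("slovensk\xED a \xE8esk\xED"), "\n"; // prints "slovenski a ceski"
```

Because str_replace() runs before the table lookup, the four letters are already plain ASCII by the time the $find and $replace lines see them.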

Don't worry we'll make it ;)

++

Postby kubikjuice » Sun Nov 12, 2006 12:18 pm

Works now, thanks for all the help, dcz :D

Postby dcz » Sun Nov 12, 2006 12:53 pm

:D You're welcome ;)

I'll PM you the zero duplicate right now.

++