Mixed rewrite mod and Persian UTF-8 encoding

Discussions and support about the different URL Rewriting techniques for phpBB2.

Postby AmirAbbas » Thu Jul 13, 2006 9:58 am

Hello,

I have decided to install the mixed rewrite mod on my forum.
The Persian language is not supported by the mixed rewrite mod,
but I can use English titles for my forums.

I want to know whether it can be effective for my language.
My forum is about web design, so I will use many English words
in my posts (at least in the web design forums).
But in other parts of the forum, like the free discussion forum, there aren't any English words.
And another question:

The mixed rewrite mod only affects forum URLs.
Topic URLs are the same as with the simple rewrite mod,
so with this mod only a few URLs will change (I mean only the forum URLs differ from the simple rewrite mod).
But you said the effect of this mod on indexing is very good :roll:
AmirAbbas
phpBB SEO Team
 
Posts: 534
Joined: Thu May 11, 2006 3:30 pm
Location: IRAN

Postby dcz » Thu Jul 13, 2006 2:36 pm

Well, of course it's a cool thing to use good keywords in URLs linked to related content.

There would be no real point in injecting English keywords in URLs if the site never uses them.

The mixed mod rewrite will also inject category titles in URLs, which is another opportunity to show keywords in URLs.

Then, there is something I must tell you before we proceed: I have never installed a phpBB forum running UTF-8, and I know little about char-sets, I must admit :roll:

But, if you can set up a local test server, it won't take long to see how we could tweak the format_url() function to deal with such settings, and whether it's possible to mix Persian and English words in forum titles.

I say this because it seems to me that PHP does not care whether text is UTF-8 or Latin for the basic Latin characters, since they are coded the same way as in ASCII in both, but this is not true for all characters.
So for English words it's no problem, I think, even if they are coded in UTF-8, but with Persian I don't know.

We need to test.

Install a local test board and try it as is, tell me what happens to titles, English, mixed English and Persian, and Persian only.

++
Useful links :
SEO Forum || SEO Directory || SEO phpBB || Search
____________________

Liens Utiles :
Forum référencement || Annuaire référencement || Référencement phpBB || Recherche
dcz
Admin
 
Posts: 21378
Joined: Fri Apr 28, 2006 9:03 pm

Postby AmirAbbas » Thu Jul 13, 2006 3:43 pm

I mean that I want to use only English words in titles.
Just English.


I tested a mix of Persian and English words:
Persian words are replaced with the character u.

For example, I used one Persian and one English word:

Code: Select all
طراحی web


It means "web design".

The URL of this forum has this form:

http://localhost/Persia/u-web-vf2.html

The Persian word is replaced by the character u.

Movable Type uses a method like this:
if the title has an English word, MT adds this word to the URL;
if there isn't any English word, MT adds post****.html to the address.

Postby dcz » Thu Jul 13, 2006 3:52 pm

Then it seems it's working right away, if you only use English words.

But I think we could then try to filter Persian words out completely, so that you can still use a good Persian keyword in the forum titles. It will be associated with the URL as part of the link text, which can be a good thing.


If you are interested, try these cases:

English-Persian and English-Persian-English, to see if the Persian words always get replaced by u, and try many cases, with more than one word etc., to see if any special case can be found.

Anyway, it's no big deal: even if not all cases can be handled when you mix Persian and English, we'll for sure be able to mix them nicely as long as we avoid the exceptions (if any).

Because it's only the admin who chooses the forum titles here :D

++

Postby AmirAbbas » Thu Jul 13, 2006 4:33 pm

I used combinations of Persian and English words.
See these titles and URLs.

The titles are shown in LTR direction. If you want to see a title the correct way, please
copy it and paste it into Windows Notepad, then right-click in Notepad and choose "Right to left Reading order".

As you can see, some Persian words are joined together and the hyphen between them is removed.


Code: Select all
سایت persia طراح وب

http://localhost/Persia/u-persia-u-vf1.html



Code: Select all
weblog سایت طراح وب برای ایرانیان

http://localhost/Persia/weblog-u-u-u-uuuu-vf3.html



Code: Select all
سایت طراحی

http://localhost/Persia/u-u-vf4.html




Code: Select all
سایت طراحی وب

http://localhost/Persia/u-u-u-vf5.html



Code: Select all
سایت طراحی web

http://localhost/Persia/u-u-web-vf6.html




Code: Select all
سایت web design

http://localhost/Persia/u-web-design-vf7.html




Code: Select all
سایت persia طراح

http://localhost/Persia/u-persia-vf8.html



Code: Select all
سایت persia طراح وب ایران

http://localhost/Persia/u-persia-u-uu-vf9.html



Code: Select all
سایت persia طراح وب برای ایرانیان در هر کجای ایران

http://localhost/Persia/u-persia-u-u-uuuu-u-uu-uu-vf10.html




Code: Select all
persian sites مجموعه بهترین ها persian hostings

http://localhost/Persia/persian-sites-uuuu-uuu-u-persian-hostings-vf11.html


Is there any way to filter Persian words out of URLs completely?
Movable Type uses a method like this.

For example, say I write an article about CSS that has other Persian words in the title,
something like this:


Code: Select all
آموزش CSS از استاد eric meyer


The URL for this page in MT looks like this:

www.example.com/CSS_eric_meyer.html

MT ignores the Persian words.

Now imagine that there aren't any English words in the title,
something like this:

Code: Select all
آموزش طراحی وب توسط امیر عباس


Now the URL for this page in MT looks like this:

www.example.com/post****.html (**** is the post number)

Can you make something like this for the phpBB SEO rewrite mods,
both the mixed and the advanced rewrite mod?




:wink:

Postby dcz » Thu Jul 13, 2006 4:51 pm

Movable Type is not GPL, so I cannot take inspiration from its code ;)

Then, I was thinking about two paths to go for.

The fast and easy one, where we just filter the first "u-" for example, or one or two more simple cases, allowing you to build simple mixed titles right now, or filter all of them if you prefer to add many Persian words to forum titles.

Then, as a more general method, it's true I could implement some code to check if the URL is empty and, if so, rewrite it statically. It's a good idea.

We just need to figure out how to deal with all cases while filtering titles. It could just be a

Code: Select all
utf8_decode($url);


just after :
Code: Select all
$amp = ($non_html_amp) ? '&' : '&amp;';


in sessions.php to do the trick. I cannot test, but if this outputs characters like ? etc. in the Latin char-set, then they'll be filtered and we'll only have to deal with the empty case to be ready for a UTF-8 advanced mod rewrite ;)

++

Postby AmirAbbas » Fri Jul 14, 2006 8:53 am

OK

I installed the advanced rewrite mod today.
My phpBB is installed in a folder named Persia.
The URL for this title
Code: Select all
weblog سایت طراح وب برای ایرانیان

is like this

Code: Select all
http://localhost/Persia/weblog-o-o-u-o-o-o-o-o-u-o-o-o-o-u-o-u-o-o-u-u-o-u-vf3.html


All Persian characters are replaced with the characters o and u.
Another title:

Code: Select all
برای تست کردن برنامه


and the URL

Code: Select all
http://localhost/Persia/o-o-o-u-o-o-o-u-o-o-u-o-o-u-o-u-u-vt11.html


After that I added

Code: Select all
utf8_decode($url);


after

Code: Select all
$amp = ($non_html_amp) ? '&' : '&amp;';


in the sessions.php file.

Nothing happened.

The URLs are the same as the ones I mentioned above.
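
A possible explanation for the unchanged URLs, offered as a guess rather than a diagnosis: utf8_decode() returns the converted string instead of modifying its argument, so a bare `utf8_decode($url);` leaves `$url` as it was. In addition, if the board stores the title as numeric entities (an assumption at this point, though the HTML source quoted later in the thread points that way), the string is plain ASCII and a charset conversion has nothing to convert. The helper name `has_non_ascii()` is made up for illustration.

```php
<?php
// Hypothetical helper: true if the string contains any byte outside 7-bit ASCII.
function has_non_ascii(string $s): bool
{
    return preg_match('/[^\x00-\x7F]/', $s) === 1;
}

// Assumed stored form of the title "طراحی web": numeric entities, pure ASCII.
$stored = '&#1591;&#1585;&#1575;&#1581;&#1740; web';
// The same title as raw UTF-8 text.
$raw = 'طراحی web';

var_dump(has_non_ascii($stored)); // bool(false): nothing for a charset conversion to change
var_dump(has_non_ascii($raw));    // bool(true)
```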

Now, could you filter Persian characters? For example, for this title

Code: Select all
آموزش برنامه نویسی php از سایت lynda


the URL should be something like this

Code: Select all
http://localhost/Persia/php-lynda.html


and if a title doesn't have any English word, the URL should be like this

Code: Select all
http://localhost/Persia/topic1256.html


Is it possible?
If you make an advanced rewrite mod like this,
all Persian, Arabic, Hebrew, Chinese, Japanese and ... forums can use this rewrite mod
:wink:
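
The behaviour requested here can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the phpBB SEO mod: the function name `ascii_slug_or_topic()` is hypothetical, the title is assumed to arrive as raw UTF-8 text, and the topic ID is assumed to be available to the caller.

```php
<?php
// Hypothetical helper, not part of the phpBB SEO mod.
// Assumes $title is raw UTF-8 text and $topic_id is the numeric topic ID.
function ascii_slug_or_topic(string $title, int $topic_id): string
{
    // Lowercase the ASCII letters.
    $slug = strtolower($title);
    // Replace every run of bytes other than a-z and 0-9 with a single hyphen;
    // this removes Persian (and any other non-ASCII) text along with spaces.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    $slug = trim($slug, '-');
    // Nothing left: fall back to a static name built from the topic ID.
    return ($slug === '') ? 'topic' . $topic_id : $slug;
}

echo ascii_slug_or_topic('آموزش برنامه نویسی php از سایت lynda', 1256), ".html\n"; // php-lynda.html
echo ascii_slug_or_topic('آموزش طراحی وب توسط امیر عباس', 1256), ".html\n";      // topic1256.html
```

Because the filter works on bytes and keeps only a-z and 0-9, it does not need to know which script the rest of the title is written in.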

Postby dcz » Fri Jul 14, 2006 9:19 am

AmirAbbas wrote: Now, could you filter Persian characters? For example, for this title


I am sure it's possible; I'll take a look at it.

++

Postby AmirAbbas » Fri Jul 14, 2006 9:21 am

thanks :D

Postby AmirAbbas » Sun Jul 16, 2006 6:24 am

Hello,

Excuse me, I want to start my forum on my new domain.
I would like to use the advanced rewrite mod, and I know that you don't have much time.

Should I wait, or can I install my forum with the simple rewrite mod and upgrade it to the advanced rewrite mod afterwards? :wink:

Postby dcz » Sun Jul 16, 2006 11:04 am

Well, it's not that easy to go from static to dynamic mod rewrite without harm, because we don't know the topic title in the .htaccess.

I'd say, wait a few more days while I search for a fix ;)
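
To illustrate why the title matters: an old static URL carries only the topic ID, so the keyword URL can only be rebuilt by looking the title up in PHP. The sketch below is hypothetical throughout. The function name, the `$titles` array (standing in for a database lookup) and the `topicNN.html` and `-vtNN.html` patterns (taken from the example URLs in this thread) are assumptions, not the mod's actual migration code.

```php
<?php
// Hypothetical sketch: map an old static URL such as "topic11.html" to a
// keyword URL. $titles stands in for a database lookup by topic ID.
function keyword_url_for(string $old_url, array $titles): ?string
{
    if (!preg_match('/^topic([0-9]+)\.html$/', $old_url, $m)) {
        return null; // not a static topic URL
    }
    $id = (int) $m[1];
    if (!isset($titles[$id])) {
        return null; // unknown topic
    }
    // Simplified keyword filter: keep a-z and 0-9, hyphenate the rest.
    $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($titles[$id])), '-');
    $slug = ($slug === '') ? 'topic' : $slug;
    return $slug . '-vt' . $id . '.html';
}

$titles = [11 => 'Web design basics'];
echo keyword_url_for('topic11.html', $titles), "\n"; // web-design-basics-vt11.html
```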

++

Postby AmirAbbas » Mon Jul 17, 2006 4:06 am

OK, thank you :wink:

Postby dcz » Mon Jul 17, 2006 10:02 pm

All right, I have found what could be a solution, but there is a limitation.

With this, all numbers will be cut off the URL, which is not that bad, but it's not perfect.

The problem is that طراحی web is in fact coded like this (look at the HTML source code): 1591-1585-1575-1581-1740-web

See the point?

And char-set conversions are tricky here. I am still searching, but it's not that simple and I have little knowledge of char-sets.
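
A small sketch of the limitation described above. Both functions are naive stand-ins written for illustration, not the mod's real format_url(), and the stored form of the title (numeric entities) is assumed from the numbers quoted in this post.

```php
<?php
// Naive stand-in for a URL filter (NOT the mod's real format_url()):
// every run of characters other than a-z and 0-9 becomes one hyphen.
function naive_slug(string $title): string
{
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($title)), '-');
}

// Same filter, but digits are dropped as well.
function digitless_slug(string $title): string
{
    return trim(preg_replace('/[^a-z]+/', '-', strtolower($title)), '-');
}

$stored = '&#1591;&#1585;&#1575;&#1581;&#1740; web';
echo naive_slug($stored), "\n";          // 1591-1585-1575-1581-1740-web
echo digitless_slug($stored), "\n";      // web
echo digitless_slug('php 5 tips'), "\n"; // php-tips
```

Dropping digits cleans up the entity codes, but it also removes numbers that were a legitimate part of the title, as the last line shows.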


So here are the required code changes to do so:

Open :

[EDIT] Was not quite it, it seems :D
See below.

++
Last edited by dcz on Wed Jul 19, 2006 8:23 am, edited 1 time in total.

Postby dcz » Tue Jul 18, 2006 10:27 am

All right I think I have got it ;)

So forget about the suggested changes and do this instead, still in the functions.php code :

Find :

Code: Select all
function format_url($url)
{


After add :
Code: Select all
$url = preg_replace('/&#[0-9]{4};/i', "", $url);


And then, to take care of the empty case, when the whole title is posted in Persian, find :

Code: Select all
   return $url;


Before add :

Code: Select all
   $url = ($url == '') ? 'topic' : $url;


and, as before, add the topic RewriteRule for the empty title case (all Persian).

Actually, this makes the mod very powerful with UTF-8: general titles are most likely not made of English words and will thus be rewritten statically, whereas mixed titles will see their keywords injected in URLs.

Tell me what you think, I think it's final now :D
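
For anyone who wants to try the two additions in isolation, here is a self-contained sketch. Only the two marked lines come from this post; the filter in the middle is a simplified assumption standing in for the real format_url() body, and the second parameter is a hypothetical addition that makes it easy to compare patterns.

```php
<?php
// Stand-in for format_url(): only the two "added" lines come from the post;
// the filter in the middle is a simplified assumption, not the mod's real code.
function format_url_sketch(string $url, string $entity_pattern = '/&#[0-9]{4};/i'): string
{
    // Added line 1: drop numeric entities before any other filtering.
    $url = preg_replace($entity_pattern, '', $url);

    // Assumed filter: lowercase, then hyphenate everything that is not a-z or 0-9.
    $url = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($url)), '-');

    // Added line 2: fall back to a static name when nothing is left.
    $url = ($url == '') ? 'topic' : $url;
    return $url;
}

echo format_url_sketch('&#1591;&#1585;&#1575;&#1581;&#1740; web'), "\n"; // web
echo format_url_sketch('&#1587;&#1604;&#1575;&#1605;'), "\n";            // topic
// A five-digit entity (&#20013; is the CJK character U+4E2D) slips past {4}...
echo format_url_sketch('&#20013; forum'), "\n";                           // 20013-forum
// ...while a hypothetical variant matching any number of digits removes it.
echo format_url_sketch('&#20013; forum', '/&#[0-9]+;/'), "\n";            // forum
```

The `{4}` quantifier matches the four-digit entities used for Persian. Boards in scripts whose code points need five digits would probably want the `+` variant, which is an untested suggestion.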

++

[EDIT] Yes it is functions.php lol :D
Last edited by dcz on Wed Jul 19, 2006 8:22 am, edited 1 time in total.

Postby AmirAbbas » Wed Jul 19, 2006 7:30 am

ok

I searched in sessions.php. I couldn't find that part of the code in sessions.php;
I think it is in functions.php.

I added those two lines of code to functions.php.
Nothing happened again. I still have all those u and o characters in my URLs :roll:

Another thing: Persian characters are not only numbers.

For example, for this sentence


Code: Select all
سلام دوست من


HTML source is

Code: Select all
&#1587;&#1604;&#1575;&#1605; &#1583;&#1608;&#1587;&#1578; &#1605;&#1606;


First an &, after that a #, then a number, and finally a ;
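
The shape described here ("&", "#", a number, ";") is a decimal numeric character reference. As a quick check, PHP's core html_entity_decode() can turn such a string back into readable text. The variable names are illustrative, and the entity string is built from the Unicode code points of the sentence above.

```php
<?php
// Decode decimal numeric entities back to UTF-8 text to check what a stored
// title represents. html_entity_decode() is a core PHP function.
$stored = '&#1587;&#1604;&#1575;&#1605; &#1583;&#1608;&#1587;&#1578; &#1605;&#1606;';
$title  = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

echo $title, "\n"; // سلام دوست من

// Each entity has the shape: "&", "#", a decimal code point, ";".
preg_match_all('/&#([0-9]+);/', $stored, $m);
echo implode(' ', $m[1]), "\n"; // 1587 1604 1575 1605 1583 1608 1587 1578 1605 1606
```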