Re: Arabization, techniques and problems



maysara a wrote:
Salam all,

I'm trying to start an open source community at my
university, and I need to introduce students to open
source and to contributing to existing software,
documentation, and other projects.


You might want to take a look at kdiff3 (http://kdiff3.sourceforge.net/). Late last fall I was looking for a diff tool that would work for Arabic, but I couldn't find one. Some tools can do the diff but can't display it properly. So I contacted the developer of kdiff3 and he agreed to add Arabic support, just because he found it an interesting challenge. He knows no Arabic. In return, I'm supposed to test it and also translate the docs to Arabic.


The current status is that he has implemented UTF-8 support and RTL display support, but not yet shaping support (contact him to find out for sure). He is looking for a chunk of existing code to use for shaping support.

Maybe your students could take responsibility for implementing the shaping logic. This is actually kind of an interesting project, because you have to figure out a good way to show differences involving diacritics. You also need to decide if you want to do anything Arabic-specific; for example, "ktb" and "kataba" are different orthographically, but equivalent semantically. Should a diff engine be aware of such considerations? I think it would be useful. Maybe there should be a switch that means "ignore diacritics and show only differences in huruuf (letters)". It's a design challenge as well as an implementation challenge.
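
To make the "ignore diacritics" switch concrete, here is a rough Python sketch of one way it could work. This is only an illustration, not how kdiff3 does or will do it; the function names and the ignore_diacritics flag are made up for the example. It assumes that "diacritics" means Unicode nonspacing marks (category Mn), which covers the usual harakat such as fatha and shadda.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), e.g. Arabic harakat.

    The text is deliberately NOT normalized first: decomposing would split
    precomposed letters such as alef with madda and then strip the madda.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def differ(a: str, b: str, ignore_diacritics: bool = False) -> bool:
    """Return True if the two strings differ under the chosen comparison.

    ignore_diacritics is a hypothetical switch for this sketch.
    """
    if ignore_diacritics:
        a, b = strip_diacritics(a), strip_diacritics(b)
    return a != b


# "ktb" written without diacritics: kaf, teh, beh
bare = "\u0643\u062A\u0628"
# "kataba": the same three letters, each followed by a fatha (U+064E)
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"

print(differ(bare, vocalized))                          # strict comparison
print(differ(bare, vocalized, ignore_diacritics=True))  # letters only
```

A real diff engine would still have to decide how to display a line where only the diacritics changed, but stripping before comparing is probably the simplest starting point.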

Since the developer knows no Arabic, feel free to have your students contact me if they want to bounce around ideas. They can write me in English or Arabic.

As for testing and translating, I'm afraid I've only done a little testing and no translating. If your students would like to help, they could create a suite of test cases. Every combination of joining/nonjoining characters with and without diacritics needs to be tested to make sure the shaping and display logic is correct.
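
As a starting point for such a suite, here is a small Python sketch that generates the combinations mechanically. It is only an illustration under some assumptions: I picked one dual-joining letter (beh) and one right-joining letter (dal) to stand in for their classes, and a single diacritic (fatha); a real suite would want more letters and marks. The names LETTERS, DIACRITICS and make_cases are invented for the example.

```python
from itertools import product

# Assumed representatives of each joining class (illustrative, not exhaustive).
LETTERS = {
    "dual_joining": "\u0628",   # beh: joins to the letters on both sides
    "right_joining": "\u062F",  # dal: joins only to the preceding letter
}

# Assumed diacritic choices: none, or a fatha after the letter.
DIACRITICS = {
    "none": "",
    "fatha": "\u064E",
}


def make_cases(length: int = 3):
    """Build every sequence of `length` letters, each with or without a mark.

    Returns a list of (label, text) pairs; the label records which letter
    class and diacritic were used at each position.
    """
    units = [
        (f"{lname}+{dname}", letter + mark)
        for (lname, letter), (dname, mark)
        in product(LETTERS.items(), DIACRITICS.items())
    ]
    cases = []
    for combo in product(units, repeat=length):
        label = " | ".join(name for name, _ in combo)
        text = "".join(chunk for _, chunk in combo)
        cases.append((label, text))
    return cases


if __name__ == "__main__":
    for label, text in make_cases(3):
        print(f"{label}\t{text}")
```

Each generated string could be put in one file of a pair, with a variant in the other, and the displayed diff checked by eye against what an Arabic reader would expect.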

Help with translating would also be appreciated; in fact, since I'm not a native speaker of Arabic, your students would undoubtedly do a better translation than I would.

Also consider that a diff tool is an essential part of a developer's toolbox, and very useful for ordinary writers.

Hope that helps,

gregg