To: General Arabization Discussion <general at arabeyes dot org>
Subject: Re: Arabization, techniques and problems
From: Gregg Reynolds <gar at arabink dot com>
Date: Sun, 10 Jul 2005 08:13:11 -0500
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
maysara a wrote:
> Salam all,
> I'm trying to start an open source community at my
> university, and I need to introduce students to open
> source and to contributing to existing software,
> documentation, and so on.
You might want to take a look at kdiff3
(http://kdiff3.sourceforge.net/). Late last fall I was looking for a
diff tool that would work for Arabic, but I couldn't find one. Some
tools can do the diff but can't display it properly. So I contacted the
developer of kdiff3 and he agreed to add Arabic support, just because he
found it an interesting challenge. He knows no Arabic. In return, I'm
supposed to test it and also translate the docs to Arabic.
As far as I know, he has implemented UTF-8 support and RTL display
support, but not yet shaping support (contact him to confirm the
current status). He is looking for a chunk of existing code to use for
shaping support.
Maybe your students could take responsibility for implementing shaping
logic. This is actually kind of an interesting project, because you
have to figure out a good way to show differences involving diacritics.
You also need to decide if you want to do anything Arabic-specific;
for example, "ktb" and "kataba" are different orthographically, but
equivalent semantically. Should a diff engine be aware of such
considerations? I think it would be useful. Maybe there should be a
switch that means "ignore diacritics and show only differences in
huruuf" (the letters themselves). It's a design challenge as well as an
implementation challenge.
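To make the "ignore diacritics" idea concrete, here is a rough sketch in Python. It is not kdiff3 code; the function names and the ignore_diacritics flag are made up for illustration, and I am assuming that "diacritics" means the harakat U+064B through U+0652 plus the superscript alef U+0670. Whether shadda belongs in that set is exactly the kind of design decision mentioned above.

```python
import difflib

# Assumption: the marks to ignore are fathatan..sukun (U+064B-U+0652)
# plus superscript alef (U+0670). Hamza and madda marks are left alone
# because they arguably change the harf itself.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text):
    """Return text with the harakat removed, leaving only the huruuf."""
    return "".join(ch for ch in text if ch not in HARAKAT)


def diff_lines(old, new, ignore_diacritics=False):
    """Unified diff of two lists of lines.

    With ignore_diacritics=True (a hypothetical switch), lines that
    differ only in harakat compare as equal and produce no output.
    """
    if ignore_diacritics:
        old = [strip_diacritics(line) for line in old]
        new = [strip_diacritics(line) for line in new]
    return list(difflib.unified_diff(old, new, lineterm=""))


if __name__ == "__main__":
    kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kaf, teh, beh, each with fatha
    ktb = "\u0643\u062A\u0628"                       # kaf, teh, beh, unvocalized
    print(diff_lines([kataba], [ktb]))                          # reports a change
    print(diff_lines([kataba], [ktb], ignore_diacritics=True))  # empty list
```

A real tool would probably still want to display the original vocalized text and only use the stripped form for comparison; this sketch compares and reports the stripped lines, which is the simplest thing that shows the idea.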
Since the developer knows no Arabic, feel free to have your students
contact me if they want to bounce around ideas. They can write me in
English or Arabic.
As for testing and translating, I'm afraid I've only done a little
testing and no translating. If your students would like to help, they
could create a suite of test cases. Every combination of
joining/nonjoining characters with and without diacritics needs to be
tested to make sure the shaping and display logic is correct.
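One possible way to start on such a test suite is to generate the strings mechanically. The sketch below is only an illustration: it picks one sample letter per Unicode joining class (beh as dual-joining, dal as right-joining, hamza as non-joining), uses fatha as the only diacritic, and the name make_cases is invented here. A full suite would cover many more letters and marks.

```python
import itertools

# Assumption: one representative letter per joining class is enough
# for a first pass.
SAMPLE_LETTERS = {
    "dual-joining": "\u0628",   # beh
    "right-joining": "\u062F",  # dal
    "non-joining": "\u0621",    # hamza
}
FATHA = "\u064E"


def make_cases(length=3):
    """Build every sequence of `length` sample letters, each letter
    appearing both with and without a fatha."""
    cases = []
    for letters in itertools.product(SAMPLE_LETTERS.values(), repeat=length):
        for marks in itertools.product((False, True), repeat=length):
            word = "".join(
                ch + (FATHA if marked else "")
                for ch, marked in zip(letters, marks)
            )
            cases.append(word)
    return cases


if __name__ == "__main__":
    cases = make_cases(3)
    print(len(cases), "test strings")
    # One string per line, ready to paste into two files and diff.
    for word in cases[:10]:
        print(word)
```

Each generated string would still need a human to check that it is shaped and displayed correctly; the script only guarantees that no combination in the sample gets skipped.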
Help with translating would also be appreciated; in fact, since I'm not
a native speaker of Arabic, your students would undoubtedly produce a
better translation than I would.
Also consider that a diff tool is an essential part of a developer's
toolbox, and very useful for ordinary writers.