
Re: [aspell-user] Aspell Arabic support



>>> Kevin Atkinson 01/23/02 04:46AM wrote
>>Sorry for not responding.  When people ask hard questions I tend
>>to mark the post as important in the hope that I will get back to it.
>>Unfortunately I don't always.  BTW, I prefer that these types of things get
>>posted to the mailing list, as it will create a public record of our
>>conversation.
Ah, so it is a hard question ;) At least I wouldn't feel bad for not understanding
things then ;)

> Using 8-bit characters internally as far as Arabic is concerned is not a
> problem, although you would gain a lot more accuracy by considering
> double-width characters.

>>How is that?
 
Hrmm.. I take that back, I wasn't thinking ;) The ISO8859-6 charset is completely
sufficient for internal storage, but not enough to display.


>>Are you talking in terms of affix compression?
Well, I am not sure how much affix compression helps with Arabic. So let me give
an example and you can tell me (since I am a little confused about that as well).
 
Let's take a verb root: ktb --> pronounced 'kataba'
 
k = root letter 1
t = root letter 2
b = root letter 3
 
In Arabic, to represent a 3-letter verb root (the most common kind; four-letter
roots are rare), we symbolize its letters with 'FEH','AIN','LAM' (pronounced fa3al)
[where the 3 is the ain, excuse my not-so-professional transliteration].
 
Derived from 'ktb' are many words, like:
 
mktb (pronounced 'maktab') --> represented as 'mf3l'
 
so far so good.. only adding a prefix.
 
ktabT (pronounced 'kitaba', where T is a 'TEH MARBOOTA') --> represented as 'f3ala'
    (pronounced as 'fi3ala')
 
That adds an 'ALEF' (a) to the middle of the root verb as well as a 'TEH MARBOOTA' (suffix).
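
To make the template idea concrete, here is a minimal sketch in Python. It is only
an illustration under my own assumptions, not anything Aspell does: the helper name
apply_template is hypothetical, and I write the second template as 'f3alT' (the
'f3ala' pattern with the TEH MARBOOTA spelled out as T) so that it yields 'ktabT'.
Note that in this transliteration a template cannot contain a literal 'f' or 'l',
since those letters are reserved for the root slots.

```python
# Hypothetical helper for illustration; not part of Aspell or Pspell.
# Template letters 'f', '3', 'l' stand for root letters 1, 2, 3 (FEH, AIN, LAM).
SLOTS = {"f": 0, "3": 1, "l": 2}


def apply_template(root, template):
    """Fill a fa3al-style template with the letters of a 3-letter root."""
    if len(root) != 3:
        raise ValueError("this sketch only handles 3-letter roots")
    # Slot letters are swapped for root letters; everything else is kept as-is.
    return "".join(root[SLOTS[ch]] if ch in SLOTS else ch for ch in template)


print(apply_template("ktb", "mf3l"))   # mktb
print(apply_template("ktb", "f3alT"))  # ktabT
```

The point of the sketch is that one root plus a small set of templates generates
many words, and that some templates insert letters between the root letters.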
 
And the list goes on and on ;) And that is only for one verb. Most of the Arabic language is
derived from root verbs such as this, and that is what makes it such a beautiful language (yet so
complex when computerized) ;) Words like office, book, writer, library, etc. are all derived from 'ktb'.
 
My question is, how does affix compression come into play here?
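
To show where my doubt comes from, here is a small Python sketch. It assumes, and
this is my assumption rather than a statement about how Aspell's affix compression
actually works, that an affix rule can only add letters before or after an unchanged
base. The function name split_affixes is hypothetical.

```python
# Hypothetical check for illustration; it does not reflect Aspell's real rules.
def split_affixes(base, word):
    """Return (prefix, suffix) if word == prefix + base + suffix, else None."""
    start = word.find(base)
    if start < 0:
        # The base does not appear unbroken, so no prefix/suffix pair fits.
        return None
    return word[:start], word[start + len(base):]


print(split_affixes("ktb", "mktb"))   # ('m', '')
print(split_affixes("ktb", "ktabT"))  # None
```

Under that assumption 'mktb' is just 'ktb' with a prefix, but 'ktabT' cannot be
reached from 'ktb' because the ALEF sits between the root letters.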

>>OK. You got my attention ;)
Great, because I will need a lot of help ;) But I am very motivated and will spend hours upon
hours to get this working ;)

>>The current released version of Aspell/Pspell is now dead as far as
>>development is concerned; all of the new development is taking place on
>>the "New Aspell", which can be found at http://aspell.net/.  Browse the
>>announcement archive for more information, as I have not set up a real web
>>page yet and what is currently there is not up to date.
I see. It would be nice to kill the other pages or simply re-direct to aspell.net.. then
again, I found the right place, others can search <evil grin>
 
 
Thanks
Mohammed Elzubeir