Re: Arabic Support For Aspell
- To: Kevin Atkinson <kevin at atkinson dot dhs dot org>
- Subject: Re: Arabic Support For Aspell
- From: Mohammed Elzubeir <elzubeir at arabeyes dot org>
- Date: Sat, 17 Apr 2004 16:01:33 +0400
- Cc: developer at arabeyes dot org
- Organization: Arabeyes Project

On Sat, 2004-04-17 at 14:10, Kevin Atkinson wrote:
> I searched my mailbox and found some brief discussion.  Back then Aspell
> lacked Affix support or support for Unicode.  This has now changed.
> 
Excellent ;)
> Aspell supports Unicode, but internally it is still 8-bit.  So the first 
> order of business is to establish an internal encoding.  Is iso-8859-6 
> sufficient?  If not a new character set can be made up.  You can use up to 
> 210 characters (128 upper 8-bit, 30 control, 52 Latin letters).   If you 
> could tell me what parts of the Unicode block 0600-06FF Arabic needs for 
> words I can create a mapping for you.
> 
ISO 8859-6 [1] is sufficient for internal storage, so no 'special'
mapping is necessary.
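
One way to sanity-check that claim against a real word list is sketched
below. It assumes Python's standard `iso8859_6` codec as a stand-in for the
internal encoding; the helper name `unencodable_chars` and the sample words
are made up for illustration and are not part of Aspell or Duali.

```python
def unencodable_chars(words, encoding="iso8859_6"):
    """Return the set of characters in `words` that `encoding` cannot represent."""
    missing = set()
    for word in words:
        for ch in word:
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                # This character has no slot in the 8-bit encoding.
                missing.add(ch)
    return missing


# Illustrative input only: one Arabic word, one word containing
# U+067E (PEH), a letter that lies outside ISO 8859-6.
sample = ["\u0643\u062a\u0627\u0628", "\u067e\u062f\u0631"]
print(unencodable_chars(sample))
```

If the returned set is empty for the whole dictionary, ISO 8859-6 is enough;
anything it reports would need a slot in a custom mapping.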
> OK.  That looks a lot like Aspell affix code.  I believe Aspell can now 
> handle it.  However the affix data needs to be converted into a single 
> Affix file.  See 
> http://aspell.sourceforge.net/devel-doc/man/Affix-Compression.html.
> 
I just had a look at that. This looks like a _lot_ of manual work, on my
side at least. To give you a little background about Arabic, have a look
at the Duali wiki page [1] to see how the dictionary data is currently
used -- and I am open to suggestions.
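
For readers new to the idea, affix compression stores a stem plus flags
instead of every surface form. The sketch below shows only the general idea.
The rule table, the flag letters and the `stem/FLAGS` entry syntax are
simplified assumptions for illustration; this is not Aspell's actual affix
file format, nor Duali's data layout.

```python
# Hypothetical rule table: flag -> (kind, strip, add).
RULES = {
    "A": ("prefix", "", "\u0627\u0644"),  # add the definite article "al-"
    "B": ("suffix", "", "\u0647\u0627"),  # add the pronoun suffix "-ha"
}


def apply_rule(stem, rule):
    """Apply one affix rule to a stem; return None if the rule does not fit."""
    kind, strip, add = rule
    if kind == "prefix":
        if not stem.startswith(strip):
            return None
        return add + stem[len(strip):]
    if not stem.endswith(strip):
        return None
    return stem[:len(stem) - len(strip)] + add


def expand(entry, rules=RULES):
    """Expand a 'stem/FLAGS' entry into the stem plus each derived form."""
    stem, _, flags = entry.partition("/")
    forms = [stem]
    for flag in flags:
        form = apply_rule(stem, rules[flag])
        if form is not None:
            forms.append(form)
    return forms


# "kitab" (book) with both flags yields the stem and two derived forms.
for form in expand("\u0643\u062a\u0627\u0628/AB"):
    print(form)
```

Each rule is applied to the bare stem on its own here; combining a prefix
and a suffix on the same word would need an extra cross-product step.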
Regards,
Mohammed Elzubeir
[1] http://wiki.arabeyes.org/Duali