Re: Arabic Support For Aspell
- To: Mohammed Elzubeir <elzubeir at arabeyes dot org>
- Subject: Re: Arabic Support For Aspell
- From: Kevin Atkinson <kevin at atkinson dot dhs dot org>
- Date: Sat, 17 Apr 2004 08:19:15 -0400 (EDT)
- Cc: developer at arabeyes dot org
On Sat, 17 Apr 2004, Mohammed Elzubeir wrote:
> On Sat, 2004-04-17 at 14:10, Kevin Atkinson wrote:
>
> > I searched my mailbox and found some brief discussion. Back then Aspell
> > lacked Affix support or support for Unicode. This has now changed.
> >
>
> Excellent ;)
>
> > Aspell supports Unicode, but internally it is still 8-bit. So the first
> > order of business is to establish an internal encoding. Is iso-8859-6
> > sufficient? If not, a new character set can be made up. You can use up to
> > 210 characters (128 upper 8-bit, 30 control, 52 Latin letters). If you
> > could tell me what parts of the Unicode block 0600-06FF Arabic needs for
> > words I can create a mapping for you.
> >
>
> The ISO 8859-6 [1] is sufficient for internal storage, so no 'special'
> mapping is necessary.
Ok thanks.
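
One way to double-check that ISO 8859-6 really covers a word list is to try
encoding every character and collect the ones that fail. This is only a rough
sketch: the function name and the sample words are made up for illustration,
and it relies on Python's built-in `iso8859_6` codec rather than anything in
Aspell itself.

```python
def unencodable_chars(words, encoding="iso8859_6"):
    """Return the set of characters in `words` that `encoding` cannot represent."""
    missing = set()
    for word in words:
        for ch in word:
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                missing.add(ch)
    return missing


# Hypothetical sample: the first word uses only letters found in ISO 8859-6,
# the second contains U+067E (peh), which lies outside it.
sample = ["\u0643\u062a\u0627\u0628", "\u067e\u062f\u0631"]
print(sorted(f"U+{ord(c):04X}" for c in unencodable_chars(sample)))
# prints ['U+067E']
```

An empty result for the full word list would suggest that no custom mapping is
needed.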
> > OK. That looks a lot like Aspell affix code. I believe Aspell can now
> > handle it. However the affix data needs to be converted into a single
> > Affix file. See
> > http://aspell.sourceforge.net/devel-doc/man/Affix-Compression.html.
>
> I just had a look at that. This looks like a _lot_ of manual work, on my
> side at least. To give you a little background about Arabic, have a look at
> the Duali wiki page [1] to see how the dictionary data is currently used
> -- and I am open to suggestions.
Well, maybe you can automate the conversion. I am not willing to add special
code for Arabic if the current code is sufficient. Sorry.
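
As a very rough sketch of what such automation might look like, the script
below turns a list of rules into affix-file text. Everything about the input
is an assumption: the tuple layout, the flag letters, and the two sample
prefixes are invented for illustration and do not reflect how the Duali data
is actually stored. It also assumes the MySpell-style layout described in the
Affix-Compression document (a header line giving the flag, cross-product
setting, and rule count, followed by one line per rule), and it hard-codes
the cross-product field to Y.

```python
from collections import defaultdict


def render_affix_rules(rules):
    """Render (kind, flag, strip, add, condition) tuples as affix-file text.

    kind is "PFX" or "SFX". An empty strip or add string is written as "0",
    and an empty condition is written as "." (matches any word).
    """
    groups = defaultdict(list)
    for kind, flag, strip, add, condition in rules:
        if kind not in ("PFX", "SFX"):
            raise ValueError(f"unknown affix kind: {kind!r}")
        groups[(kind, flag)].append((strip or "0", add or "0", condition or "."))

    lines = []
    for (kind, flag), entries in groups.items():
        # Header: kind, flag, cross-product (assumed Y), number of rules.
        lines.append(f"{kind} {flag} Y {len(entries)}")
        for strip, add, condition in entries:
            lines.append(f"{kind} {flag} {strip} {add} {condition}")
    return "\n".join(lines) + "\n"


# Hypothetical rules: the prefixes "al-" (U+0627 U+0644) and "wa-" (U+0648).
rules = [
    ("PFX", "A", "", "\u0627\u0644", ""),
    ("PFX", "W", "", "\u0648", ""),
]
print(render_affix_rules(rules), end="")
```

The output would still need to be converted to the dictionary's encoding and
checked by hand against the Affix-Compression document before use.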
--
http://kevin.atkinson.dhs.org