
Re: Fwd: Arabic Spellchecker for OpenOffice



Hi Ahmad, thanks for the quick response. I already have such a script and the necessary word list developed. I can share these with you guys if you want to go that route. If you have resources that would help in developing an alternate method, let me know. I will prepare the necessary list and send it to you by Monday.

Jabra

On 3/23/06, Ahmad Khalifa <ahmad at khalifa dot ws> wrote:

Quoting Youcef Rahal <y dot rahal at gmail dot com>:
> From: Jabs <jabrafghneim at gmail dot com>
> Hi, what kind of help do you need to build an Arabic spellchecker for
> OpenOffice 2.0? Let me know.
> Jabra

I've done some investigation into this topic but unfortunately didn't
get around to creating a feasible solution.

Mohamed Sameer was also attempting to tackle this; check out
http://baghdad.foolab.org/

I've written a short paper on one available approach, the one I thought
best to follow after some recommendations from people around here:
http://khalifa.ws/files/public/arabic-dictionary.txt

My plan was to create a dictionary called 'AFFIX Dictionary' to be used
with popular spell checkers like Aspell and OpenOffice.org. Unfortunately,
I started working on it less and less, until I stopped.

If you'd be willing to work on the Affix dictionary, I'd be willing to
give you all the help I can.

Other than the AFFIX dictionary approach, there is a quick and dirty
solution: create a text file containing a list of Arabic words and feed
that to Aspell/Myspell. That dictionary would be very slow for
Aspell/Myspell, but it would work.

To get such a file, you could write a script in your favourite language
that parses some data source and writes all distinct words to one output
file, with each word on a line of its own.
Some of the known data sources around are Arabeyes' Wordlist project and
Duali's Buckwalter dataset.
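
A script like the one described above could be sketched roughly as follows
in Python. This is only an illustration under some assumptions: the input is
taken to be plain UTF-8 text, short vowel marks and tatweel are stripped so
that vocalized and unvocalized spellings collapse into one entry, and the
names distinct_words and write_word_list are made up for this example. It
does not reflect the actual format of the Arabeyes or Buckwalter data, which
would need their own parsing.

```python
import re

# Runs of Arabic letters: hamza..ghain (U+0621-U+063A), feh..yeh (U+0641-U+064A).
ARABIC_WORD = re.compile(r"[\u0621-\u063A\u0641-\u064A]+")
# Tatweel (U+0640) and the short vowel marks fathatan..sukun (U+064B-U+0652).
STRIP = re.compile(r"[\u0640\u064B-\u0652]")


def distinct_words(lines):
    """Return the sorted list of distinct Arabic words found in `lines`."""
    words = set()
    for line in lines:
        # Assumption: the word list should be unvocalized, so drop the marks.
        cleaned = STRIP.sub("", line)
        words.update(ARABIC_WORD.findall(cleaned))
    return sorted(words)


def write_word_list(in_path, out_path):
    """Read UTF-8 text from in_path, write one distinct word per line."""
    with open(in_path, encoding="utf-8") as src:
        words = distinct_words(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        for word in words:
            dst.write(word + "\n")
    return len(words)
```

Calling write_word_list("corpus.txt", "words.txt") (both file names are
placeholders) would produce the one-word-per-line file mentioned above.
Whether to strip the vowel marks is a design choice; a spell checker that
should accept vocalized text would need a different normalization.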

If you're interested, let me know on the 'developer' mailing list at
Arabeyes.org:
http://lists.arabeyes.org/mailman/listinfo/developer


--
Salam,
Ahmad Khalifa




--
Fortuna ventus mens ut est paratus.
************************************************************************************
Language Hacker. Creator of Localized or translated culturally aware services, content, technology, training and experiences targeted to individuals and professional services firms with projects related to the Middle East region.