
Re: Fwd: Arabic Spellchecker for OpenOffice




Quoting Youcef Rahal <y dot rahal at gmail dot com>:
From: Jabs <jabrafghneim at gmail dot com>
Hi, what kind of help do you need to build an Arabic spellchecker for
OpenOffice 2.0? Let me know.
Jabra

I've looked into this topic but unfortunately never got around to producing
a workable solution.

Mohamed Sameer was also attempting to tackle this; check out
http://baghdad.foolab.org/

I've written a short paper on one available approach, the one I thought
best to follow after some recommendations from people around here:
http://khalifa.ws/files/public/arabic-dictionary.txt

My plan was to create an 'AFFIX dictionary' (a list of base words plus
prefix/suffix rules) to be used with popular spell checkers like Aspell and
OpenOffice.org. Unfortunately, I started working on it less and less, until
I stopped.

If you'd be willing to work on the Affix dictionary, I'd be willing to
give you all the help I can.

Other than the AFFIX dictionary approach, there is a quick and dirty
solution: create a text file containing a list of Arabic words and feed
that to Aspell/MySpell. Such a dictionary would be very slow for
Aspell/MySpell, but it would work.

To get such a file, you could write a small script in your favourite
language that parses some data source and writes all distinct words to a
single output file, with each word on a line of its own.
Known data sources include Arabeyes' Wordlist project and Duali's
Buckwalter dataset.
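
As a rough illustration of the kind of script I mean, here is a minimal
Python sketch. It assumes the data source is a plain UTF-8 text file, treats
any run of basic Arabic letters as a word, and strips tashkeel and tatweel
first. The function names and the character ranges are my own choices for
the example, not anything taken from the Wordlist project or the Buckwalter
data, so adjust them to whatever source you actually use.

```python
import re
import sys

# Assumption: a "word" is a run of basic Arabic letters (hamza through yeh).
ARABIC_WORD = re.compile(r"[\u0621-\u063A\u0641-\u064A]+")
# Tatweel and the short-vowel marks (tashkeel), removed before matching.
STRIP = re.compile(r"[\u0640\u064B-\u0652]")


def distinct_words(lines):
    """Return the sorted distinct Arabic words found in an iterable of lines."""
    words = set()
    for line in lines:
        words.update(ARABIC_WORD.findall(STRIP.sub("", line)))
    return sorted(words)


def main(in_path, out_path):
    # Read the source text and write one distinct word per line.
    with open(in_path, encoding="utf-8") as src:
        words = distinct_words(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(words) + "\n")


if __name__ == "__main__":
    # Hypothetical usage: python wordlist.py source.txt words.txt
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

The resulting file is the flat word list described above; how it gets
compiled into an Aspell or MySpell dictionary is a separate step.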

If you're interested, let me know on the 'developer' mailing list at
Arabeyes.org:
http://lists.arabeyes.org/mailman/listinfo/developer



-- Salam, Ahmad Khalifa