Re: A (too huge) Arabic word-list (with prefixes) for spell-checkers
- To: Development Discussions <developer at arabeyes dot org>
- Subject: Re: A (too huge) Arabic word-list (with prefixes) for spell-checkers
- From: Dan Kenigsberg <danken at cs dot technion dot ac dot il>
- Date: Fri, 19 May 2006 08:52:14 +0300
- Hebrew-date: 21 Iyyar 5766
- User-agent: Mutt/1.4.1i
In my opinion, consulting a printed dictionary is perfectly legal, but IANAL.
You can download the generated list, 28 MB (compressed to 6 MB), from
http://ivrix.org.il/projects/arabic/ooo-ar-0.1.zip . Note that it is encoded
in ISO-8859-6, and that each word is annotated with a letter signifying its
legal prefixes.
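For anyone who wants to inspect the list on a system that does not use
ISO-8859-6 by default, here is a minimal Python sketch of reading such a file.
The exact line layout is an assumption: I am guessing a MySpell-style
"word/LETTER" entry per line, and the function name parse_wordlist and the
sample data are made up for illustration, so check them against the real file
before relying on this.

```python
def parse_wordlist(raw: bytes, sep: str = "/"):
    """Decode ISO-8859-6 bytes and split each line into (word, prefix_flag).

    Assumption: one entry per line, written as word + sep + flag letter.
    Lines without the separator yield an empty flag.
    """
    entries = []
    # Split on the raw bytes first, so only \n and \r count as line breaks.
    for raw_line in raw.splitlines():
        line = raw_line.decode("iso8859-6").strip()
        if not line:
            continue  # skip blank lines
        word, _, flag = line.partition(sep)
        entries.append((word, flag))
    return entries


# Hypothetical sample data, not taken from the actual list.
sample = "كتاب/A\nقلم/B\n".encode("iso8859-6")
entries = parse_wordlist(sample)
```

Each word comes back as an ordinary Unicode string, so it can be written out
again in UTF-8 with word.encode("utf-8") if that is easier to proofread.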
Having a GUI would be great. In fact, I created something similar for Hebrew
verbs. But it would be utterly useless without someone who cares about Arabic
and free software to do the actual proofreading.
On Fri, May 19, 2006 at 10:26:52AM +0800, Meor Ridzuan Meor Yahaya wrote:
> Dan,
> Thanks for the clarification. Just curious, how big is the file for the
> generated wordlist? I can't generate it yet because I'm working under
> XP, so if it is not that big, I would like to
> download it instead.
> Anyway, I'm thinking it may be a good idea to create a GUI that
> will make it easier to go through all the stem words and the generated
> words one by one for proofreading. It would show the stem word, the
> morphological and grammatical category applied to it, and the list of
> generated words. I think that would be easier to proofread. Or how
> about someone comparing it with Lane's Arabic lexicon (I don't have a
> copy of this)? Will it cause legal issues later? We would just be
> comparing, not copying.
>
> Regards.
--
Dan Kenigsberg http://www.cs.technion.ac.il/~danken ICQ 162180901