Re: arabic ispell
- To: general at arabeyes dot org, "Nadav Har'El" <nyh at math dot technion dot ac dot il>
- Subject: Re: arabic ispell
- From: Isam Bayazidi <bayazidi at accessme dot com>
- Date: Thu, 18 Oct 2001 16:18:13 +0200
- Cc: Geoff Kuenning <geoff at cs dot hmc dot edu>
On Thursday 18 October 2001 2:08pm, Nadav Har'El wrote:
> Unlike languages like (say) English, where you have a very small number of
> forms to each noun or verb, Hebrew and Arabic have complex conjugation and
> inflection rules based on 3-letter "roots", rather than basewords and
> affixes. Creating a word list by hand, where each root will have dozens of
> words derived from it, is a sure way to make mistakes and forgetting many
> of the words.
> Instead I tried to write a program which takes a base noun and inflects it
> in all possible ways (this is not easy, because there are nearly a hundred
> cases on how to do that in Hebrew) and generates a "full" list of nouns.
> Dan Kenigsberg then did the same thing for verbs. We then added many
> other words we found in online newspapers, and the like. But our word list
> is *very* far from being complete: it doesn't deal with all cases yet, and
> the base-word list is very short.
Well, Arabic is the same, but if we depend on base-word derivation it will
get very complex. Checking against a list of words is a better choice,
since not all base words have all the possible derivations.
> There's another problem I faced and till this day I don't know how to fix
> (which makes this whole Hebrew word list unusable): In Hebrew, the
> particles (like the English and, or, at, in, the, etc.) are one-letter
> (mem, kaf, bet, etc.) put in the beginning of the word. So I would like
> ispell to mark any valid word with a Bet (say) in front of it as valid
> also; So far I haven't been able to do so, even though I did try to create
> a matching hebrew.aff file - I probably did something wrong.
Same problem in Arabic. Repeating every word in the dictionary with every
possible particle is worthless, so a different approach should be thought
of, especially since the letters used as particles can also appear as
ordinary letters of normal words, not only in front of words as particles.
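
To make the idea concrete, here is a rough sketch of one possible approach
(only an illustration, not how ispell or its .aff files work): keep the
word list free of particles, and at check time accept a word if it is in
the list as-is, or if removing one leading particle letter leaves a listed
word. The particle set, the tiny word list, and the name is_valid are all
made up for the example, and it handles a single particle only.

```python
# Hypothetical toy data: a few one-letter Arabic particles and a tiny word list.
PARTICLES = {"و", "ب", "ل", "ف", "ك"}
WORDS = {"كتاب", "بيت", "ولد"}


def is_valid(word, words=WORDS, particles=PARTICLES):
    """Accept a listed word, or a listed word preceded by one particle letter."""
    # Check the whole word first, so ordinary words that merely start with
    # a particle letter are accepted without any stripping.
    if word in words:
        return True
    # Otherwise try removing a single leading particle letter.
    return len(word) > 1 and word[0] in particles and word[1:] in words
```

Checking the whole word first is what copes with the ambiguity: a listed
word that happens to begin with a particle letter is accepted directly,
while the same letter in front of another listed word is treated as a
particle. The cost is that any listed word is accepted with any particle
in front of it, whether or not that combination is really used.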
--
Yours
Isam Bayazidi
Amman- Jordan