Re: Arabic spellchecker

A question from a beginner in Arabic. I feel that the idea of an AFFIX (prefix
or suffix) is more suited to Latin-based languages. Would it not be possible to
work instead with the forms that are (as I understand it) the basis of Arabic?
I remember trying to learn the first form, the second form, and patterns such as
faaEl, faEl, and so on. But of course that may only seem easier when you're a
beginner.
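
As a rough illustration of the "forms" idea, here is a minimal Python sketch
that fills a pattern template with the three consonants of a root. The
function name apply_pattern and the slot notation (1, 2, 3 standing for the
first, second and third root consonant) are assumptions made up for this
example; they are not part of Aspell or Myspell. The transliterations KTB,
KETAB and MAKTAB are the ones used in the quoted message.

```python
def apply_pattern(root, pattern):
    """Fill the numbered slots of `pattern` with the consonants of `root`.

    Slot "1" takes the first root consonant, "2" the second, "3" the third;
    every other character in the pattern is copied unchanged.
    """
    if len(root) != 3:
        raise ValueError("this sketch only handles three-consonant roots")
    slots = {"1": root[0], "2": root[1], "3": root[2]}
    return "".join(slots.get(ch, ch) for ch in pattern)


# Hypothetical templates, written in the transliteration used in this thread.
PATTERNS = ["123", "1E2A3", "MA12A3"]

for pattern in PATTERNS:
    print(pattern, "->", apply_pattern("KTB", pattern))
```

With the root KTB, the three templates give KTB, KETAB and MAKTAB. A real
checker would have to run this in reverse (recover root and pattern from a
typed word), which this sketch does not attempt.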

Quoting Ahmad Khalifa <ahmad at khalifa dot ws>:

> Abdalla Alothman wrote:
> > Asalamu alaikum.
> Salam,
> > I did something exactly the same way because it was feasible. ;)
> > 
> > I agree the approach is far from being organized.
> You mean you already have such a wordlist? I would be interested
> in taking a look at it, if you don't mind. I would like to see how it
> performs in OO.o.
> >>This is where its difficulty lies. Defining the AFFIX rules and
> >>writing a *flagged* wordlist.
> > This is a real problem.
> > If:
> > رءى
> > is the root for:
> > أريناك
> > finding a pragmatic way, or a decent pattern, could be difficult. Not
> > to mention that the AFFIX rules would be useless, in my humble opinion
> > (don't let me put you down).
> But consider AFFIX rules augmented with INFIX?! :)
> Not just PREfix and SUFfix, but also INfix, which is insertion in the
> middle by means of an index. Of course the INFIX approach would be costly to
> adopt, as we'd have to submit patches to Aspell/Myspell and get INFIX
> widely accepted.
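
To make "insertion in the middle by means of index" concrete, here is a
minimal Python sketch. The function apply_infix is hypothetical and only
illustrates the idea; neither Aspell nor Myspell is claimed to offer anything
like it. The Arabic letters are written as Unicode escapes so the example
does not depend on how a mail client renders right-to-left text.

```python
def apply_infix(word, index, infix):
    """Insert `infix` into `word` just before the character at `index`."""
    if not 0 < index < len(word):
        raise ValueError("an infix must land strictly inside the word")
    return word[:index] + infix + word[index:]


# kaf, taa, baa: the root written KTB in this thread.
ROOT_KTB = "\u0643\u062a\u0628"
ALIF = "\u0627"

# Inserting an alif after the second root letter gives the spelling of kitaab.
kitaab = apply_infix(ROOT_KTB, 2, ALIF)
print(kitaab)
```

An INFIX rule would then be a (flag, index, letters) triple stored next to
the PREFIX and SUFFIX rules; the index is the only new ingredient.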
> > For fun, consider modern Arabic terms -- one that I can't forget was
> > "maykanat" (automating). The root is MKN (e.g., wallatheena inn
> > makkannaahum fil ardh...).
> > Problem is that the yaa comes exactly in the middle of the root. Same goes
> > for kitaab, the alif comes in the middle of the root. If you could solve
> > such cases, I would be very much interested to see your work.
> The way I see it, we have two options.
> 1- Add INFIX to the AFFIX rules. That way you can describe KETAB by
>     flagging the root KTB
> 2- Add KETAB as an entry of its own beside KTB. That way you can combine
>     KETAB easily with the 'AL' prefix rule, PLUS you still get only one
>     entry for the 15 entries of KTB.
> I am in favour of the second approach. It's faster to adopt, does not
> cost much, and would make it easier to define rules for NOUNS.
> Its only downside is that for most root verbs that can be derived into
> nouns, you get 2 or 3 entries: 1 for the verb and its derivatives, 1 for
> the noun KETAB, and 1 for the noun MAKTAB.
> I think 3 entries per root beats 17 entries, no?
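
The second option can be sketched in a few lines of Python. Everything here
is an assumption for illustration: the flag letters N and V, the RULES table
and the expand function are invented, and this is not the Aspell/Myspell
file format. Only the 'AL' prefix rule for nouns comes from the message
above; the verb rule is a placeholder that shows the mechanism.

```python
# Hypothetical flag -> rule table. "N" carries the 'AL' prefix rule mentioned
# above; "V" is an invented verb rule that only illustrates the mechanism.
RULES = {
    "N": [("prefix", "AL")],
    "V": [("prefix", "Y")],
}

# Option 2: the root and its derived nouns are separate flagged entries.
WORDLIST = {"KTB": "V", "KETAB": "N", "MAKTAB": "N"}


def expand(wordlist, rules):
    """Return every surface form the flagged word list accepts."""
    forms = set()
    for stem, flags in wordlist.items():
        forms.add(stem)  # the bare entry is always accepted
        for flag in flags:
            for kind, affix in rules[flag]:
                forms.add(affix + stem if kind == "prefix" else stem + affix)
    return forms


accepted = expand(WORDLIST, RULES)
print(len(WORDLIST), "entries ->", sorted(accepted))
```

Three flagged entries cover KTB, KETAB and MAKTAB plus their affixed forms,
without needing any INFIX support in the checker.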
> Right now, ammar is working on elzubeir's "Arabic Grammar Rules" document,
> http://cvs.arabeyes.org/viewcvs/projects/duali/doc/arabic-grammar
> I think it's the key to developing all the AFFIX rules, as we need to
> formally categorize ALL the words of the Arabic language to be able to
> write the AFFIX rules.
> When the document is finished, we can better estimate the need for INFIX.
> Please let me know what you think of the two approaches above.
> > I wish you good luck insha-Allah.
> Thank you.
> -- 
> Salam,
> Ahmad Khalifa
> _______________________________________________
> Developer mailing list
> Developer at arabeyes dot org
> http://lists.arabeyes.org/mailman/listinfo/developer