Re: more on Duali
- To: developer at arabeyes dot org
- Subject: Re: more on Duali
- From: Mohammed Elzubeir <elzubeir at arabeyes dot org>
- Date: Tue, 13 Aug 2002 10:25:14 -0500
- Cc: Kareem Darwish <kareem at glue dot umd dot edu>
- User-agent: Mutt/1.3.28i
On Tue, Aug 13, 2002 at 12:53:20PM -0700, Nadim Shaikli wrote:
>
> I'm certainly no expert in this field (it's been way TOO long since I've
> looked/learned all these interesting rules), but I would tend to think
> that generating a list of root verbs (be it 3 characters or more even)
The vast majority of roots will be 3 letters; some are 4.
> and define a set of rules for prefix and suffix ought to work. One thing
> to note - if memory serves there are lots of exceptions to all the rules
> in Arabic grammar and so I would not be surprised if the prefix/suffix
> rules you come up with will not hold true for all the "root" verbs; meaning
The problem is: what are those exceptions? ;) I can catch and fix some of them
from my long-forgotten Arabic background, or spot them by reading through the
generated output. I just don't see that as an effective method.
> there will likely be 4-5 prefix/suffix groupings and so it then becomes an
> exercise in data-collection (ie. how to generate this root verb list and
Yes, I can categorize them (the same way I do with the derivative templates).
As with the derivatives, it's a matter of mathematics, but for a prefix/suffix
categorization I need more Arabic background. For example, Kareem's
morpho3[1] (which does a lot of this already) has prefix/suffix lists, and it
has been the best starting point I could find. What I still don't know is
which affixes go together and which don't, and which prefix makes a given
suffix impossible within the same word.
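
To make the "what goes together" question concrete, here is a minimal sketch
of affix stripping with a compatibility table. Everything in it is an
assumption for illustration: the affix lists, the single incompatible pair,
the function name and the use of Buckwalter-style ASCII transliteration are
made up here and are not taken from morpho3 or Duali.

```python
# Hypothetical affix lists in Buckwalter-style transliteration.
# "" means "no prefix" / "no suffix".
PREFIXES = ["", "Al", "y"]   # Al = definite article, y = imperfect prefix
SUFFIXES = ["", "wn", "wA"]  # wn and wA = plural endings

# Assumed (prefix, suffix) pairs that cannot share one word.
# Here: the nominal article "Al" with the verbal ending "wA".
INCOMPATIBLE = {("Al", "wA")}


def candidate_splits(word, roots):
    """Return every (prefix, stem, suffix) split of `word` whose stem is in
    `roots` and whose prefix/suffix pair is not listed as incompatible."""
    results = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        for suffix in SUFFIXES:
            if not word.endswith(suffix):
                continue
            # The affixes must leave a non-empty stem between them.
            if len(prefix) + len(suffix) >= len(word):
                continue
            stem = word[len(prefix):len(word) - len(suffix)]
            if stem in roots and (prefix, suffix) not in INCOMPATIBLE:
                results.append((prefix, stem, suffix))
    return results


if __name__ == "__main__":
    roots = {"ktb"}
    print(candidate_splits("yktbwn", roots))   # accepted split
    print(candidate_splits("AlktbwA", roots))  # rejected by the table
```

The hard part stays exactly where it was: someone with the Arabic background
has to fill in the incompatible pairs. The code only shows that once such a
table exists, applying it is cheap.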
> groupings). Once the basics are in place, I'd suggest grabbing a number
> of Arabic websites' data and running duali on it in order to generate a
> wider data set.
>
I don't really want to go there, as I can't trust any site to have all-correct
spelling. However, I did entertain the thought of using the Quran text for
testing purposes. Are those XML files complete, M.Yousif?
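
A rough sketch of what such a test run could look like, assuming the text has
already been extracted from the XML into a plain string and that the checker
is available as a function returning True or False. The names
`unrecognized_words` and `is_recognized` are hypothetical, not part of Duali.

```python
from collections import Counter


def unrecognized_words(text, is_recognized):
    """Count the whitespace-separated words that the checker rejects.

    Returns (word, count) pairs, most frequent first, so the misses that
    matter most show up at the top of the list.
    """
    misses = Counter()
    for word in text.split():
        if not is_recognized(word):
            misses[word] += 1
    return misses.most_common()


if __name__ == "__main__":
    # Stand-in checker: a plain set lookup instead of a real analyzer.
    known = {"ktb", "qlm"}
    sample = "ktb xyz qlm xyz abc"
    for word, count in unrecognized_words(sample, lambda w: w in known):
        print(word, count)
```

With a trusted text, every word in that list points at a gap in the rules
rather than at a typo in the source.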
References:
[1] Morpho3 - http://www.glue.umd.edu/~kareem/hamlet/morph/
later
--
-------------------------------------------------------
| Mohammed Elzubeir | Visit us at: |
| | http://www.arabeyes.org/ |
| Arabeyes Project | Homepage: |
| Unix the 'right' way | http://fakkir.net/~elzubeir/|
-------------------------------------------------------