
Re: Farahidi - huh ?

--- Nadim Shaikli <shaikli at yahoo dot com> wrote:

> Abdulaziz, I stumbled on 'farahidi' [1] on the adawat project [2] page
> and for the life of me I can't remember what the heck that code is
> (without looking into the source).  Could you please describe it here
> so you can link that description to the project blurb entry.  Do
> __please__ also put something meaningful in its CVS "README" and "NEWS"
> files.

Yes yes I know. I admit I sort of forgot about that project completely,
as I was too engrossed by other stuff. Moreover, I recently learned
(and neglected to look into it) that Duali can already conjugate verbs,
or provides some sort of similar functionality, making this project
unnecessary. I thought I explained what the project is but I still get
confusion and question marks. Am I not explaining it correctly? :)

Ah well here goes:

Farahidi is an Arabic verb conjugator. A "Musarrif" if you will. As a
simple example, you can give it as input _any_ Arabic verb in its bare
triliteral (three-letter root) form, and you can specify that you want
this verb in, say, the present male plural form, and that's what you get
as output. If you give it the triliteral verb "ksr" and ask for the
present male plural of the verb you get "yaksirun". If you give it the
triliteral verb "nbt" and ask for the past female singular, you get
"nabatat" and so on and so forth.
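
To make the two examples above concrete, here is a minimal sketch in
Python over Latin transliteration. It is not Farahidi's actual code: the
PATTERNS table, the conjugate() function and the stem_vowel parameter
are names made up for illustration, and only the two forms mentioned
above are covered. The middle vowel of the present tense differs from
verb to verb, so it is passed in as an assumption rather than derived.

```python
# Toy templates over Latin transliteration (hypothetical, for illustration).
# c1, c2, c3 are the three root consonants; v is the present-stem vowel.
PATTERNS = {
    ("present", "male", "plural"): "ya{c1}{c2}{v}{c3}un",
    ("past", "female", "singular"): "{c1}a{c2}a{c3}at",
}


def conjugate(root, tense, gender, number, stem_vowel="i"):
    """Fill the template for (tense, gender, number) with a 3-letter root."""
    if len(root) != 3:
        raise ValueError("expected a three-consonant root")
    c1, c2, c3 = root
    template = PATTERNS[(tense, gender, number)]
    # Templates that have no {v} slot simply ignore the stem vowel.
    return template.format(c1=c1, c2=c2, c3=c3, v=stem_vowel)


print(conjugate("ksr", "present", "male", "plural"))   # yaksirun
print(conjugate("nbt", "past", "female", "singular"))  # nabatat
```

A real conjugator would need a full table of forms plus rules for weak
and irregular roots; the sketch only shows the root-plus-pattern idea.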

Is this explanation satisfactory? Or is it still confusing? I'll put a
better one in the README and NEWS files soon.

> Do also note a TODO section in case there might be those interested
> in helping out.

Very well. In passing, I feel the functionality the project provides is
a mere novelty, not really that useful. However, if we could reverse the
process, if we could get the bare triliteral verb out of some other
form, that would be immensely helpful for things like searching. Why?
I'll tell you why.

An Arabic text contains verbs in many forms that all come from the same
root but look nothing like each other. A smart English search engine
can tell that "get" is related to "gets", but no Arabic search engine
can tell that "ksr" is related to "yaksirun". So if you're searching a
text for the former and it only contains the latter, you get 0 results.
However, if you "rootify" the verbs of the text before searching, you
find it. Clear? :)
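
The "rootify before searching" idea can be sketched roughly like this,
again in Python over Latin transliteration. Everything here is
hypothetical: INVERSE_PATTERNS, rootify() and search() are illustrative
names, the two regular expressions only undo the two forms used as
examples earlier, and a real stemmer would need many more rules plus a
lexicon to resolve ambiguous matches.

```python
import re

# Hypothetical inverse patterns: each one captures the three root
# consonants out of a single conjugated form.
INVERSE_PATTERNS = [
    re.compile(r"^ya(\w)(\w)[aiu](\w)un$"),  # present male plural
    re.compile(r"^(\w)a(\w)a(\w)at$"),       # past female singular
]


def rootify(word):
    """Return the three-letter root if a pattern matches, else the word."""
    for pattern in INVERSE_PATTERNS:
        match = pattern.match(word)
        if match:
            return "".join(match.groups())
    return word


def search(text, root):
    """Return the words of text whose rootified form equals root."""
    return [word for word in text.split() if rootify(word) == root]


text = "nabatat yaksirun"
print("ksr" in text.split())  # False: a plain word search finds nothing
print(search(text, "ksr"))    # ['yaksirun']
print(search(text, "nbt"))    # ['nabatat']
```

The plain lookup misses because "ksr" never appears as a word in the
text, while the rootified search reduces each word to its root first.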

> In passing, all those files had DOS gunk in 'em so I converted all
> via 'dos2unix'.  Most of the files are also under the evil CP-1256
> encoding while vrb_rule.c is a mystery to me (I have NO idea what
> encoding it's using) - so I opted not to touch 'em or convert 'em in
> hope of you taking care of that so as to get 'em all be under the
> proper UTF-8.

What's worse, there are some files missing, and I just can't get a hold
of the original author. I'm talking about the database files. I might
have to rewrite them myself, once I know what they were for exactly.

> BTW: does the name (as odd as it is to my ears :-) mean anything ?

Yeah, the name of a fellow. Following the example of Duali, this was
named after an Arabic linguist also.

I'm really sorry if this project seems like a big mess to everyone. I
just didn't give it any priority. Hopefully I'll be able to do things
with it once I know how to go about it.


