Re: Farahidi - huh ?
- To: Development Discussions <developer at arabeyes dot org>
- Subject: Re: Farahidi - huh ?
- From: Nadim Shaikli <shaikli at yahoo dot com>
- Date: Fri, 20 Aug 2004 23:55:07 -0700 (PDT)
--- Abdulaziz Al-Arfaj <alarfaj0 at yahoo dot com> wrote:
> --- Nadim Shaikli <shaikli at yahoo dot com> wrote:
>
> > Abdulaziz, I stumbled on 'farahidi' [1] on the adawat project [2] page
> > and for the life of me I can't remember what the heck that code is
> > (without looking into the source). Could you please describe it here
> > so you can link that description to the project blurb entry. Do
> > __please__ also put something meaningful in its CVS "README" and
> > "NEWS" files.
>
> Yes yes I know. I admit I sort of forgot about that project completely,
> as I was too engrossed by other stuff. Moreover, I recently learned
> (and neglected to look into it) that Duali can already conjugate verbs,
> or provides some sort of similar functionality, making this project
> unnecessary. I thought I explained what the project is but I still get
> confusion and question marks. Am I not explaining it correctly? :)
You might have explained it, but it's not there in the README files (in
CVS) or on the Adawat project page (you should link your previous mail
to a 'Farahidi' blurb).
> Is this explanation satisfactory? Or is it still confusing? I'll put a
> better one on the README and NEWS files soon.
Sounds fine. Do please populate the above-mentioned files.
> > Do also note a TODO section in case there might be those interested
> > in helping out.
>
> Very well, in passing I feel the functionality the project provides is a
> mere novelty. Not really that useful. However, if we could reverse the
> process, if we could get the bare trinary verb out of some other form
> that would be immensely helpful in searching and stuff. Why? I'll tell
> you why.
>
> An Arabic text contains verbs in many forms that all come from the same
> root but look nothing like each other. A smart English search engine
> can tell that "get" is related to "gets", but no Arabic search engine
> can tell that "ksr" is related to "yaksirun". So if you're searching a
> text for the former and it only contains the latter, you get 0 results.
> However if you "rootify" the verbs of the text before searching, you
> find it. Clear? :)
Good point, yet it might be more difficult than what you note. I'm
not sure how "smart search engines" do these searches in Latin and
if past/future tenses are also searched when one enters a present
tense verb for instance (since it seems to me like that would
spawn an entirely new search - something I'm sure search engines
would like to avoid). In any case, it's something to ponder
and something to document in hopes of future work and/or attention.
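
For the archive, the "rootify before searching" idea quoted above could be
sketched roughly as follows. This is a toy in Python over transliterated
text, not what farahidi or Duali actually do: the prefix and suffix lists
and the names rootify() and search() are made up for illustration, and
real Arabic morphology needs far more than affix stripping.

```python
# Toy illustration only: the affix lists below are hypothetical and
# cover just enough to handle the "ksr" / "yaksirun" example.
VOWELS = set("aiu")
PREFIXES = ("ya", "ta", "na")   # assumed, illustrative prefixes
SUFFIXES = ("un", "in")         # assumed, illustrative suffixes


def rootify(word):
    """Reduce a transliterated verb form to a bare consonant skeleton."""
    for prefix in PREFIXES:
        # Only strip if enough letters remain to hold a root.
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[:-len(suffix)]
            break
    # Drop short vowels so only the consonants are compared.
    return "".join(ch for ch in word if ch not in VOWELS)


def search(text, query):
    """Return the words of text whose rootified form matches the query's."""
    target = rootify(query)
    return [word for word in text.split() if rootify(word) == target]


if __name__ == "__main__":
    text = "hum yaksirun albab"
    print("ksr" in text.split())   # plain word match misses it
    print(search(text, "ksr"))     # rootified match finds "yaksirun"
```

Note that the text is rootified word by word at search time here; a real
engine would more likely rootify once at indexing time, which also avoids
the "entirely new search" per tense mentioned above.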
> > In passing, all those files had DOS gunk in 'em so I converted all
> > via 'dos2unix'. Most of the files are also under the evil CP-1256
> > encoding while vrb_rule.c is a mystery to me (I have NO idea what
> > encoding it's using) - so I opted not to touch 'em or convert 'em
> > in hope of you taking care of that so as to get 'em all under
> > the proper UTF-8.
>
> What's worse, there are some files missing, and I just can't get a hold
> of the original author. I'm talking about the database files. I might
> have to rewrite them myself, once I know what they were for exactly.
Ouch - sounds like the "facilitator" that helped you import all of this
messed up in double-checking your work ;-) Do please get the missing
files uploaded and corrected (the encoding is still odd). We should
continue to strive to have functional (or semi-functional) code in CVS
with plenty of docs and explanations along with TODOs for newbies to
pick up where others might have left off.
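
On the encoding cleanup, a minimal sketch of the CP-1256 to UTF-8
conversion (with the DOS line endings dropped along the way) might look
like the Python below. It assumes the input really is CP-1256; the
function name is made up, and for a file of unknown encoding such as
vrb_rule.c the output should be eyeballed, since a wrong guess at the
source encoding just yields different garbage.

```python
def cp1256_to_utf8(raw: bytes) -> bytes:
    """Re-encode CP-1256 bytes as UTF-8, normalising CRLF to LF.

    Assumes the input is genuinely CP-1256; nothing here detects
    the source encoding.
    """
    text = raw.decode("cp1256")
    text = text.replace("\r\n", "\n")   # what dos2unix would do
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = "كسر\r\n".encode("cp1256")
    print(cp1256_to_utf8(sample).decode("utf-8"), end="")
```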
Thanks & Salam.
- Nadim