Re: more on Duali
- To: developer at arabeyes dot org
- Subject: Re: more on Duali
- From: Mohammed Elzubeir <elzubeir at arabeyes dot org>
- Date: Mon, 26 Aug 2002 10:34:12 -0500
- User-agent: Mutt/1.3.28i
On Mon, Aug 26, 2002 at 06:22:32AM -0400, Kareem Darwish wrote:
> AA,
> Did you get a chance to look at the AR <=> EN online dictionaries that are
> used in research? Here is a link to some useful tools and language
> resources.
> http://www.glue.umd.edu/~dlrg/clir/trec2002/resources.html
Yeah I've been there before. Nothing I could use for what I need specifically.
Though salmone.xml sounds interesting.. that might prove to be useful.
IE is taking its time opening it ;)
> The link has 3 dictionaries. One is the Salmone dictionary which uses
> roots as entries. The other two are dictionaries that were automatically
> generated by running the IBM Model 1 machine translation model against 1.2
> million parallel AR-EN sentences from the UN. They are freely distributable.
I'm having a look at those as well.
> Two more things:
> 1. I have a morphological analyzer (you probably already saw it) that tries
> to find the actual root.
Yes, I wrote my own implementation of that (it's more modular and easier to
maintain -- uses unicode objects and doesn't require all the switching
back and forth with cp1256 and whatnot).
It's available on the duali project page, for anyone who cares to
test/comment/etc: http://www.arabeyes.org/project.php?proj=duali
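To illustrate the point about unicode objects, here is a minimal sketch (not the actual Duali code) of decoding cp1256 input once at the boundary and working only with Unicode strings after that. The function names and the tiny prefix list are hypothetical, made up for this example.

```python
# Hypothetical sketch: decode cp1256 bytes once on input, then keep
# everything as Unicode strings so no further re-encoding is needed.

# Illustrative prefix list only (al-, wa-, bi-); not taken from Duali.
PREFIXES = ("\u0627\u0644", "\u0648", "\u0628")


def decode_input(raw: bytes, encoding: str = "cp1256") -> str:
    """Convert legacy-encoded bytes to a Unicode string."""
    return raw.decode(encoding)


def strip_prefix(word: str) -> str:
    """Drop the first matching prefix if at least 3 letters remain."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            return word[len(prefix):]
    return word


# "al-kitab" as it would arrive from a cp1256-encoded source.
raw = "\u0627\u0644\u0643\u062a\u0627\u0628".encode("cp1256")
word = decode_input(raw)
stem = strip_prefix(word)
```

All string operations here work on code points, so the analysis logic never has to know which legacy encoding the text came from.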
> 2. I have access to another analyzer that was developed by Tim Buckwalter.
> The analyzer is very accurate, but I am not at liberty to distribute it
> (copyright issues). However, if you send me text, I can run it through it
> for you.
Sure.. I would be interested to see what results it comes up with, compare
and contrast -- and hopefully make my implementation perform with close
accuracy ;) These things shouldn't have such strict distribution policies,
but oh well.. hopefully all this will change.
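For the comparison itself, something like the following rough sketch would do, assuming each analyzer's output can be reduced to a word -> root mapping. The function name and the sample entries (in an ad hoc Latin transliteration) are made up for illustration; they are not real output from either analyzer.

```python
# Hypothetical sketch: measure how often two analyzers agree on the root.

def agreement(mine: dict, reference: dict):
    """Return (agreement rate, mismatched words) over the shared words."""
    shared = sorted(set(mine) & set(reference))
    if not shared:
        return 0.0, []
    mismatches = [w for w in shared if mine[w] != reference[w]]
    rate = 1 - len(mismatches) / len(shared)
    return rate, mismatches


# Made-up sample data, for illustration only.
mine = {"ktAb": "ktb", "mdrsp": "drs", "yktbwn": "ktbw"}
reference = {"ktAb": "ktb", "mdrsp": "drs", "yktbwn": "ktb"}

rate, mismatches = agreement(mine, reference)
```

The list of mismatched words is the useful part, since those are the cases worth inspecting by hand.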
A word list (dict_wordlist) is bundled with the gendic-0.1.tar.bz2 package
I put up for download. Please do try it and pass it along ;)
Thanks
--
-------------------------------------------------------
| Mohammed Elzubeir | Visit us at: |
| | http://www.arabeyes.org/ |
| Arabeyes Project | Homepage: |
| Unix the 'right' way | http://fakkir.net/~elzubeir/|
-------------------------------------------------------