
Re: Arabic spellchecker

On Fri, Nov 11, 2005 at 10:27:36PM +0200, Ahmad Khalifa wrote:
> Salam,
> I think its general consensus that a Myspell/Aspell AFFIX dictionary is
> the best way to go. We will ignore harakat for now. Ammar and Nadim seem
> to agree on this.
You can't ignore harakat. You can ignore them initially, but you can't really ignore
them forever, because you'll need them to build a grammar checker
and because a word may be spelled correctly yet be incorrect in its position.
This can't be detected without harakat.

I'm not sure whether the AFFIX dictionary can support harakat or not. I'm afraid that
we might depend on something and later find it unsuitable.
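
To make "ignore them initially" concrete, here is a minimal sketch in Python of
stripping harakat before a word-list lookup. The names strip_harakat and is_known
and the tiny word list are hypothetical, made up for illustration; they are not part
of Myspell, Aspell, or duali. The sketch assumes the marks to drop are the Unicode
code points U+064B through U+0652.

```python
# Assumption: "harakat" here means the Arabic marks U+064B..U+0652
# (tanween, fatha, damma, kasra, shadda, sukun).
HARAKAT = {chr(code) for code in range(0x064B, 0x0653)}


def strip_harakat(word: str) -> str:
    """Return the word with all harakat removed."""
    return "".join(ch for ch in word if ch not in HARAKAT)


def is_known(word: str, wordlist: set) -> bool:
    """Hypothetical lookup: compare the bare word against an unvocalized word list."""
    return strip_harakat(word) in wordlist


# Tiny made-up word list containing the unvocalized form of "kataba".
wordlist = {"\u0643\u062A\u0628"}

# The same word written with a fatha after each letter.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
print(is_known(vocalized, wordlist))
```

The limitation is visible here: every vocalization of the same letters collapses to
one entry, so a wrongly vocalized word passes the check.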

> So, as you can see, creating an arabic spellchecker is only a matter of
> populating 2 files and plugging them into OOo, or using Myspell/Aspell
> standalone.

That's another problem: creating the data files.

Let me say it: this won't happen.
You won't get the data files you need. Arabs won't create them. Arabs suck.

Maybe what you suggest is much better than the current situation, but let's start
with what we already have, and one day we can switch to your approach.

> NOTE, This is a completely different approach than duali, Im not sure
> duali's cvs is a good place for it.

At the moment I have my own spell checker implementation, baghdad.foolab.org.
It's still in the alpha stage and might not be that accurate. 0.0.1 can't do
corrections yet, but the CVS version can (although it's still at an early stage).

We have a base, which is the duali data files (and the original data files on
which the duali data files were built were not created by any of us).

Implement something. Let it be bad, but at least it's there. Start adapting the
applications; this might take time. Meanwhile, improve what you have.

At the end you should get something.

Ahmad, start with what you have. Don't waste time in endless discussions.

Let the discussions continue, but while you are implementing.

Kernighan said it: first make it run, then make it run better.

Best wishes,

-- Katoob Main Developer, Arabbix Maintainer.
GNU/Linux registered user #224950
Proud Egyptian GNU/Linux User Group <www.eglug.org> Admin.
Life powered by Debian, Homepage: www.foolab.org
Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
Read http://www.gnu.org/philosophy/no-word-attachments.html
Preferable attachments: .PDF, .HTML, .TXT
Thanx for adding this text to Your signature
