
Re: Fwd: Arabic Spellchecker for OpenOffice



I don't mind if anybody takes the list or forks it. I just thought it would be easier to have one person update it. Our language is beautiful and expansive, and I do not mind any additions to it. I only meant to make the task easier.
Jabs

 
On 3/24/06, Mohammed Sameer <msameer at foolab dot org> wrote:
>    that I collected over the years, but I have recently had the time to write a
>    Perl script that breaks a text into individual words and then dedupes them
>    to make sure every word is unique. Since then I ran this program on millions
>    of words and it works 99% perfect. There is some rubbish but it is easy to
>    spot and clean manually. I will send you all a preliminary list this next
>    Monday. If it is ok I want to be the filter for updating such a list if that
>    is ok. I believe it will be easier this way. Let me know what your thoughts
>    are.

That's not a big problem. I wrote a small C program that outputs only the valid Arabic
text from any file. Using cut, tr, sort, and uniq, one can then easily obtain the required format.
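
For readers who want to try the idea, here is a minimal sketch in Python of the
extract, split, and dedupe steps described above. It is not the C program or the
Perl script mentioned in this thread. The function names are hypothetical, and
"valid Arabic text" is assumed to mean runs of basic Arabic letters
(U+0621-U+063A, U+0641-U+064A) with optional diacritics (U+064B-U+0652); the
tatweel (U+0640) is dropped before matching. A real word list may need a wider
or narrower definition.

```python
import re

# Assumption: a "word" is a run of basic Arabic letters and diacritics.
ARABIC_WORD = re.compile(r"[\u0621-\u063A\u0641-\u064A\u064B-\u0652]+")
TATWEEL = "\u0640"  # elongation character, removed so it does not split words


def extract_arabic_words(text):
    """Return the sorted list of unique Arabic words found in text."""
    cleaned = text.replace(TATWEEL, "")
    return sorted(set(ARABIC_WORD.findall(cleaned)))


def words_from_files(paths):
    """Collect unique Arabic words from several UTF-8 text files."""
    words = set()
    for path in paths:
        # Undecodable bytes are ignored rather than aborting the whole run.
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for line in handle:
                words.update(extract_arabic_words(line))
    return sorted(words)
```

Calling words_from_files() on a list of file names and printing one word per
line gives a plain list, one unique word per line, roughly what the
cut/tr/sort/uniq pipeline would produce.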

Whether you act as the filter for updates is not something I can decide. You can either
be part of Arabeyes, where you'll be the only one to update the list, or do it on your own,
where you will also be the only one to update it.

But be aware that anyone can simply take the word list and fork it; you can't control
something like that. Honestly, I also don't like the idea that you'll be the only
one who updates it.

Best regards,


>
>
>    Jabra
>
>
>    On 3/24/06, Mohammed Sameer <[1]msameer@foolab.org> wrote:
>
>      Hi,
>      For the first time in a long while, I'll pop in.
>      Please, guys: the plain word list approach will work. I was objecting to
>      the affix approach not because it's bad, but because it would not be
>      completed and would thus delay and stall the whole process (and it
>      happened as I said).
>      Let's go with the word list approach now; the affix work can be done
>      later.
>      Check this: [2]ftp://foolab.org/pub/software/aspell/aspell-quran.tgz
>      That's a plain word list generated from the words of the holy Quran,
>      without the affix approach, and it works fine.
>      Please, Jabs, if you have such a list and you know that the spelling of
>      its words is correct, please release it. I'm willing to maintain it and
>      add missing words for as long as I'm able to.
>      Flame me for the above words, do anything, but please release it if you
>      can.
>      Regarding the copyright: you can't copyright a word list; you can only
>      copyright the representation of the words in the list. Alaa, please
>      correct me if I'm wrong.
>      PS. You might like to read these to understand the different stages I
>      went through:
>      [3]http://www.foolab.org/node/1439
>      [4]http://www.foolab.org/node/1482
>      Best regards,
>      --
>      GNU/Linux registered user #224950
>      Proud Egyptian GNU/Linux User Group <[5]www.eglug.org> Admin.
>      Life powered by Debian, Homepage: [6]www.foolab.org
>      --
>      Don't send me any attachment in Micro$oft (.DOC, .PPT) format please
>      Read [7]http://www.gnu.org/philosophy/no-word-attachments.html
>      Preferable attachments: .PDF, .HTML, .TXT
>      Thanx for adding this text to Your signature
>
>    --
>    Fortuna ventus mens ut est paratus.
>    ****************************************************************************
>    ********
>    Language Hacker. Creator of Localized or translated culturally aware
>    services, content, technology, training and experiences targeted to
>    individuals and professional services firms with projects related to the
>    Middle East region.
>
> References
>
>    1. mailto:msameer at foolab dot org
>    2. ftp://foolab.org/pub/software/aspell/aspell-quran.tgz
>    3. http://www.foolab.org/node/1439
>    4. http://www.foolab.org/node/1482
>    5. http://www.eglug.org/
>    6. http://www.foolab.org/
>    7. http://www.gnu.org/philosophy/no-word-attachments.html







