
Question about searching and sorting in Arabic



Hello

I'm currently working on an open source program for public libraries (PMB, at www.sigb.net), trying to make it use UTF-8 so that books can be entered in many alphabets.
My main customer is a French institute working on the Near East, so it holds mainly Arabic books.
Once the books are entered in Unicode, I need to search and sort the books and authors correctly.
My question is about "empty words", which in Arabic are mainly "empty prefixes" or "empty suffixes".


For example, if a word or an author's name begins with ??, I would like to sort it by what follows the prefix, rather than finding all such entries under ?.
The same goes for searching: I want to find the documents containing the word ??? even if I typed ????? (as Google does in English, where it simply ignores "the", "a"...).
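
To make the goal concrete, here is a minimal sketch of the behaviour I am after, written in Python only for brevity (the real software is PHP/MySQL). Everything in it is an assumption for illustration: the function names, the minimum stem length, and the prefix list, which contains only the definite article as a stand-in. The real list is exactly what I am asking about.

```python
# Hypothetical list of "empty" prefixes. Only the definite article
# (alef + lam) is assumed here; the real list is the open question.
EMPTY_PREFIXES = ("\u0627\u0644",)

# Assumption: keep the prefix when fewer than this many letters would remain,
# so that very short words are not reduced to almost nothing.
MIN_STEM_LENGTH = 2


def strip_empty_prefix(word):
    """Return the word without a leading empty prefix, if it has one."""
    for prefix in EMPTY_PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            return word[len(prefix):]
    return word


def sort_key(title):
    """Sort key that ignores an empty prefix on the first word of a title."""
    words = title.split()
    if not words:
        return title
    words[0] = strip_empty_prefix(words[0])
    return " ".join(words)


def matches(query, text):
    """True if the query word occurs in the text, ignoring empty prefixes."""
    wanted = strip_empty_prefix(query)
    return any(strip_empty_prefix(word) == wanted for word in text.split())


if __name__ == "__main__":
    titles = [
        "\u0627\u0644\u0643\u062a\u0627\u0628",  # article + "kitab"
        "\u0628\u0627\u0628",                    # "bab", no article
        "\u0627\u0644\u0642\u0645\u0631",        # article + "qamar"
    ]
    # Titles are ordered by the letters after the article,
    # instead of being grouped under alef.
    for title in sorted(titles, key=sort_key):
        print(title)
```

A purely textual rule like this cannot tell whether the leading letters are a real prefix or part of the word itself, which is why I ask the second question below.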


I have French teachers of Arabic who can help me, but I would like to have your advice, in the hope that this research can be used for other tools. The software is written in PHP/MySQL.

Can you tell me two things:

- The empty suffixes or prefixes I can find in Arabic

- Whether those suffixes or prefixes can appear in a word without being empty (I'm thinking of the problem of ?, which can be "or" or a letter of the word).

Thanks for any information; I'll try to summarize everything I get on the subject.

Regards
Armelle Nedelec