Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba
- To: "General Arabization Discussion" <general at arabeyes dot org>
- Subject: Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba
- From: "Thomas Milo" <t dot milo at chello dot nl>
- Date: Sat, 31 Dec 2005 12:14:47 +0100
heer wrote:
>> Meor,
>>
>> As far as searching is concerned it doesn't matter whether the ya'
>> (yeh) has dots or not. In searching one types in ya'. If the
>> letter is alif maqsurah one types in alif maqsurah. The problem is
>> that Arabic orthography has never been completely standardized and
>> it may not be possible to search for words on the basis of Qur'anic
>> orthography since many of the Qur'anic symbols are not in Unicode.
All Qur'anic codes are in Unicode. Meor's work illustrates that. Where he
deviated from Unicode, i.e., inserted illegal codes in the Arabic block, he
did so because of constraints in his font technology.
>> What one does with the commercial Qur'ans like the one produced by
>> Sakhr/Harf is to search for words on the basis of modern Arabic
>> spelling. Somehow the search program finds the correct words in the
>> Qur'anic text even though they may be spelled differently. Perhaps
>> one is actually searching a text of the Qur'an in modern spelling
>> which is somehow linked to the traditional spelling.
Ignoring dots is similar to ignoring harakaat. The problem with Unicode is
that harakaat are recognized graphemes, whereas dots aren't. I suspect
Sakhr internally breaks down dotted letters into separate skeletons
(archigraphemes) and dot patterns (distinctive features) after all. That
would account for your observation.
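
To make the idea concrete, here is a minimal sketch of such a skeleton
comparison. It is only an illustration and not a description of what Sakhr
actually does. The function names and the mapping table are hypothetical, the
table covers only the basic letters, and it ignores the fact that some letters
(noon, yeh, feh, qaf) share a skeleton with others only in certain positions.

```python
# Harakaat are separate code points (U+064B..U+0652), so removing them is easy.
HARAKAAT = {chr(cp) for cp in range(0x064B, 0x0653)}

# Dots are NOT separate code points, so each dotted letter has to be mapped
# to a dotless stand-in by hand. Simplified, assumed mapping for illustration.
SKELETON = {
    "\u0628": "\u066E",  # beh   -> dotless beh
    "\u062A": "\u066E",  # teh   -> dotless beh
    "\u062B": "\u066E",  # theh  -> dotless beh
    "\u062C": "\u062D",  # jeem  -> hah
    "\u062E": "\u062D",  # khah  -> hah
    "\u0630": "\u062F",  # thal  -> dal
    "\u0632": "\u0631",  # zain  -> reh
    "\u0634": "\u0633",  # sheen -> seen
    "\u0636": "\u0635",  # dad   -> sad
    "\u0638": "\u0637",  # zah   -> tah
    "\u063A": "\u0639",  # ghain -> ain
    "\u0641": "\u06A1",  # feh   -> dotless feh
    "\u0642": "\u066F",  # qaf   -> dotless qaf
    "\u0646": "\u06BA",  # noon  -> noon ghunna (dotless noon)
    "\u064A": "\u0649",  # yeh   -> alef maksura (dotless yeh)
}


def strip_harakaat(text):
    """Drop the vowel marks, keeping the letters unchanged."""
    return "".join(ch for ch in text if ch not in HARAKAAT)


def skeleton(text):
    """Reduce text to its dotless skeleton, with harakaat removed."""
    return "".join(SKELETON.get(ch, ch) for ch in strip_harakaat(text))


def same_skeleton(a, b):
    """True when two strings differ only in dots and/or harakaat."""
    return skeleton(a) == skeleton(b)
```

A search built this way would index skeleton(text) and compare it with
skeleton(query), so a query typed with yeh would also find the same word
written with alef maksura.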
t