
Re: Volunteers for verifying the quran data



>From: Gregg Reynolds <gar at arabink dot com>
>I guess what I'm suggesting is an intellectual exercise in encoding 
>design.  Do the cost/benefit analysis for any given codepoint; then 

>Thanks.  I believe the Text Encoding Initiative has a bunch of stuff 
>like that too.

Yes, TEI does too. Both OSIS and TEI are extensible, so you can extend the XML schema to support something you really want that is not already supported. In fact, the TEI and OSIS projects cooperate, and OSIS is very similar to TEI in many respects. OSIS was simply created with a focus on scripture encoding, whereas TEI is general purpose.

Basically, what I am suggesting is that you do this intellectual exercise in morphemic encoding design at the markup level, not at the character encoding level. That's where it belongs, and that is partly why initiatives such as TEI and OSIS exist. I suggest that you read up on TEI and OSIS and think about ways to extend them to support detailed text analysis of Arabic.

IMHO, script-level encoding (graphemic, abstract character, or whatever you want to call it) is within the scope of Unicode, so if there are codepoints you feel are needed within this scope, strive to get those added to Unicode. But keep the morphemic and other similar semantic distinctions at the markup level.
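
To make the split concrete, here is a rough sketch in Python of what I mean. It is only an illustration, not a worked-out schema: the element names <w> and <m> are borrowed from TEI's word and morpheme elements, while the helper encode_word and the type values "article" and "stem" are names I made up for this example. The characters stay ordinary Unicode codepoints, and the morphemic analysis lives entirely in the markup.

```python
import xml.etree.ElementTree as ET


def encode_word(morphemes, lemma):
    """Build a <w> element whose <m> children carry the morphemic analysis.

    `morphemes` is a list of (surface_text, morpheme_type) pairs.
    The type labels are hypothetical, not taken from any TEI/OSIS value list.
    """
    w = ET.Element("w", {"lemma": lemma})
    for surface, kind in morphemes:
        m = ET.SubElement(w, "m", {"type": kind})
        m.text = surface
    return w


# al-kitab ("the book"): definite article + noun stem, written with
# plain Unicode Arabic letters (ALEF LAM, then KAF TEH ALEF BEH).
ARTICLE = "\u0627\u0644"
STEM = "\u0643\u062a\u0627\u0628"

word = encode_word([(ARTICLE, "article"), (STEM, "stem")], lemma=STEM)

# Stripping the markup gives back the unannotated text unchanged.
plain_text = "".join(word.itertext())

# Serializing keeps the analysis alongside the same characters.
xml_text = ET.tostring(word, encoding="unicode")
print(xml_text)
```

The point of the design is that anyone who does not care about the morphology can throw the tags away and still have the plain text, and nobody needs new codepoints to record where the morpheme boundaries are.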

Kind Regards,
Mete

--
Mete Kural
Touchtone Corporation
714-755-2810
--