
Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts



Mete Kural wrote:
Hello Gregg,

---------- Original Message ----------------------------------
From: Gregg Reynolds <gar at arabink dot com>

d. You don't need higher-level grammars like XML. My own opinion is that the primary goal of an encoding design should be to migrate intelligence out of the application and into the text, subject to the syntactic constraints of a plain text encoding. So long as you can give a clear and concise definition of a particular semantic category, it is a good candidate for encoding as plain text.


Well, I think Unicode is useful as it is. Unicode encodes the Arabic script rather than the Arabic language, so IMHO the kinds of things we are asking for here fall into a higher-level grammar. The problem that we are currently trying to address with regard to the Quran has already been addressed, or is being addressed, for the Bible by OSIS (http://www.bibletechnologies.net/). In the OSIS XML specification, morphemes and other units of a word can be encoded using specific XML elements. Unicode remains the encoding model at the script level, and XML is overlaid on it to encode the higher-level semantics of the text. So basically we're talking about XML over Unicode. I think that is the way to go. Inventing our own character encoding model outside of Unicode is not going to receive support except from a handful of people. I believe in utilizing already available standards, especially when they "can" address the needs of the community. With XML over Unicode, what we are discussing here can be done, and OSIS has already done it for the Bible using XML over Unicode. Check out their spec:
http://www.bibletechnologies.net/OSISUserManual21draft.dsp

Hi,

It's certainly possible to create an XML grammar for the Quran, but surely e.g. an "ikhfaa" codepoint makes more sense than something like <ikhfaa/>. And Unicode is undoubtedly useful; it just isn't useful enough.
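To make the comparison concrete, here is a rough Python sketch of the two approaches side by side. Everything in it is an assumption for illustration: U+E000 is just a Private Use Area codepoint standing in for an "ikhfaa" mark (no such character is assigned in Unicode), the <w> and <ikhfaa/> element names are made up rather than taken from OSIS or any other schema, and the sample text is a Latin placeholder rather than real Quranic text.

```python
import xml.etree.ElementTree as ET

# Hypothetical: a Private Use Area codepoint standing in for an "ikhfaa" mark.
# Unicode assigns no such character; U+E000 is chosen only for illustration.
IKHFAA = "\uE000"

# Placeholder Latin text, annotated both ways.
plain = "min" + IKHFAA + " qabl"        # mark carried in the plain text
markup = "<w>min<ikhfaa/> qabl</w>"     # mark carried as an (invented) XML element


def positions_plain(text):
    """Offsets of each mark, measured in the text with the marks removed."""
    offsets = []
    visible = 0
    for ch in text:
        if ch == IKHFAA:
            offsets.append(visible)
        else:
            visible += 1
    return offsets


def positions_markup(xml_text):
    """Same offsets, recovered by walking the direct children of the root."""
    root = ET.fromstring(xml_text)
    offsets = []
    visible = len(root.text or "")
    for child in root:
        if child.tag == "ikhfaa":
            offsets.append(visible)
        # Count any text inside the child, then the text that follows it.
        visible += len("".join(child.itertext())) + len(child.tail or "")
    return offsets


def base_text_plain(text):
    """The text with the marks stripped out."""
    return text.replace(IKHFAA, "")


def base_text_markup(xml_text):
    """The text with the markup stripped out."""
    return "".join(ET.fromstring(xml_text).itertext())
```

Both forms carry the same information here. The difference is where the work lives: the plain-text form needs nothing beyond string handling, while the markup form needs an XML parser and an agreed schema.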

-g