Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts

Hello Abdulhaq,

I appreciate your efforts to take the encoding of the Qur'an seriously. It seems that we have differing opinions, but we are not that far from each other. I will make a few more comments below in the hope of bridging the gaps a little more.

>one problem is that the rendering problem has been 'almost solved' for a long 
>time now.

You are right that the missing codepoints need to be addressed as soon as possible. I am in favor of a proposal to get them into Unicode 5.0, and for that we have to act soon. Tom has already been working on something in this regard.

>The other is that this solution, as I understand it, depends on OpenType 
>capability font rendering to work.

It doesn't depend on OpenType capability. OpenType is just one of the font technologies you can use to build fonts that render complex Arabic, and you could use any other technology you wish. The font technology does have to be advanced enough to do certain things, such as contextual substitutions, and OpenType is simply one of the technologies able to do these.

>And the main problem from my perspective is that the encoding of the extra 
>non-textual signs is entirely glyph-based rather than semantically based. 
>This means that it is not a long term solution and only wholly valid for a 
>certain region of the muslim world and a certain time-frame. I'm sure Thomas 
>agrees here BTW.

This non-textual sign is not glyph-based, since it does not represent a specific typeform. The non-textual codepoint that we are proposing would cause a fathatan to take the form of a sequential fathatan, a dammatan the form of a sequential dammatan, and a kasratan the form of a sequential kasratan when it is placed after a fathatan, dammatan, or kasratan respectively. So its context determines its visual effect.
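
To illustrate the idea of a codepoint whose context determines its visual effect, here is a minimal sketch in Python. It is not an actual font or rendering engine. The marker value U+E000 is only a Private Use Area stand-in (no codepoint has been assigned), and the function name select_glyphs and the glyph names such as "fathatan.sequential" are hypothetical.

```python
# Real Unicode tanween codepoints.
FATHATAN, DAMMATAN, KASRATAN = "\u064B", "\u064C", "\u064D"

# ASSUMPTION: placeholder for the proposed non-textual codepoint.
# U+E000 is a Private Use Area value used here only for illustration.
SEQ_MARK = "\uE000"

TANWEEN_NAMES = {
    FATHATAN: "fathatan",
    DAMMATAN: "dammatan",
    KASRATAN: "kasratan",
}


def select_glyphs(text):
    """Return hypothetical glyph names for text, one per visible glyph.

    A tanween immediately followed by SEQ_MARK is mapped to its
    sequential variant. The marker itself never produces a glyph.
    """
    glyphs = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in TANWEEN_NAMES:
            name = TANWEEN_NAMES[ch]
            if i + 1 < len(text) and text[i + 1] == SEQ_MARK:
                glyphs.append(name + ".sequential")
                i += 2  # consume the tanween and the marker together
                continue
            glyphs.append(name)
        elif ch != SEQ_MARK:
            # Any other character: name the glyph after its codepoint.
            glyphs.append("U+%04X" % ord(ch))
        # A stray marker with no preceding tanween has no visual effect.
        i += 1
    return glyphs
```

The same marker yields a different result after each of the three tanween signs and yields nothing on its own, which is the sense in which it does not stand for one particular typeform.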

>It seems to me that we agree that the unicode spec needs extending, but we 
>have different proposals for that. You want to, for example, glue two glyphs 
>together to represent a new codepoint. This is because you do not want to use 
>codepoints other than the Unicode ones as you say this will break 
>compatibility. However, of all the instances of font renderers out there, how 
>many will render your new glued-together 'codepoints' correctly? Even if they 
>support OpenType? (After all, how does the opentype engine know if you want 
>two sequential fathas or fatha bi-ikhfaa'?)

We are not proposing to glue two glyphs together. In a way you can compare this non-textual character to the ZWNJ (zero width non-joiner) character, in that both are non-textual, although they are still quite different. I think you are confusing this with Tom's other proposal, which is to declare canonical equivalence between a fatha+fatha sequence and fathatan, and so forth. That is a separate issue. I think we haven't made this clear, so I apologize. This non-textual character is a separate request, regardless of whether there is a request to declare canonical equivalence between a fatha+fatha sequence and fathatan, and so forth. Even if no canonical equivalence is declared between a fatha+fatha sequence and fathatan, this additional codepoint is needed.
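
For reference, the current state of canonical equivalence can be checked with Python's standard unicodedata module. This sketch only reports what the Unicode data bundled with the running Python says; the helper name canonically_equivalent is my own and not part of any proposal.

```python
import unicodedata

FATHA = "\u064E"     # ARABIC FATHA
FATHATAN = "\u064B"  # ARABIC FATHATAN
ZWNJ = "\u200C"      # ZERO WIDTH NON-JOINER


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A fatha+fatha sequence is not canonically equivalent to fathatan today.
fatha_pair_equals_fathatan = canonically_equivalent(FATHA + FATHA, FATHATAN)

# ZWNJ is a format character (general category "Cf"), not a letter or mark.
zwnj_category = unicodedata.category(ZWNJ)

print(fatha_pair_equals_fathatan, zwnj_category)
```

For comparison, a pair that is canonically equivalent, such as U+00E9 and the sequence U+0065 U+0301, makes the same helper return True.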

>I, on the other hand, want to set up a private user area that will allow us to 
>1) Start encoding the quran properly, in a semantically-based fashion.
>2) Support current font technology and devices using that (including truetype 
>and bitmapped fonts) by providing a secondary table of any pre-composed 
>glyphs required.
>The disadvantage to this, and you've mentioned this too, is lack of support. 
>And yet, with a good OpenType font, this can be beautifully supported even 
>now by most Windows applications (for example). Meor mentioned that your 
>sequential-fathatayn doesn't work with the MS renderer (if I remember 
>correctly). Thomas suggested that he file a bug with Microsoft. I presume he 
>was being tongue-in-cheek there! You have to pay MS money for them to 
>register a 'bug' - then they'll tell you that it's a feature (and I would 
>agree with them in this instance).

Again, whether the fatha+fatha sequence is canonically equivalent to the fathatan codepoint is a separate issue from the need to support sequential tanween. In fact, canonical equivalence between fatha+fatha and fathatan is not even needed to render the Qur'an correctly. It's just an additional measure to make the encoding nicer.

Kind regards,

Mete Kural
Touchtone Corporation