
Re: Quranic Proposal



Dear Abdulhaq,

> This would mean the acceptance of items 1 - 6, 9, 10, 11 and 12. If you don't
> want them in the unicode standard, then they will have to be dynamically
> inserted by the rendering algorithm, which as I am saying is very prone to
> error.

Dynamically inserting tajweed is perfectly possible in theory, but it is
not part of our present discussion.

To deal with tajweed as plain text, all the necessary characters _are_
already in the Unicode standard. Please note that the sample text that Metin
posted to illustrate that is already encoded in Unicode, complete with
typos :-), in the following way:

1-3 "Tangweeng" was encoded by repeating the single vowel character: 1.
fatha fatha 2. damma damma 3. kasra kasra.

4-6 "Tamweem" is encoded by adding 06E2 arabic small high meem isolated form
to the single vowel character.

The necessary miniature meem is available in Unicode as 06E2 arabic small
high meem isolated form. Beware that this name is a misnomer, since a single
06E2 arabic small meem isolated form would have sufficed. Together with the
corresponding 06ED arabic small low meem, it was specifically included for
dealing with this kind of tajweed. 06ED small low meem is a contextual
variant of 06E2 small high meem, triggered by kasra, and should therefore be
ignored.
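
The encodings for items 1 - 6 described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the helper
names tangweeng(), tamweem() and describe() are made up for this example, and
the code merely builds the code point sequences; it says nothing about how a
font or rendering engine will display them.

```python
import unicodedata

# Single vowel marks (harakat) as encoded in Unicode.
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
# U+06E2, the miniature meem discussed above.
SMALL_HIGH_MEEM = "\u06E2"


def tangweeng(vowel: str) -> str:
    """Items 1-3: repeat the single vowel character (hypothetical helper)."""
    return vowel * 2


def tamweem(vowel: str) -> str:
    """Items 4-6: single vowel followed by U+06E2 (hypothetical helper)."""
    return vowel + SMALL_HIGH_MEEM


def describe(sequence: str) -> list[str]:
    """List each code point with its official Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


if __name__ == "__main__":
    for vowel in (FATHA, DAMMA, KASRA):
        print(describe(tangweeng(vowel)))
        print(describe(tamweem(vowel)))
```

Printing the names this way is also a cheap form of the documentation trail
argued for below: it shows an implementor exactly which existing characters
carry the tajweed marks.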

The fact that nobody knows that Unicode included these small meems
specifically for rendering tajweed is the real weak point here. If a
standard includes "obscure" specialist characters, it should leave a trail
of documentation so that later implementors like you know how to use them.
Creating clear documentation and instructions should be part of our present
effort.

Regular tanween, the phonetically neutral nunation, is written with the
inverted dammatan ligature and vertically stacked fathatan/kasratan similar
to those used for MSA. A request from Unicode to change the - IMHO
erroneous - practice of using a single ligature to represent final vowel
reduplication would meet with massive resistance from the industry.

9 -- 12 are redundant. They follow the erroneous precedent of encoding
glyphs (which is discouraged by the Unicode Consortium):

U+FD3C Arabic Ligature Alef with Fathatan Final Form
U+FD3D Arabic Ligature Alef with Fathatan Isolated Form

Both these ligatures are described in the standard as contextual
Presentation Forms for rendering U+0627 Alif and U+064B Fathatan. The latter
two characters are the real Unicode characters to be used in plain text. The
proposed ligatures FD40-43 would also have to be defined in terms of plain
text equivalents. However, ligatures do not belong in the Unicode Standard:
their inclusion was a mistake and no more ligatures will be added in the
future.
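
The plain text equivalents of the two existing ligatures can be read directly
from the Unicode character database. A minimal sketch using Python's standard
unicodedata module; the function names plain_text_equivalent() and report()
are illustrative only.

```python
import unicodedata

# The two presentation-form ligatures named above.
LIGATURES = ("\uFD3C", "\uFD3D")


def plain_text_equivalent(ligature: str) -> str:
    """Apply compatibility decomposition (NFKD) to get the plain characters."""
    return unicodedata.normalize("NFKD", ligature)


def report(ligature: str) -> str:
    """Show the ligature's name and its raw decomposition mapping."""
    mapping = unicodedata.decomposition(ligature)
    return f"U+{ord(ligature):04X} {unicodedata.name(ligature)} => {mapping}"


if __name__ == "__main__":
    for lig in LIGATURES:
        print(report(lig))
        print([f"U+{ord(ch):04X}" for ch in plain_text_equivalent(lig)])
```

Both ligatures decompose to the same pair, U+0627 followed by U+064B, which is
the equivalence referred to here.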

It is remarkable that nobody ever notices the blatant mistake in the
prescribed equivalence for these ligatures as U+0627 Alif and U+064B
Fathatan. This is the wrong order for encoding tanween in Classical and
Qur'anic Arabic - and IMHO contemporary Arabic as well. It should have been
U+064B Fathatan followed by U+0627 Alif - a TRAILING ALIF. Including
typographic errors like this in an Industry Standard is the hallmark of
engineers doing script encoding instead of scholars.
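
To make the difference between the two orders concrete, here is a small
Python sketch. It assumes, purely for illustration, that a conversion step
simply swaps every Alif + Fathatan pair into Fathatan + Alif; the function
to_trailing_alif() is hypothetical and is not part of any standard or of the
proposal under discussion.

```python
import unicodedata

ALEF = "\u0627"       # U+0627 ARABIC LETTER ALEF
FATHATAN = "\u064B"   # U+064B ARABIC FATHATAN


def to_trailing_alif(text: str) -> str:
    """Rewrite Alif + Fathatan as Fathatan + Alif (hypothetical conversion)."""
    return text.replace(ALEF + FATHATAN, FATHATAN + ALEF)


if __name__ == "__main__":
    # "kitaaban" encoded in the order the ligature equivalence prescribes.
    word = "\u0643\u062A\u0627\u0628" + ALEF + FATHATAN
    converted = to_trailing_alif(word)
    print([f"U+{ord(ch):04X}" for ch in word])
    print([f"U+{ord(ch):04X}" for ch in converted])
    # Normalization does not merge the two orders, because Alif is a base
    # letter and combining marks are never reordered across a base letter.
    same = unicodedata.normalize("NFC", word) == unicodedata.normalize("NFC", converted)
    print("equal after NFC:", same)
```

Since normalization keeps the two sequences distinct, software that compares
or searches text has to treat the choice of order as significant.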

As a general remark I would like to point out that, due to the inherently
conservative character (sic!) of Industry Standards, there is no hope in the
world of getting a structurally clean solution for Qur'anic Arabic - or even
for Arabic at large. The reason is that none of the Arabic encoding patterns
or font designs were researched by and for scholars and calligraphers. They
were produced by employees of engineering companies with the short term
commercial objective of arabizing, as cheaply and as fast as possible,
whatever product they had that was originally made on the assumption that
Latin characters rule the world. It was from the junk yard of trashed legacy
code patterns that Unicode picked its Arabic code.

The only possibility to accomplish a robust solution for encoding the
Qur'an - or any Classical Arabic for that matter - in the Unicode format
would be to design a code set from scratch and apply for its inclusion in
the second plane as Historic or Diachronic Arabic. This is exactly what I am
working on, including the conversion schemes to upgrade the Arabic
industrial rubbish in Unicode or to interchange with it.

t