
Re: Tanween variants and Unicode

Hello Meor,

From: Meor Ridzuan Meor Yahaya <meor dot ridzuan at gmail dot com>
>You can view samples of pakistan style mushaf at
>http://www.quranpak.com/samples.htm . The company sells Quran
>publishing software, with the calligraphy style preserved. I've no
>idea how they implement it, but I doubt it'll be unicode compliant.
> In any case, I think you already know the difference. The superscript
>alef is used as mad symbol (the "a" sound, 2 harakat, without the
>fatha) in most other writing style, while in madinah mushaf it only
>have small alef, which represent a required alef in pronunciation but
>missing in writing. The alef might be mad , might not, but does not
>represent the "a" sound. The "a" sound always required a fatha.

I looked through the Pakistani Quran at the link. The samples only seem to contain a portion of Surat Al-Baqarah. After a quick browse I didn't encounter the usage of small alef as anything other than the representation of a required alef in pronunciation but missing in writing. Can you point me to a specific PDF and verse in the samples?

Looks like this company is doing what many others, such as Harf, are doing: using their own non-standard encoding scheme. It might be partially based on Unicode, but it's surely not Unicode, since Unicode does not yet support all the features necessary for Quran printing. They've done a good job, mashallah, although I would be more interested in seeing scans of a traditional Pakistani mushaf rather than recent computer-generated output. If we take any proposal to Unicode, it should be based on scans from a traditional mushaf.

>You can find standalone small alef at sura 2, aya 72. I'm not sure
>about other places.

You're talking about the small alef with hamza on top in faaddaara'tum. I don't see how this small alef is fundamentally different from other small alefs. When you say a standalone small alef, what do you mean? As far as I understand, all small alefs in the Madinah Mushaf could be considered "standalone", since they simply substitute for an alef that is pronounced but not written. This one happens to have a hamza over it, which makes it a little different, but it still has the same function: substituting for an alef that is pronounced but not written. I don't see a different function of small alef in this example.

I think Unicode made a mistake in calling this U+0670 character Arabic Letter "Superscript" Alef and then further confusing the encoder by adding a note that says "actually a vowel sign, despite the name". In English textbooks for Arabic this character is mostly referred to as "Dagger" Alef, and it is not really much of a superscript character. Superscript implies that a character is placed at a higher plane than other characters, like the exponent in x^2, but dagger alef can be placed high or low within the word depending on the surrounding characters. In this 2:72 example, because it is preceded by a "ra", it is placed low. When it is preceded by some other letters, it is positioned high. Its function doesn't change: it is still simply a symbol that represents an alef that is pronounced but not written.
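For anyone who wants to check how the Unicode Character Database itself classifies U+0670, here is a minimal sketch using Python's standard unicodedata module. The helper name describe is only my own illustration, not anything from the standard. It should show that, despite the "Letter" in its name, the character carries the properties of a non-spacing combining mark.

```python
import unicodedata

# U+0670, which Unicode names ARABIC LETTER SUPERSCRIPT ALEF ("dagger alef").
DAGGER_ALEF = "\u0670"


def describe(ch):
    """Return a few Unicode Character Database properties of one character.

    'describe' is a hypothetical helper for illustration only.
    """
    return {
        "codepoint": "U+%04X" % ord(ch),
        "name": unicodedata.name(ch),
        # General category: "Mn" means non-spacing mark, "Lo" means letter.
        "category": unicodedata.category(ch),
        # A non-zero combining class means the character combines with a base.
        "combining_class": unicodedata.combining(ch),
    }


info = describe(DAGGER_ALEF)
for key, value in info.items():
    print(key, "=", value)
```

The general category reported here is what a rendering engine sees; the "Superscript" in the name has no effect on where the mark is positioned.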

>You can find superscript waw at sura 17, aya 7. I think it only has
>one occurance in the Mushaf.

OK I'm trying to refresh my memory on this one here. I know we discussed this one and I kind of remember that we had concluded that this deserves its own codepoint but I can't remember why. Can you remind me what the functional difference of this small waw is compared to the other small waws in the Madinah Mushaf? (the small waws that usually come at the end of certain words)

>For both tanween ending with meem and sequential one,  we might have
>implementation problem because of the following:
>For a pure truetype font, it is almost impossible to implement (I
>think ), without opentype support (the GSUB table). I think the same
>applies to bitmap font.Suppose the rendering engine encounter the
>sequence (fathatan + sukun, or fathatan + superscript meem),  the
>engine will know that it needs to replace it with a new glyph.
>However, since the new glyph does not have a unicode code point, how
>is the rendering engine will find the glyph? In truetype, each glyph
>has it's own index number, unicode code is optional. However, there is
>no standard in ttf which glyph  should be assign to which index no.
>So, to my knowledge it is almost impossible to implement this feature
>using truetype, bitmap (bdf, windows .fon), and postscript font (not
>to sure about the last one, but i think it is the same), unless
>someone can tell me how this can be implemented , or some features
>that I'm not aware of about these font. This issue does not arise with
>opentype, since it has GSUB table. The font designer can easily tell
>the rendering engine to substitute the sequence with the glyph he
>wants (without the need ot unicode code point). So, please do consider
>this issue.

The Unicode Technical Committee will not accept the addition of a new codepoint just because certain legacy font technologies are not capable of rendering it without one. OpenType and other similar modern font technologies can easily handle this, as you write. Besides, someone who is trying to render the Madinah Mushaf would not use a primitive font technology anyway; they would use a more modern font technology such as OpenType. Otherwise the result would be really low quality.
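To make the substitution idea concrete, here is a toy model in Python of what a GSUB ligature lookup does conceptually. It is only a sketch, not a real font or shaping engine: the glyph names, the two tables, and the shape function are all hypothetical, and I am assuming U+06E2 (small high meem, isolated form) stands in for the "superscript meem". The point is that the replacement glyph is reached by glyph name (or index) inside the font and never needs a codepoint of its own.

```python
# Character-to-glyph map, playing the role of the font's cmap table.
# All glyph names here are made up for illustration.
CMAP = {
    0x0628: "beh",
    0x064B: "fathatan",
    0x0652: "sukun",
    0x06E2: "small_high_meem",  # assumed encoding of the "superscript meem"
}

# Ligature rules, playing the role of a GSUB ligature substitution lookup.
# The target glyphs have no entry in CMAP: they are unencoded glyphs.
LIGATURES = {
    ("fathatan", "sukun"): "fathatan_sequential",
    ("fathatan", "small_high_meem"): "fathatan_meem",
}


def shape(text):
    """Map characters to glyphs, then replace known glyph pairs."""
    glyphs = [CMAP.get(ord(ch), ".notdef") for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # two input glyphs become one
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


print(shape("\u0628\u064B\u0652"))  # beh + fathatan + sukun
print(shape("\u0628\u064B\u06E2"))  # beh + fathatan + small high meem
print(shape("\u0628\u064B"))        # plain fathatan is left untouched
```

A real OpenType font would express the same rules in its GSUB table and the shaping engine would apply them; the text itself stays as the plain sequence of existing codepoints.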

Just to re-iterate, UTC is very conservative in terms of the addition of new codepoints. If there is an existing codepoint that will take care of the problem then this would be the preferred proposal.

Looking forward to hearing from you again.


Mete Kural
Touchtone Corporation