
Re: Tanween variants and Unicode

Mete Kural wrote:

New Protocols that are needed:
------------------------------

- Tanween ending
in meem: fathatan+superscript meem will trigger the "tamweem" symbol,
and so forth for kasratan+superscript meem and dammatan+superscript
meem. No new character code is needed, just a protocol that explains
that the combination will trigger the corresponding glyph.

I must respectfully but vehemently object. You can't just merrily redefine the semantics of codepoints that are already well-defined. Fathatan means fathatan; any software that does not display it correctly is broken, by definition. Ditto for superscript meem. If the one follows the other, they must both be displayed.

Silent/sequential tanween: fathatan+sukuun code will trigger the
silent tanween/sequential tanween glyph, and so forth for
kasratan+sukuun and dammatan+sukuun. Sukuun is a good choice for a
codepoint here since the noon sound of the tanween is in a way
silenced. No new character code is needed, just a protocol that
explains that the combination will trigger the corresponding glyph.

Same objection. What if the author *wants* a sukuun over an -atan? By the way, what exactly is a "silent/sequential" tanween? All tanween variants have names in Arabic that translate quite well into English; why not use them? By my reading, there is no such thing as a "silent tanween"; there is an assimilated tanween, but assimilation and silence are not the same thing. "Sukuun" is definitely the wrong term.

See section 1.10 of http://www.arabink.com/patacode/encoding.pdf; see also the bottom of p. 31 / top of p. 32.

This sort of redefinition might pass for private experimenting, but as a proposal for standardization it's frankly a terrible idea. And even for private experimenting it isn't a good idea; the Private Use Area (PUA) is set aside specifically for stuff like this.
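
As a rough illustration of the PUA route, here is a minimal Python sketch using only the standard unicodedata module. The glyph names and the particular PUA code points (U+E000 and up) are invented for this example; nothing in the proposal or in Unicode assigns them. It only shows that private code points can be told apart from the standard ones, which keep their meaning.

```python
import unicodedata

# Hypothetical private assignments, made up for this sketch only.
PRIVATE_TANWEEN_GLYPHS = {
    "fathatan_with_meem": "\uE000",
    "kasratan_with_meem": "\uE001",
    "dammatan_with_meem": "\uE002",
}

FATHATAN = "\u064B"  # ARABIC FATHATAN, a standard code point


def is_private_use(ch: str) -> bool:
    """Return True if ch has general category Co (private use)."""
    return unicodedata.category(ch) == "Co"


for glyph_name, ch in PRIVATE_TANWEEN_GLYPHS.items():
    print(glyph_name, f"U+{ord(ch):04X}", is_private_use(ch))

# The standard code point is untouched by the experiment.
print(unicodedata.name(FATHATAN), is_private_use(FATHATAN))
```

Text that uses such private code points is only meaningful to software that shares the same private agreement, which is the point of keeping the experiment out of the standard code points.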

New canonical equivalences (this one is not absolutely needed for the
Madinah Mushaf):
----------------------

- Basic tanween canonical equivalence: fatha+fatha needs to be made
  canonically equivalent to fathatan, and so on for kasratan and
  dammatan.

Here's the problem with this: why stop there? You can use precisely the same argument to say that two consecutive vowels within a word should be interpreted as one vowel + vowel lengthener. E.g. kitAb spelled kitaab. Technically speaking, the alif in kitAb in fact denotes a lengthening of the preceding fatha, just as the second vowel in -atan denotes /n/. Now consider kitaabaa - should the final aa be an alif or a fathatan?

Plus, what does this do for searching and sorting? A search for e.g. fathatan won't find two consecutive fathas. So if you do this sort of thing you'll get surprised users. OTOH, nothing says an editor can't map two consecutive punches of the fatha key to the fathatan codepoint.
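
To make the searching point concrete, here is a small Python sketch using the standard unicodedata module. The sample word and the editor_keypress function are hypothetical, written only for this illustration. It checks that none of the Unicode normalization forms treats fatha+fatha as equivalent to fathatan, that a plain substring search for fathatan misses the doubled fatha, and that an input method can still do the mapping at typing time.

```python
import unicodedata

FATHA = "\u064E"     # ARABIC FATHA
FATHATAN = "\u064B"  # ARABIC FATHATAN

# Sample text: kaf, teh, alef, beh followed by the vowel marks.
BASE = "\u0643\u062A\u0627\u0628"
two_fathas = BASE + FATHA + FATHA
with_fathatan = BASE + FATHATAN


def editor_keypress(buffer: str, key: str) -> str:
    """Hypothetical input rule: a second fatha punch right after a
    fatha replaces it with the fathatan code point."""
    if key == FATHA and buffer.endswith(FATHA):
        return buffer[:-1] + FATHATAN
    return buffer + key


# No normalization form makes the two spellings compare equal.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    same = (unicodedata.normalize(form, two_fathas)
            == unicodedata.normalize(form, with_fathatan))
    print(form, "equal after normalization:", same)

# A search for fathatan does not find two consecutive fathas.
print("fathatan found in fatha+fatha text:", FATHATAN in two_fathas)

# Typing fatha twice through the hypothetical editor yields fathatan.
typed = BASE
for key in (FATHA, FATHA):
    typed = editor_keypress(typed, key)
print("editor produced fathatan spelling:", typed == with_fathatan)
```

Doing the mapping in the editor keeps the stored text in a single spelling, so searching and sorting need no special cases.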