
Re: Which type of mushaf ins Unicode encoding?



Gregg Reynolds wrote:
>> Thomas Milo wrote:
>>> graphemes. The Egyptian and Saudi editions express iqlaab with a
>>> ligature of vowel and small meem, your example shows a tanween
>>> ligature with small meem, but the underlying grapheme is identical:
>>> tanween+iqlaab.
>>
>> I would strongly urge you not to construe these as "ligatures".
>> "Ligature" is a term of art in modern computational typography.  I
>> don't believe a calligrapher writing a Quran would say a vowel
>> followed by a small meem is a single unit, let alone a ligature.  In
>> fact, the language itself indicates this: the operation of iqlaab
>> has nothing to do with the vowel; ditto for the operation of tanween
>> and ikhfaa.

1. Ligature is a term native to handwritten calligraphy, denoting the close
connection or touching of two or more letters.

2. I used it to describe the close connection or touching of two or more
vowel letters: the tanween ligature.

3. Though I was not referring to it, I would like to remind you that the use
of small meem to indicate iqlaab was introduced in a typeset Qur'an: the 1924
Cairo Codex. Those calligraphers that I know would not even contemplate
using it when writing a traditional Mushaf.

>>> The first thing to agree on is to encode iqlaab as a separate
>>> grapheme. What rests then is how to encode tanween. Unicode adopted
>>> the tanween ligatures as separate codes. My opinion is that the
>>> ligatures fathatan, dhammatan and kasratan are not graphemes, but
>>> ligatures consisting of exactly what their Arabic names indicate:
>>> two fathas, two dhammas and two kasras.
>>>
>> My understanding is that Unicode does not construe the -atan
>> codepoints as ligatures but as single things.  They were adopted
>> because that's the way all the legacy encodings did things

You are correct. By calling the -atan group ligatures I am in fact
criticizing the Unicode Standard for compromising its non-ligature policy. In
my analysis, the -atan units are

1. graphical ligatures and
2. consist of two graphemes each.

The historical truth is that they developed from two sequential or turned
copies of the single-letter constituents. What I would like to bring to the
attention of this forum is that it is not absolutely necessary to encode
analogously. Instead, there is an opportunity to encode the basic icraab
unit followed by a tajweed unit.
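
To make the contrast concrete, here is a rough Python sketch. The first part
only reports what the Unicode character database says about the existing
vowel and -atan code points. The second part is a hypothetical illustration
of "basic unit followed by a tajweed unit"; the helper name
vowel_plus_iqlaab and the choice of U+06E2 as the iqlaab mark are my
assumptions for the example, not an agreed encoding.

```python
import unicodedata

# Existing Unicode code points for the single vowels and the -atan group.
SINGLE = {"fatha": "\u064E", "damma": "\u064F", "kasra": "\u0650"}
TANWEEN = {"fathatan": "\u064B", "dammatan": "\u064C", "kasratan": "\u064D"}

# Assumption for illustration: small high meem stands for the iqlaab unit.
SMALL_HIGH_MEEM = "\u06E2"


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def is_atomic(ch: str) -> bool:
    """True if Unicode gives the character no decomposition mapping,
    i.e. it is encoded as one indivisible unit."""
    return unicodedata.decomposition(ch) == ""


def vowel_plus_iqlaab(vowel: str) -> str:
    """Hypothetical sequence: a basic vowel followed by a separate
    iqlaab mark, instead of one fused code point."""
    return SINGLE[vowel] + SMALL_HIGH_MEEM


if __name__ == "__main__":
    for name, ch in TANWEEN.items():
        print(name, describe(ch), "atomic:", is_atomic(ch))
    for ch in vowel_plus_iqlaab("fatha"):
        print(describe(ch))
```

The point of the sketch is only that the -atan code points carry no
decomposition into two vowels, whereas a sequence-based analysis keeps the
vowel and the tajweed mark as separate units.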

>> Also I'd be careful about using "grapheme"; it may be the best and
>> most accurate terminology, but that doesn't mean the Unicode crowd
>> accepts it; in fact I predict that if you say "Unicode encodes
>> graphemes" on the Unicode you'll get a lot of howling.  "Abstract
>> character" is the unicode way.  My own preference is "semantic unit"
>> or the like.  Don't look for a lot of logical precision and
>> consistency and simplicity in the language of unicode.  :(

I am not using this term vis-a-vis a Unicode "crowd" (if such a race exists
at all), but to explain to this forum how I build my analysis. Use of this
term in the Unicode Standard is careless and oblivious of its analogy with,
and possible origin in, Prague School phonology.

Regards,

t