Gregg Reynolds wrote:
following formula, that I hope this community will endorse:
tanween = <vowel> <vowel> + [optional] <modifier>
<vowel> = fatha / dhamma / kasra
<modifier> = tamweem / sequentializer
For backward compatibility,
<vowel> <vowel> = fathatan / dhammatan / kasratan
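A minimal sketch of that backward-compatibility rule, assuming the three vowels are the existing Unicode marks U+064E FATHA, U+064F DAMMA and U+0650 KASRA, and that a pair means the same vowel typed twice in a row. The function name fold_paired_vowels is hypothetical; nothing in the proposal fixes an implementation.

```python
import re

# Assumption: <vowel> maps to the existing Unicode short-vowel marks.
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"

# Legacy precomposed tanween marks: FATHATAN, DAMMATAN, KASRATAN.
LEGACY_TANWEEN = {FATHA: "\u064B", DAMMA: "\u064C", KASRA: "\u064D"}

# One vowel mark followed immediately by the same mark.
_PAIRED_VOWEL = re.compile("([\u064E\u064F\u0650])\\1")


def fold_paired_vowels(text):
    """Replace each doubled vowel mark with its legacy tanween codepoint."""
    return _PAIRED_VOWEL.sub(lambda m: LEGACY_TANWEEN[m.group(1)], text)
```

Mixed pairs such as fatha followed by dhamma are left untouched, since the formula only names the three doubled forms.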
Hmm. In my opinion, it would be both more useful and more accurate historically to simply have a couple of TANWEEN codepoints. If I'm not mistaken, tanween was originally marked using a small nuun and later evolved into the doubled vowel mark.
Historically speaking, I do not agree. I have never seen a trace of a small nuun. The earliest markers were horizontally repeated coloured vowel dots (see: Yasin Dutton).
BTW, I designed a computer-aided, reversible transcription system (with fall-back transliteration), which you can download for evaluation from Basis Technology: http://www.basistech.com/arabic-editor/
Looks interesting. I'll have to take a look.
In that transcription your first sample reads as follows:
kitaabu-n (DMG: kitābu-n)
The qur'anic assimilation of the second one is not yet supported, but it will read like this:
khushubu- m:usannadätu-n (DMG: ḫušubu- m:usannadätu-n)
As you can see, initial compensatory shaddä is treated differently from morphological shaddä.
What's the objection? It would be just as transparent as your solution.
I have to think some more about the paired vowels idea.
Anyway, I like your approach. If it is to find any acceptance, there needs to be canonical equivalence with the legacy encoding according to this formula:
TANWEEN = <vowel><small noon> = conventional tanween
TAMWEEM = <vowel><small meem>
IDGHAM = <vowel><idgham code>
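To make the first line of that formula concrete, here is a rough sketch of the round trip between conventional tanween and <vowel><small noon>. It assumes that "small noon" would be U+06E8 ARABIC SMALL HIGH NOON, which the thread does not specify, and the function names are hypothetical. Unicode defines no such equivalence today, so this only illustrates what a normalizer would have to do. TAMWEEM and IDGHAM are left out because no legacy codepoint exists for them to map to.

```python
# Assumption: "small noon" is U+06E8 ARABIC SMALL HIGH NOON.
SMALL_NOON = "\u06E8"

# Conventional tanween -> plain vowel (FATHATAN, DAMMATAN, KASRATAN).
TANWEEN_TO_VOWEL = {"\u064B": "\u064E", "\u064C": "\u064F", "\u064D": "\u0650"}


def decompose_tanween(text):
    """Rewrite each conventional tanween as <vowel><small noon>."""
    return "".join(
        TANWEEN_TO_VOWEL[ch] + SMALL_NOON if ch in TANWEEN_TO_VOWEL else ch
        for ch in text
    )


def compose_tanween(text):
    """Rewrite each <vowel><small noon> pair as conventional tanween."""
    for tanween, vowel in TANWEEN_TO_VOWEL.items():
        text = text.replace(vowel + SMALL_NOON, tanween)
    return text
```

Two strings would then count as equivalent when compose_tanween gives the same result for both, which is the property the legacy data needs to keep.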