Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts

To: General Arabization Discussion <general at arabeyes dot org>
Subject: Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts
From: Gregg Reynolds <gar at arabink dot com>
Date: Wed, 22 Jun 2005 07:42:39 -0500
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

Thomas Milo wrote:

Gregg Reynolds wrote:

following formula, that I hope this community will endorse:

tanween = <vowel> <vowel> + [optional] <modifier>

<vowel>=  fatha / dhamma / kasra
<modifier>= tamweem / sequentializer


For backward compatibility,

<vowel> <vowel> = fathatan / dhammatan / kasratan
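
As a rough illustration of the backward-compatibility rule above, the following sketch folds a doubled vowel mark into the corresponding legacy tanween codepoint. The codepoints are the standard ones from the Unicode Arabic block; the function name `fold_doubled_vowels` is hypothetical, and the optional <modifier> is left out because it has no agreed encoding.

```python
# Standard Unicode Arabic vowel marks.
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
# Legacy precomposed tanween marks.
FATHATAN, DAMMATAN, KASRATAN = "\u064B", "\u064C", "\u064D"

# <vowel> <vowel> = fathatan / dhammatan / kasratan
LEGACY_TANWEEN = {FATHA: FATHATAN, DAMMA: DAMMATAN, KASRA: KASRATAN}


def fold_doubled_vowels(text: str) -> str:
    """Replace each pair of identical adjacent vowel marks with legacy tanween.

    Hypothetical helper: it only covers the plain <vowel> <vowel> case.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in LEGACY_TANWEEN and i + 1 < len(text) and text[i + 1] == ch:
            out.append(LEGACY_TANWEEN[ch])
            i += 2  # consume both vowel marks
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# kaf, teh, alef, beh followed by damma damma
word = "\u0643\u062A\u0627\u0628" + DAMMA + DAMMA
assert fold_doubled_vowels(word) == "\u0643\u062A\u0627\u0628" + DAMMATAN
```

A single vowel mark, or two different vowel marks in a row, pass through unchanged.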


Hmm.  In my opinion, it would be both more useful and more accurate
historically to simply have a couple of TANWEEN codepoints.  If I'm
not mistaken, tanween was originally marked using a small nuun and
later evolved into the doubled vowel mark.

Historically speaking, I do not agree. I have never seen a trace of a small
nuun. The earliest markers were horizontally repeated coloured vowel dots
(see: Yasin Dutton).

That's correct; I shouldn't have said "originally". Use of the small nuun came after the colored dots. I'm trying to remember where I read this. I think it might have been "al-Nahw al-Wafiy". I did once ask my professor (an Egyptian literary scholar) and he confirmed. I'll try to find the reference this weekend.


BTW, I designed a computer-aided, reversible transcription system (with
fall-back transliteration) which you can download for evaluation from Basis
Technology: http://www.basistech.com/arabic-editor/


Looks interesting.  I'll have to take a look.

In that transcription your first sample reads as follows:

kitaabu-n (DMG: kitābu-n)

The Qur'anic assimilation of the second one is not yet supported, but it will
read like this:
khushubu- m:usannadätu-n (DMG: ḫušubu- m:usannadätu-n)
As you can see, initial compensatory shaddä is treated differently from
morphological shaddä.

Yes; this is an example where a very useful codepoint is unlikely to be endorsed by Unicode. We could use two shaddas, one phonotactic and one lexical. I think there might even be a third case but I can't think of it at the moment.

What's the objection? It would be just as transparent as your solution.


I have to think some more about the paired vowels idea.

Anyway, I like your approach. If it is to find any acceptance, there needs
to be canonical equivalence with legacy encoding according to this formula:

TANWEEN = <vowel><small noon>
        = conventional tanween
TAMWEEM = <vowel><small meem>
IDGHAM  = <vowel><idgham code>

But I wouldn't call it <small noon>; we want to retain the semantics of tanween explicitly in the encoding element so that software doesn't have to infer tanween based on two codepoints. This is the kind of thing I mean when I say intelligence should be migrated from software to the encoding as much as possible.
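
To make the difference concrete, here is a minimal sketch of the two situations. It assumes an explicit tanween mark that follows the vowel in place of the small noon; `TANWEEN_MARK` is a hypothetical codepoint parked in the Private Use Area purely for illustration, and both function names are made up. The vowel and small high noon codepoints are the existing Unicode ones.

```python
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
VOWELS = {FATHA, DAMMA, KASRA}
SMALL_HIGH_NOON = "\u06E8"  # ARABIC SMALL HIGH NOON

# Hypothetical explicit tanween codepoint (Private Use Area, not assigned
# by Unicode); assumed to follow the vowel instead of a small noon.
TANWEEN_MARK = "\uE000"


def has_tanween_inferred(text: str) -> bool:
    """Software must look at two codepoints: a vowel followed by small noon."""
    return any(a in VOWELS and b == SMALL_HIGH_NOON
               for a, b in zip(text, text[1:]))


def has_tanween_explicit(text: str) -> bool:
    """With a dedicated codepoint the semantics are carried by the encoding."""
    return TANWEEN_MARK in text


BEH = "\u0628"
assert has_tanween_inferred(BEH + DAMMA + SMALL_HIGH_NOON)
assert not has_tanween_inferred(BEH + SMALL_HIGH_NOON)  # no vowel before it
assert has_tanween_explicit(BEH + DAMMA + TANWEEN_MARK)
```

In the inferred case every consumer of the text has to repeat the pairing logic; in the explicit case a single codepoint lookup is enough.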

-g
