
Re: Volunteers for verifying the quran data

To: General Arabization Discussion <general at arabeyes dot org>
Subject: Re: Volunteers for verifying the quran data
From: Gregg Reynolds <gar at arabink dot com>
Date: Wed, 29 Jun 2005 10:05:08 -0500
Cc: "Bernard S. Greenberg at Basis" <bsg2004 at basistech dot com>, Tom Patterson <pattersont at summa dot com>, Zina Saadi <ZinaS at basistech dot com>
User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)

Thomas Milo wrote:

I agree with Mete. This concept of encoding root morphemes separately from
other Arabic letters, if ported to Indo-European languages (much more

...etc...

Hi,

I'm a bit busy at the moment so I can't respond in detail. For the moment all I ask is that you keep an open mind. When I recommend ignoring Unicode, I mean ignore it *first*, while you are designing an encoding to meet the needs of a linguistic community; *then* think about how your encoding can be accommodated by Unicode.

Regarding morphemic encoding: I assume you're talking about the notion of radical/non-radical pairs as codepoints. In my view, these need not be considered morphemes. (Turning it around, one could argue that all Arabic consonants are morphemic.) E.g., assume CAPS are radical characters and small letters are non-radicals. Then the K, T, and B in "maKTaB" are not morphemes; they're just letters with radical semantics. Not so different from encoding both uppercase and lowercase forms for Latin-based scripts. One could spell the same word "maktab"; both would look the same after rendering, but the former allows us to use ordinary software to do interesting things (e.g. find all words derived from KTB). No need for morphological analysis software.
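
To make the "ordinary software" point concrete, here is a minimal sketch using the CAPS-for-radicals transliteration from the paragraph above. It is only an illustration under that assumption: the function names and the sample word list are made up for this example, and a real encoding would use distinct codepoints rather than Latin letter case.

```python
# Hypothetical sketch: CAPS stand in for radical codepoints,
# small letters for non-radical ones (a transliteration convention only).

def root_of(word: str) -> str:
    """Return the radical letters of a word, in order."""
    return "".join(ch for ch in word if ch.isupper())

def render(word: str) -> str:
    """Radical and non-radical forms look the same once rendered."""
    return word.lower()

def derived_from(root: str, words: list[str]) -> list[str]:
    """Find every word whose radicals spell the given root."""
    return [w for w in words if root_of(w) == root]

# Illustrative word list; the last entry is spelled without radical marking.
words = ["maKTaB", "KaTaBa", "KiTaaB", "maDRaSa", "DaRaSa", "maktab"]

print(derived_from("KTB", words))
print(render("maKTaB") == render("maktab"))
```

The search is a plain character filter plus a string comparison. The unmarked spelling "maktab" renders identically but is not found, since it carries no radical information.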

Now, IMO a difficult design question is whether some true morphemes should in fact be encoded. Obvious examples: definite article, other particles like laa, sawfa, sa-, direct object suffixes -hu, -ha, etc. Unicode will never countenance something like that, but that doesn't mean we shouldn't. Such design decisions should be made strictly on a costs/benefits basis, IMO.


Even then, Arabic script does not fully cover the Arabic language from a
linguistic perspective. A (or maybe /the/) striking example is the inserted
vowel between the /n/ of tanween and any initial cluster of consonants,
e.g., /muHammadu-ni r-rasuulu/: it has no orthographic expression (I found
it described as kasra, bound to a small nuun in an Ottoman handbook, but I
never attested it in a manuscript).


(I think you mean /muHammadu-nu r-rasuulu/ ;)

I don't understand your argument here. The "helper vowel" can be inscribed using one of the ordinary vowel marks. (I'm pretty sure the grammarians address this explicitly.) Scribes may choose not to do this, but they can if they want to. This occurs in many cases, e.g. after the question particle hal.

-g
