Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba
- To: General Arabization Discussion <general at arabeyes dot org>
- Subject: Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba
- From: Gregg Reynolds <gar at arabink dot com>
- Date: Mon, 19 Dec 2005 22:38:56 -0600
- User-agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
Meor Ridzuan Meor Yahaya wrote:
Dear all,
...
2. Most references I could find say that alef maksura
occurs only in final form (for Arabic; other languages do have
medial and initial forms). I think the word ila (to) uses alef
maksura (correct me if I'm wrong). However, when the word becomes
ilaina (to us), it uses yeh, right? So, may I know which words use
alef maksura, or what the rule is?
Rule number one: disregard Unicode naming and go by character
properties. "Alef maksura" is incorrect, but Unicode refuses to correct
names; instead they added an annotation indicating (incorrectly) that
alef maksura "represents YEH-shaped letter with no dots in any
positional form". I.e. dotless yeh. If you look at the joining
properties (ArabicShaping.txt in the Unicode Character Database), 0649
is classed as dual-joining. So the *codepoint* represents
dotless yeh; but it does *not* represent alef maksura.
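
To check the name-versus-properties point for yourself, here is a minimal Python sketch. It assumes only the standard unicodedata module, which exposes names but not joining types, so the helper parse_shaping_line is a hypothetical function for reading one data line of ArabicShaping.txt (fields: code; short name; joining type; joining group).

```python
import unicodedata

DOTLESS_YEH = "\u0649"  # Unicode name: ARABIC LETTER ALEF MAKSURA
DOTTED_YEH = "\u064A"   # Unicode name: ARABIC LETTER YEH


def describe(ch):
    """Return the code point and formal Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def parse_shaping_line(line):
    """Split one ArabicShaping.txt data line into its fields.

    Hypothetical helper: assumes the 'code; name; type; group' layout
    and ignores anything after a '#' comment marker.
    """
    fields = [f.strip() for f in line.split("#")[0].split(";")]
    return {
        "code": fields[0],
        "joining_type": fields[2],   # 'D' means dual-joining, 'R' right-joining
        "joining_group": fields[3],
    }


print(describe(DOTLESS_YEH))
print(describe(DOTTED_YEH))
```

The name says "alef maksura", but the joining type you find in the data file is what a shaping engine actually uses.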
Alef maksura is not a letter form, it is a grammatical concept; it
refers to a short vowel /a/. Sometimes a dotless ya in final form is
called alef maksura (e.g. ila إلى, rama رمى); sometimes a standard alef
in final form is called alef maksura (da'a دعا for example).
Which means "alef maksura" is not a good candidate for a code point;
instead it should be "dotless yeh", but Unicode never changes names once
they are approved.
The general rule is that a suffix added to a final ya-shaped alef
maksura causes it to take the shape of a standard alef. Words like 'ala
على and ila إلى are exceptions to this rule. It's important to
understand that the rule reflects phonology, not the written forms. In
other words, rama رمى becomes رماه ramAhu because the vowel changes from
short to long.
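
As a rough illustration of that rule at the spelling level, here is a Python sketch. It is only a sketch under assumptions: the input is unvocalized text, the exception list holds just the two words named above, and the exceptions take a dotted yeh before the suffix (as in ilaina from the question). The function name attach_suffix is made up for this example.

```python
ALEF = "\u0627"
DOTLESS_YEH = "\u0649"
YEH = "\u064A"

# Exceptions named in the text: 'ala and ila (unvocalized spellings).
YEH_EXCEPTIONS = {
    "\u0639\u0644\u0649",  # 'ala
    "\u0625\u0644\u0649",  # ila
}


def attach_suffix(word, suffix):
    """Attach a suffix, respelling a final ya-shaped alef maksura.

    General rule: final dotless yeh becomes a standard alef.
    Assumed exception handling: 'ala and ila take a dotted yeh instead.
    """
    if not word.endswith(DOTLESS_YEH):
        return word + suffix
    stem = word[:-1]
    replacement = YEH if word in YEH_EXCEPTIONS else ALEF
    return stem + replacement + suffix


rama = "\u0631\u0645\u0649"
ila = "\u0625\u0644\u0649"
print(attach_suffix(rama, "\u0647"))       # rama + hu
print(attach_suffix(ila, "\u0646\u0627"))  # ila + na
```

Keep in mind this only mimics the written outcome; the real rule is phonological, so a word list like this can never be complete.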
3. There are instances in the Quran where the small alef has a chair.
Previously I encoded it as alef maksura + small alef. I think that is
correct for final alef maksura, but wrong when it occurs in initial or
medial forms. Examples: Sura 104 aya 5 (initial) and sura 102 aya 1
(medial). I chose alef maksura because it seems to be an alef.
However, from the above, it seems alef maksura occurs only in final form.
So it should be a different character. Which character should it
be? Someone suggested dotless ba. If that is true, can someone
explain the use (description) of dotless ba? The Unicode documentation
does not say much.
Dotless yeh (0649) is what you want. Since it is dual-joining, it covers
the initial and medial forms as well as the final one.
4. Hamza on yeh (U+0626). Should I use this character or not? Mr Yousif
(Quran Project leader) is against using it and suggests
yeh + hamza above/below instead. To stay compatible with
him, I use alef maksura + hamza above/below. However, that seems
wrong. I chose alef maksura over yeh for compatibility
with other fonts (which would most likely render the yeh with dots).
I think he is right; use dotless ya + hamza above/below. Unfortunately,
you can't be certain that all fonts will include all four forms for
dotless ya (a/k/a "alef maksura"), since that was a Unicode change;
through Unicode v 3.0 "alef maksura" was classified as right-joining.
Starting with update 1 to 3.0, it has been classified as dual-joining
(http://www.unicode.org/Public/). But any fonts that have been made
after 3.0 update 1 should work.
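
One side effect worth knowing before you pick an encoding: 0626 has a canonical decomposition to dotted yeh + hamza above, so normalization treats those two as the same text, but it does not do the same for dotless yeh + hamza. A small Python sketch with the standard unicodedata module shows this; the constant names are just labels for this example.

```python
import unicodedata

YEH_HAMZA = "\u0626"    # ARABIC LETTER YEH WITH HAMZA ABOVE
YEH = "\u064A"          # dotted yeh
DOTLESS_YEH = "\u0649"  # "alef maksura" code point
HAMZA_ABOVE = "\u0654"
HAMZA_BELOW = "\u0655"


def nfc(text):
    """Apply Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


# Canonical decomposition of 0626, as hex code points.
print(unicodedata.decomposition(YEH_HAMZA))  # 064A 0654

# Dotted yeh + hamza above composes back into 0626 ...
print(nfc(YEH + HAMZA_ABOVE) == YEH_HAMZA)   # True

# ... but the dotless-yeh sequences stay as two code points.
print(len(nfc(DOTLESS_YEH + HAMZA_ABOVE)))   # 2
print(len(nfc(DOTLESS_YEH + HAMZA_BELOW)))   # 2
```

So text encoded as dotless ya + hamza will not compare equal to text using 0626, even after normalization, unless you fold the two yourself.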
The other problem is that the Unicode document specifically mentions
hamza "above". However, the mushaf renders the hamza depending
on context, either above or below. In my font implementation, the
hamza is positioned depending on the mark, so there is no problem.
However, other fonts might not be "intelligent" enough. So, the question
again: should 0626 be used? I personally think it should be used. The problem of above
My own preference is to have above/below codepoints for hamza and
shadda; it will make searching a bit easier. But I don't think there's
anything wrong with using 0626.
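
If you end up with mixed encodings, searching can still be made to work by folding the variants to one key first. The following Python sketch is one possible approach, not an established convention: the function name fold_yeh_hamza is made up, it assumes the hamza mark immediately follows its base letter, and it deliberately throws away the above/below distinction for matching purposes.

```python
import re
import unicodedata

# Dotted or dotless yeh followed directly by hamza above or hamza below.
_YEH_HAMZA = re.compile("[\u0649\u064A][\u0654\u0655]")


def fold_yeh_hamza(text):
    """Return a search key in which all yeh+hamza spellings look alike.

    NFD first splits 0626 into 064A 0654; the regex then maps the
    dotless-yeh and hamza-below variants onto that same sequence.
    The result is decomposed text meant for comparison, not display.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return _YEH_HAMZA.sub("\u064A\u0654", decomposed)


variants = ["\u0626", "\u064A\u0654", "\u0649\u0654", "\u0649\u0655"]
keys = {fold_yeh_hamza(v) for v in variants}
print(len(keys))  # 1
```

Apply the same fold to both the query and the indexed text; a plain dotless ya without hamza is left untouched.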
-gregg