Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba



Gregg,
If what you said about dotless yeh is true, then I think there is no
need for 0649 to represent it. Even though most of us here agree that
Unicode has problems with Arabic, I think we should at least comply
with the description of the character itself. Let's forget about the
naming for a while. Even though 0649 is described as dual-joining,
other documents specify that the initial and medial forms of the
character are not used in Arabic, only in some other languages
(Uighur etc.). I had the same impression of 0649 as you did, but after
discussing it with someone outside Arabeyes, I think he is right:
0649 is not an option. However, I disagree with his suggestion to use
dotless ba for it, because I think it is not a ba.

Basically, you suggest encoding dotless yeh with hamza above/below as
0649 + hamza above/below, even though 0626 is OK. And for initial/medial
dotless yeh/ba with small alef, encode 0649 + small alef, am I
right? Does anyone else have another opinion or suggestion on this?
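
To make the difference between these encodings concrete, here is a
minimal Python sketch using only the standard unicodedata module. It
relies on the fact that 0626 canonically decomposes to 064A + 0654, so
NFC normalization turns yeh + hamza above into 0626, while 0649 + hamza
above/below has no precomposed form and stays a two-code-point
sequence. The constant and variable names are only for illustration.

```python
import unicodedata

ALEF_MAKSURA = "\u0649"  # ARABIC LETTER ALEF MAKSURA (the "dotless yeh" code point)
YEH = "\u064A"           # ARABIC LETTER YEH
YEH_HAMZA = "\u0626"     # ARABIC LETTER YEH WITH HAMZA ABOVE
HAMZA_ABOVE = "\u0654"   # ARABIC HAMZA ABOVE (combining mark)
HAMZA_BELOW = "\u0655"   # ARABIC HAMZA BELOW (combining mark)


def nfc(text):
    """Return the NFC (canonically composed) form of text."""
    return unicodedata.normalize("NFC", text)


# yeh + hamza above composes to the single code point 0626.
yeh_seq = nfc(YEH + HAMZA_ABOVE)

# alef maksura + hamza above/below has nothing to compose to.
maksura_above = nfc(ALEF_MAKSURA + HAMZA_ABOVE)
maksura_below = nfc(ALEF_MAKSURA + HAMZA_BELOW)

for label, seq in [("064A+0654", yeh_seq),
                   ("0649+0654", maksura_above),
                   ("0649+0655", maksura_below)]:
    print(label, "->", " ".join("%04X" % ord(ch) for ch in seq))
```

So text encoded as 0649 + 0654 will not match a search for 0626 unless
the search tool folds the two spellings itself.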

However, there is one more thing. You did not answer my first question
regarding final yeh (064A). In the Madinah mushaf, every final yeh
appears without dots. So the question is how to distinguish between
the two, 0649 and 064A, especially in the final form. Is there any
difference between them other than appearance?

I think Mr Yousif's idea was to use 064A throughout. The dots appear
depending on the context. In the final form, no dots. In the
initial/medial form, if it comes with hamza above/below or small
alef, then no dots; otherwise there should be dots. What do you think?
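
The rule above can be written down as a small decision function. This
is only a sketch of the rule as I understand it, not anyone's actual
font logic: the function name and the position labels are made up for
illustration, I assume "small alef" means 0670 (superscript alef), and
I assume the isolated form behaves like the final form, which the rule
does not actually say.

```python
HAMZA_ABOVE = "\u0654"  # ARABIC HAMZA ABOVE
HAMZA_BELOW = "\u0655"  # ARABIC HAMZA BELOW
SMALL_ALEF = "\u0670"   # ARABIC LETTER SUPERSCRIPT ALEF (assumed "small alef")

# Marks that suppress the dots in initial/medial position.
DOT_SUPPRESSING_MARKS = {HAMZA_ABOVE, HAMZA_BELOW, SMALL_ALEF}

POSITIONS = ("initial", "medial", "final", "isolated")


def yeh_shows_dots(position, marks=""):
    """Decide whether a 064A yeh is drawn with dots.

    position: one of POSITIONS (hypothetical labels for the joining form).
    marks:    the combining marks that directly follow the yeh.
    """
    if position not in POSITIONS:
        raise ValueError("unknown position: %r" % (position,))
    # Final form: never dotted. Isolated is treated the same (assumption).
    if position in ("final", "isolated"):
        return False
    # Initial/medial: dotted unless a hamza or small alef sits on the yeh.
    return not any(mark in DOT_SUPPRESSING_MARKS for mark in marks)


print(yeh_shows_dots("final"))                # False
print(yeh_shows_dots("medial", HAMZA_ABOVE))  # False
print(yeh_shows_dots("initial"))              # True
```

A font would do this with contextual substitution rather than code
like this, but the decision table is the same.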

Personally, I really think that we need a clear definition for at
least alef maksura. 0649 should either exist for a good reason or
be deleted/not used. Unicode can leave the code point there for
compatibility reasons, but recommend against its use.

On a side note, Unicode also has Farsi yeh. At first I thought it was
strictly for the Persian language, but their documentation does
mention Arabic, the language. The characteristic of Farsi yeh is that
in the initial/medial forms it has dots, and otherwise no dots, much
like what appears in the Madinah mushaf. However, I think it should be
kept for the Persian language only.
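
For reference, the three yeh-like letters discussed in this mail can
be listed with Python's standard unicodedata module. This is just a
quick check that they are separate code points that normalization does
not merge; the joining type is not exposed by this module, so it is
not checked here. The helper name is only for illustration.

```python
import unicodedata

# The three yeh-like letters under discussion.
CANDIDATES = ["\u0649", "\u064A", "\u06CC"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


descriptions = [describe(ch) for ch in CANDIDATES]
for line in descriptions:
    print(line)

# Even compatibility normalization (NFKC) keeps the three letters apart,
# so any folding between them has to be done by the application.
normalized = {unicodedata.normalize("NFKC", ch) for ch in CANDIDATES}
distinct_after_nfkc = len(normalized) == len(CANDIDATES)
print(distinct_after_nfkc)
```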

Regards.

On 12/20/05, Gregg Reynolds <gar at arabink dot com> wrote:
> Meor Ridzuan Meor Yahaya wrote:
> > Dear all,
> ...
> >
> > 2. Most references I could find say that alef maksura occurs only
> > in final form (in Arabic; other languages do have medial and
> > initial forms). I think the word ila (to) uses alef maksura
> > (correct me if I'm wrong). However, when the word becomes
> > ilaina (to us), it uses yeh, right? So, may I know which words use
> > alef maksura, or what the rule is?
>
> Rule number one:  disregard Unicode naming and go by character
> properties.  "Alef maksura" is incorrect, but Unicode refuses to correct
> names; instead they added an annotation indicating (incorrectly) that
> alef maksura "represents YEH-shaped letter with no dots in any
> positional form".  I.e. dotless yeh.  If you look at the bidi properties
> file, 0649 is classed as dual-joining.  So the *codepoint* represents
> dotless yeh; but it does *not* represent alef maksura.
>
> Alef maksura is not a letter form, it is a grammatical concept; it
> refers to a short vowel /a/.  Sometimes a dotless ya in final form is
> called alef maksura (e.g. ila إلى, rama رمى); sometimes a standard alef
> in final form is called alef maksura (da'a دعا for example).
>
> Which means "alef maksura" is not a good candidate for a code point;
> instead it should be "dotless yeh", but Unicode never changes names once
> they are approved.
>
> The general rule is that a suffix added to a final ya-shaped alef
> maksura causes it to take the shape of a standard alef.  Words like 'ala
> على and ila إلى are exceptions to this rule.  It's important to
> understand that the rule reflects phonology, not the written forms.  In
> other words, rama رمى becomes رماه ramAhu because the vowel changes from
> short to long.
>
> > 3. There are instances in the Quran where the small alef has a
> > chair. Last time I encoded it as alef maksura + small alef. I think
> > that is correct for final alef maksura, but it is wrong when it
> > occurs in initial or medial form. Examples: Sura 104 aya 5
> > (initial) and Sura 102 aya 1 (medial). I chose alef maksura because
> > it seems to be an alef. However, from the above, it seems alef
> > maksura occurs only in final form. So it should be a different
> > character. Which character should it be? Someone suggested it is
> > dotless ba. If that is true, can someone explain the use
> > (description) of dotless ba? The Unicode documentation does not say
> > much.
>
> dotless yeh 0649 is what you want.
>
> >
> > 4. Hamza on yeh (U+0626). Should I use this character or not? Mr
> > Yousif (Quran Project leader) is against using it and instead
> > suggests yeh + hamza above/below. Because I want to be compatible
> > with him, I use alef maksura + hamza above/below. However, it seems
> > that is wrong. I chose alef maksura over yeh for compatibility with
> > other fonts (which are likely to render the yeh with dots).
>
> I think he is right; use dotless ya + hamza above/below.  Unfortunately,
> you can't be certain that all fonts will include all four forms for
> dotless ya (a/k/a "alef maksura"), since that was a Unicode change;
> through Unicode v 3.0 "alef maksura" was classified as right-joining.
> Starting with update 1 to 3.0, it has been classified as dual-joining
> (http://www.unicode.org/Public/).  But any fonts that have been made
> after 3.0 update 1 should work.
>
> > The other problem is that the Unicode documentation specifically
> > mentions hamza "above". However, the mushaf renders the hamza
> > depending on context, either above or below. In my font
> > implementation, the hamza appears depending on the mark, so no
> > problem. However, other fonts might not be "intelligent" enough.
> > So, the question again: should 0626 be used? I personally think it
> > should be used. The problem of above
>
> My own preference is to have above/below codepoints for hamza and
> shadda; it will make searching a bit easier.  But I don't think there's
> anything wrong with using 0626.
>
>
> -gregg