
Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba



On Saturday 31 December 2005 18:57, Thomas Milo wrote:
>
> I am not sure what you are describing here. If you mean the explicit /y/ or
> /ii/ function as  expressed by two dots, then I understand. For that sort
> of problems IMHO the best theoretical solution would be to consider the dot
> patterns (not just for yeh, but _all_ five of them) as separate graphemes
> entitled to their own code point in Unicode. If that principle were
> extended to all Arabic-derived characters, then that would also simplify
> font design for Unicode dramatically. A couple of years ago I described
> this in a well-established scientific journal:
>
> www.decotype.com/publications/Manuscripta_Orientalia.pdf
>

If you mean encoding the dots separately while _still_ keeping one codepoint
for every Arabic letter, then I see that as one way to handle it, although I
would say it is overkill and can cause confusion.
For example, a user might type a Qaf and One Dot and think that he typed a
Feh, and use it as such. Or he might type a Heh and Two Dots and think that he
typed a Teh, which is not the case.

But if you mean encoding a Theh for example as Plate + Three Dots and encoding
a Beh using the same codepoint Plate + One Dot, then I disagree completely.


> >> The word /stay'asuw/ in Q12:80 is rather a spanner in the works: its
> >> existence implies that there can be no rule that the sequence Yeh,
> >> Hamza can be trusted to be Yeh+Hamza_above/below.
> >
> > The well established Qur'an sciences can be employed to know if the
> > hamza is above/below or standalone.  Namely, the Rasm science: it
> > clearly disambiguates this type of situation and identifies the
> > variations that can exist in other types of Masahef ("Maghribi
> > Mushaf", etc.).
>
> If it's not a simple straightforward rule, it cannot be expected to be
> built into a font. So our earlier idea of assuming that the string (any)
> YEH followed by hamza could be substituted by a single - ligature! -
> YEH+HAMZA (above or below according to Qur'anic rules and locale) turns out
> to be false.
>

Now I understand what you mean. You want to be able to use one Hamza codepoint
for both the standalone Hamza and the HamzaAbove/Below mark.

Well, Hamza is not the only Arabic letter that behaves this way; other letters
do too.

To give an example, the Hamza situation here is exactly like the Seen
situation.

Just to be clear:
 Hamza U+0621                HamzaAbove U+0654
 Seen    U+0633                SeenAbove    U+06DC

Hamza can come standalone and Seen can come standalone. In that case the
effect is to add one more letter to the word (Hamza or Seen), and the Hamza or
Seen becomes part of the spelling of the word.
The word باء (Beh,Alef,Hamza), for example, is three letters long. The
Hamza is counted because it is a separate letter here that doesn't affect
the preceding letter, Alef.
The same goes for the word شيء (Sheen,Yeh,Hamza): three letters, where the
Hamza is a separate letter from the Yeh and has no effect on it. The word
is spelled using the three letters' names (Sheen,Yeh,Hamza).

Also, the word مسيطرون (Meem,Seen,Yeh,Tah,Reh,Waw,Noon) is seven letters.
The Seen here is counted because it's a separate letter. It doesn't affect any
other letter in the word.

But Hamza and Seen can also come "attached" to other letters, acting as a mark.
In that case the effect isn't to add one more letter to the word but only to
change how the reader should think about the letter they are attached to.
In this case (let me call them HamzaAbove and SeenAbove), their meaning can be
thought of as an alert to the reader: "Beware! The letter that carries the
HamzaAbove/SeenAbove isn't really that letter; it was a Hamza/Seen that has
been replaced, and you should pronounce it with the Hamza/Seen sound, not
the sound of the underlying letter".

As an example, the word بأي (Beh,Alef,HamzaAbove,Yeh) is three letters long,
not four. That is because the HamzaAbove here is a mark above the Alef that
alerts the reader that this Alef was actually a Hamza and should be pronounced
with the Hamza stopping sound /'a/, not the Alef Madd 2-harakas vowel sound
"aa".

Also, the word أرائك (Alef,HamzaAbove,Reh,Alef,Yeh,HamzaBelow,Kaf) is five
letters long for the same reason as above. When spelled letter by letter,
it's spelled (Alef,Reh,Alef,Yeh,Kaf), ignoring HamzaAbove and HamzaBelow, since
they are marks attached to other letters, not letters themselves.
The HamzaBelow that comes after the Yeh here means that the Yeh is only a seat
for the Hamza and should be pronounced with the Hamza stopping sound /'a/, not
the Yeh Madd 2-harakas vowel sound "ee".
When spelled individually, this word is spelled:
(Alef,Reh,Alef,Yeh,Kaf) or even (AlefMahmoza,Reh,Alef,YehMahmoza,Kaf).
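
To make the "five letters plus marks" point concrete, here is a small Python
sketch. It assumes the word is stored with the precomposed characters U+0623
and U+0626 (as it is typed in this mail) and uses Unicode NFD normalization,
which is my own choice for illustration, to split them into seat + mark. Note
that Unicode's canonical decomposition puts U+0654 (HamzaAbove) on the Yeh as
well, so the sketch cannot show a HamzaBelow there.

```python
import unicodedata

# The word as typed above, with precomposed Alef-with-Hamza and Yeh-with-Hamza.
araik = "\u0623\u0631\u0627\u0626\u0643"

# NFD splits each precomposed character into its seat letter plus a mark.
decomposed = unicodedata.normalize("NFD", araik)

# General category L* = letter, M* = combining mark.
letters = [ch for ch in decomposed if unicodedata.category(ch).startswith("L")]
marks = [ch for ch in decomposed if unicodedata.category(ch).startswith("M")]

print([f"U+{ord(ch):04X}" for ch in decomposed])
print(len(letters), "letters,", len(marks), "marks")
```

The letters that remain are exactly (Alef,Reh,Alef,Yeh,Kaf), and the two Hamza
marks are reported separately.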

Also, the word مصۜيطرون (Meem,Sad,SeenAbove,Yeh,Tah,Reh,Waw,Noon) is seven
letters long, not eight.
The SeenAbove here is not counted because it's a mark that communicates to
the reader the fact that this Sad was actually a Seen and thus should be
pronounced with the Seen sound "s", not the Sad sound "the German sound ß".
When spelled individually, it's:
(Meem,Sad,Yeh,Tah,Reh,Waw,Noon).

In short, StandaloneHamza is something and HamzaAbove/Below is something
else in the same sense that Seen is something and SeenAbove is something else.
So it's logical to use different codepoints for StandaloneHamza and
HamzaAbove/Below.
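
Unicode already reflects this distinction in its character properties. As a
rough illustration (the helper name count_letters is mine, not from any
library), the following Python sketch prints the general category of the four
code points listed earlier and counts only the letters in the example words:

```python
import unicodedata

HAMZA, HAMZA_ABOVE = "\u0621", "\u0654"
SEEN, SEEN_ABOVE = "\u0633", "\u06DC"

def count_letters(word: str) -> int:
    """Count characters whose general category is a letter (L*); marks (M*) are skipped."""
    return sum(1 for ch in word if unicodedata.category(ch).startswith("L"))

# Standalone Hamza/Seen are letters; HamzaAbove/SeenAbove are nonspacing marks.
for ch in (HAMZA, HAMZA_ABOVE, SEEN, SEEN_ABOVE):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

ba = "\u0628\u0627\u0621"              # Beh, Alef, Hamza
bi_ayyi = "\u0628\u0627\u0654\u064A"   # Beh, Alef, HamzaAbove, Yeh
musaytirun = "\u0645\u0635\u06DC\u064A\u0637\u0631\u0648\u0646"  # Meem, Sad, SeenAbove, ...

print(count_letters(ba), count_letters(bi_ayyi), count_letters(musaytirun))
```

With separate code points the counts come out as three, three and seven,
matching the spelling described above, without any knowledge of the
surrounding marks.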

More letters fall under the same category:
 - Yeh/SmallYeh and YehAbove (YehAbove can be seen above Alef in Warsh
    Mushaf, Maghribi style. Here the Alef acts as a seat for the Yeh).
 - Alef/SmallAlef and AlefAbove (AlefAbove can be seen above Yeh,Waw).


Anyway, I remember you proposed a workaround for using one codepoint for both
SmallAlef and SmallAlefAbove. You can also use the same workaround for using
one codepoint for both Hamza and HamzaAbove/Below.

The workaround depends on the fact that modern Masahef are fully marked.
The idea is that if Alef/SmallAlef comes after a letter, that letter is
certainly marked with a haraka or something. But if SmallAlefAbove comes after
a letter, there will be no marks between the SmallAlefAbove and that letter,
because the SmallAlefAbove is "attached" to that letter, which acts as
a seat, and of course the marks for that letter come after the SmallAlefAbove
mark.

The same can be done for Hamza and even for Seen.
That is, if for example a Seen comes after a letter, say Sad, then certainly
the marks associated with the Sad would come before the Seen. That's because
the Seen here is separate from the Sad and doesn't affect it. But if SeenAbove
comes after the Sad, there will be no marks between the Sad and the SeenAbove
because here the SeenAbove _is_ the first mark that should be above the Sad.
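
As I understand the workaround, it could be sketched in Python roughly as
follows. Everything here is an assumption made for illustration: I reuse
U+0633 as the single hypothetical code point for both roles, I treat only
U+064B..U+0652 as harakat, and the function name seen_role is made up.

```python
import unicodedata

SEEN = "\u0633"  # hypothetical: one code point standing for both Seen and SeenAbove
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}  # fathatan .. sukun (assumed set)

def seen_role(text: str, i: int) -> str:
    """Guess whether the Seen at index i is a standalone letter or an attached mark."""
    if text[i] != SEEN:
        raise ValueError("character at index i is not a Seen")
    if i == 0:
        return "letter"
    prev = text[i - 1]
    if prev in HARAKAT:
        return "letter"   # the previous letter is already marked, so Seen stands alone
    if unicodedata.category(prev).startswith("L"):
        return "mark"     # nothing between the seat and the Seen: treat it as attached
    return "letter"

# Fully marked: Meem, damma, Sad, Seen, fatha, Yeh, sukun, Tah, kasra, ...
on_sad = "\u0645\u064F\u0635\u0633\u064E\u064A\u0652\u0637\u0650\u0631\u064F\u0648\u0646\u064E"
# Fully marked: Meem, damma, Seen, fatha, Yeh, sukun, Tah, kasra, ...
standalone = "\u0645\u064F\u0633\u064E\u064A\u0652\u0637\u0650\u0631\u064F\u0648\u0646\u064E"

print(seen_role(on_sad, 3), seen_role(standalone, 2))
```

The guess is driven entirely by whether a haraka sits between the Seen and the
letter before it.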


However, I strongly reject this type of workaround because:
 - It makes no distinction between the concept of a standalone letter that
    has nothing to do with the other letters in the word and the concept of
    a combining mark associated with another letter.
 - It depends on the text being _accurately_ fully marked, which is not the
    case in most existing texts. As a consequence, it would be impossible
    for the reader or software to know the meaning of a given character
    (they wouldn't be able to tell the difference between a letter and a
    vowel mark), which would make searching and other text processing tasks
    very hard and inaccurate.
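
By contrast, with distinct code points a search routine does not need the text
to be marked at all. A minimal Python sketch, assuming a search that compares
"skeletons" with all combining marks removed (the name skeleton is mine):

```python
import unicodedata

def skeleton(text: str) -> str:
    """Drop every combining mark (category M*), keeping only the base letters."""
    return "".join(ch for ch in text if not unicodedata.category(ch).startswith("M"))

# None of these strings carries any harakat.
with_seen_above = "\u0645\u0635\u06DC\u064A\u0637\u0631\u0648\u0646"  # Meem, Sad, SeenAbove, ...
with_seen = "\u0645\u0633\u064A\u0637\u0631\u0648\u0646"               # Meem, Seen, ...
plain_sad = "\u0645\u0635\u064A\u0637\u0631\u0648\u0646"               # Meem, Sad, ...

print(skeleton(with_seen_above) == plain_sad)  # SeenAbove is dropped as a mark
print(skeleton(with_seen) == with_seen)        # Seen is kept as a letter
```

The decision comes from the character's own property, not from the presence or
absence of neighbouring marks.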

>
> I admire Meor's efficiency in creating a first workable Qur'an using
> Unicode and OpenType components. But there are still a couple of open ends
> that are not his fault, but that are the consequence of font technological
> limitations.
>

I have given up using an OpenType font in my custom Qur'an application.
I'm not forced to use OpenType fonts, or any other font for that matter,
since in a custom app I have control over how text is drawn.

-- 
Mohammed Yousif
Egypt

"قال قائل منهم إني كان لي قرين. يقول أءنك لمن المصدقين. أءذا متنا وكنا تراباً 
وعظاماً أءنا لمدينون. قال هل أنتم مطلعون. فاطلع فرءاه في سواء الجحيم. قال
تالله إن كدت لتردين. ولولا نعمة ربي لكنت من المحضرين"  (من القرءان الكريم)