
Re: Questions about yeh, hamzah on yeh, alef maksura and dotless ba



Hello, there.

Let's remember that the desirable yeh is absent from Unicode at present.
So the problem is not which code point to choose among U+0649
("alef maksura"), U+064A ("yeh"), and U+06CC ("farsi yeh"), but which
*behavior* to choose from what they offer (or would "attribute" be the
better word? I guess you know what I mean), in order to add the
appropriate modification to the Unicode standard.
Along this line, the Farsi Yeh-like one looked most promising. At least,
it should be modified to lose its dots when accompanied by hamza
above/below or small alif, right?
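
As a quick check of the code points mentioned above, here is a minimal
Python sketch. It prints the names the standard library reports for the
three yeh-like characters, and adds a toy predicate for the "lose the
dots" suggestion. The names describe, would_drop_dots and
DOT_DROPPING_MARKS are my own, and the rule is only the suggestion
discussed here, not anything Unicode defines. It looks only at the
immediately following character, which is a simplification.

```python
import unicodedata

# The three yeh-like code points discussed in this thread.
YEH_CANDIDATES = ["\u0649", "\u064A", "\u06CC"]

# Marks that, under the suggestion above, would make a Farsi yeh lose
# its dots: hamza above, hamza below, and the small (superscript) alef.
DOT_DROPPING_MARKS = {"\u0654", "\u0655", "\u0670"}


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def would_drop_dots(text, i):
    """Hypothetical rule: the Farsi yeh at index i is drawn dotless when
    the next character is one of DOT_DROPPING_MARKS."""
    if text[i] != "\u06CC":
        return False
    return i + 1 < len(text) and text[i + 1] in DOT_DROPPING_MARKS


if __name__ == "__main__":
    for ch in YEH_CANDIDATES:
        print(describe(ch))
```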

Some ambiguity still remains, though. I admit first that it is partly
due to my lack of knowledge of classical Arabic. Now:

1. Thomas Milo proposed a "less ambitious approach", where a single
"yeh" is used throughout and drops its dots in final position under
a Qur'anic locale. If this is adopted, I cannot understand the
following point: today's usage sometimes requires both the dotted and
the undotted yeh in final position, so one yeh does not suffice.
Is this solution limited to meeting Meor's need?

2. On the above "farsi yeh loses its dots with hamza/small alif" suggestion.
It looks natural for today's texts, where the orthography is, I believe,
well established. Now what about classical materials (not limited to
the Qur'aan)? In the relevant era, is it always the case that hamza or
small alif is actually written with a dotless yeh, while ini/mid yeh
representing the consonant "y" has two dots? Are there no occurrences
of "unwritten hamza" with dotless yeh? (If I remember correctly, hamza
was invented later, by taking the head of `ain.)
Are Tom's dots code points necessary in this case, too?

What I can suggest to Meor is to take it easy for the time being and
not to hurry. The final solution is yet to come, since there is neither
a code point for dots nor a unified true yeh in the current Unicode
standard. If you keep your policy clear and consistent, it will be easy
to filter the text later. (And I think yours so far is practically good.
Any policy is OK, but yours, visual coincidence, is straightforward to
understand.)
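
To illustrate what "filtering the text later" might look like, here is a
minimal Python sketch. It assumes a purely hypothetical target policy in
which every word-final U+064A becomes the dotless U+0649; the function
name undot_final_yeh and the exact set of marks skipped are my own
choices, not something agreed in this thread. The point is only that a
text encoded under one consistent policy can be converted mechanically.

```python
import re

YEH = "\u064A"           # ARABIC LETTER YEH (dotted)
ALEF_MAKSURA = "\u0649"  # ARABIC LETTER ALEF MAKSURA (dotless)

# A yeh counts as word-final here when no word character follows it,
# after skipping any combining marks in U+064B..U+0655 and U+0670.
_FINAL_YEH = re.compile(YEH + r"(?![\u064B-\u0655\u0670]*\w)")


def undot_final_yeh(text):
    """Replace each word-final dotted yeh with the dotless code point.

    Hypothetical policy for illustration only; initial and medial yeh
    are left untouched.
    """
    return _FINAL_YEH.sub(ALEF_MAKSURA, text)
```

A different target policy would need only a different pattern or
replacement, as long as the source text was encoded consistently.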

And to your first question: is there any clear criterion for whether a
final yeh is dotted or not? I don't think you have received a clear-cut
answer yet. For contemporary writing, which allows a final dotted yeh,
no: you have to know each word. For the Qur'aan, yes, your guess
is correct.

Now, since I'm far from being an expert, let me propose another, bold
solution. (I suppose it must have been considered in the early days of
Unicode, but I don't know. I know it can never be merged into Unicode.)
The totality of the encoding elements would be the ini/mid/fin/isol
forms of the letters. They form the true graphemes.
I dare not call them presentation forms under this definition.
Today, "letters" are considered elemental, and the shaped forms are
used behind the user interface. What I propose is to consider letters
to be virtual, or transparent. They get bound to keys, and texts are
encoded with the actual shape elements. A text that included bare
letters would be illegal.
This forces yeh-hamza to be dotless, since there is no such thing as
"dotted-yeh-hamza". At the key binding level, overlap between dotless
yeh + hamza and dotted yeh + hamza is allowed; they are encoded
identically. Joiner and non-joiner are also virtual and are merely key
binding candidates, etc. etc...
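
To make the idea a little more concrete, here is a toy Python sketch.
Everything in it is hypothetical: the shape elements are plain
(letter, form) pairs standing in for code points, the key names are
invented, and none of this corresponds to real Unicode assignments.
It only shows that two overlapping key bindings can yield the same
encoded element, and that bare letters are rejected as illegal.

```python
# The four positional forms; every encoded element must carry one.
FORMS = ("isol", "ini", "mid", "fin")

# Key bindings: a key sequence names a virtual letter, which itself
# never appears in encoded text. The names are invented for this sketch.
KEY_TO_LETTER = {
    ("dotless-yeh", "hamza"): "yeh-hamza",
    ("dotted-yeh", "hamza"): "yeh-hamza",  # overlapping binding
    ("dotless-yeh",): "dotless-yeh",
    ("dotted-yeh",): "dotted-yeh",
}


def encode(keys, form):
    """Turn a key sequence plus a positional form into a shape element."""
    if form not in FORMS:
        raise ValueError(f"unknown form: {form!r}")
    return (KEY_TO_LETTER[tuple(keys)], form)


def is_legal(text):
    """A text is legal only if every element is a (letter, form) pair;
    a bare letter (a plain string) makes it illegal."""
    return all(
        isinstance(el, tuple) and len(el) == 2 and el[1] in FORMS
        for el in text
    )
```

With these bindings, typing dotless yeh + hamza and typing dotted yeh +
hamza both produce the same "yeh-hamza" element for a given form.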


By the way, does anyone know this? I know "alif saghiirah" means the
small alif. Is there a direct translation of "dagger alif" in Arabic?
I once guessed that, since it resembles the dagger sign that indicates
a footnote, the term "dagger alif" was coined by a European orientalist...

Thank you all. Good day.

"Oibane"
pflm52td at wsitta.dion.ne.jp
# sitta is 6