Mete Kural wrote:
I suggest not using 649, since it is an unnecessary character: Farsi yeh covers it. IMHO it should not have entered Unicode in the first place, but it was probably carried over to Unicode from a legacy ISO Arabic encoding.
such that it is Farsi and Classical Arabic - and possibly more - yeh).
<<2. 626 should be used. This will make it easier and more understandable, because we know what 626 is. If we encode it as 649 +
hamza above/below, someone might mistakenly think the 649 is alef maksura, which in this case it definitely is not.>>
I strongly suggest not using 626, but rather the separate hamza above/below codepoint. This gives better normalization of text. Besides, you have to use a separate small alef anyway. So use both a separate hamza above/below and a separate small alef for consistency. Did I tell you this was better for normalization? :)
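For anyone who wants to check how these sequences behave under normalization, here is a small sketch using Python's standard `unicodedata` module. The constant and helper names are my own, chosen for illustration. It only reports what the Unicode canonical mappings do: 626 is canonically equivalent to 64A + 654, while 649 + 654 has no precomposed form.

```python
import unicodedata

# Code points discussed in this thread (names are illustrative only).
ALEF_MAKSURA = "\u0649"   # 649, dotless yeh / alef maksura
YEH = "\u064A"            # 64A, Arabic yeh
YEH_HAMZA = "\u0626"      # 626, yeh with hamza above (precomposed)
HAMZA_ABOVE = "\u0654"    # 654, combining hamza above


def nfd(text: str) -> str:
    """Canonical decomposition."""
    return unicodedata.normalize("NFD", text)


def nfc(text: str) -> str:
    """Canonical decomposition followed by canonical composition."""
    return unicodedata.normalize("NFC", text)


def codepoints(text: str) -> str:
    """Render a string as space-separated U+XXXX labels."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # 626 decomposes to yeh (64A) + hamza above, not to 649 + hamza above.
    print("NFD(626)       =", codepoints(nfd(YEH_HAMZA)))
    # The decomposed yeh sequence recomposes to 626.
    print("NFC(64A + 654) =", codepoints(nfc(YEH + HAMZA_ABOVE)))
    # 649 + hamza above has no precomposed form, so it stays as two code points.
    print("NFC(649 + 654) =", codepoints(nfc(ALEF_MAKSURA + HAMZA_ABOVE)))
```

So a comparison that normalizes both sides will treat 626 and 64A + 654 as the same text, but it will not match either of them against 649 + 654.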
<<3. Now, we are left with dotless yeh with small alef in the initial
and medial forms. From the previous mail, the suggestion was to use 649 +
670. Of course, visually, it is easy to tell that this is not alef maksura, but rather a dotless yeh serving as the chair for small alef. However, developing an algorithm to search for it is not as easy or straightforward. I think that is why someone was suggesting to
me to use dotless ba instead of 649. Any suggestions?>>
Dotless beh is a non-starter for this purpose. It is what it is: a dotless "beh". It is intended for an archaic ambiguous
Agreed.
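
On the search question in point 3, here is a rough sketch of one way to find 649 + 670 in initial or medial position, written in Python. It is only an approximation under stated assumptions: "non-final" is taken to mean that the 649, after any combining marks, is followed directly by another letter, and the function name `find_chair_yeh` is hypothetical. It does not handle tatweel or ZWNJ/ZWJ, and it is not a full implementation of Arabic joining rules.

```python
import unicodedata

DOTLESS_YEH = "\u0649"       # 649
SUPERSCRIPT_ALEF = "\u0670"  # 670, small (superscript) alef


def find_chair_yeh(text: str) -> list[int]:
    """Return indexes of 649 that carry a 670 and are followed by a letter.

    Assumption: a following letter (general category Lo) means the 649
    joins forward, i.e. it is in initial or medial form.
    """
    hits = []
    for i, ch in enumerate(text):
        if ch != DOTLESS_YEH:
            continue
        j = i + 1
        saw_small_alef = False
        # Skip over the combining marks attached to this 649,
        # noting whether one of them is the small alef.
        while j < len(text) and unicodedata.category(text[j]) == "Mn":
            if text[j] == SUPERSCRIPT_ALEF:
                saw_small_alef = True
            j += 1
        # Keep the hit only if another letter follows (non-final position).
        if saw_small_alef and j < len(text) and unicodedata.category(text[j]) == "Lo":
            hits.append(i)
    return hits
```

A word-final 649 + 670, which is where a real alef maksura with small alef would normally sit, is skipped by this check because nothing joinable follows it.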