Re: Quranic Proposal - Logical Codes



On Tuesday 15 June 2004 03:36, Mete Kural wrote:
>
> Now I'll give you another
> example from Arabic itself. The difference between
> beh, teh, and theh in Arabic is exactly the same as
> the difference between U and U umlaut in German. U and
> U umlaut are different letters, just as beh, teh, theh
> are and both cause a phonemic difference in the word.
>

  No, the German alphabet has 26 letters, the umlauts are not
  counted.
  Quoting from:
   http://en.wikipedia.org/wiki/German_alphabet
"
The German alphabet consists of the same 26 letters as the modern Latin 
alphabet:
a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z 
A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z 
(Listen to a German speaker cite the alphabet in German) 

The letter 'y' (ypsilon, /"ypsilOn/) occurs only in loan words in German.
"
 
  Then it describes a letter plus an umlaut as follows:
{
The German language additionally uses three diacritic letters and one 
ligature:
ä, ö, ü / Ä, Ö, Ü 
ß (called es-tsett or scharfes s) 
(Listen to a German speaker naming these letters) 
Although the diacritic letters represent distinct sounds in German phonology, 
they are almost universally not considered part of the alphabet. Almost all 
German speakers consider the alphabet to have the 26 letters above and will 
name only those when asked to say the alphabet
}

  Notice the "listen to" phrase :-)
  Even so, the pronunciation of these umlaut characters is similar to that
  of the letters without an umlaut.

 But Arabic has 28 letters, so the comparison is not valid at all.
 A Beh is not like a Teh in any way (a Beh is read "B" and a Teh is read "T",
 so the difference is not only in meaning but also in pronunciation).

 From here:
   http://en.wikipedia.org/wiki/Umlaut

{
In linguistics, the process of umlaut (from German um- "around", 
"transformation" + laut "sound") is a modification of a vowel which causes it 
to be pronounced more to the front of the mouth to accommodate a vowel in the 
following syllable, especially when that syllable is an inflectional suffix. 
This process is found in many — especially Germanic — languages.
}

 This makes it clear that the difference is one of pronunciation; umlauts can
 even be applied only to vowels.
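
 The same distinction can be checked against the Unicode character data
 itself. The following is only a minimal sketch using Python's standard
 unicodedata module; the helper name describe is made up for this
 illustration, and the result depends on the Unicode database bundled with
 the Python build.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, decomposition mapping) for one character."""
    code = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch)
    # decomposition() returns an empty string when the character
    # has no decomposition mapping in the Unicode database.
    decomposition = unicodedata.decomposition(ch)
    return code, name, decomposition

# u with umlaut, followed by beh, teh and theh
for ch in ["\u00fc", "\u0628", "\u062a", "\u062b"]:
    code, name, decomposition = describe(ch)
    print(code, name, "->", decomposition or "(no decomposition)")
```

 U with umlaut reports a decomposition into a plain u (0075) plus a
 combining diaeresis (0308), while beh, teh and theh report none: each of
 them is encoded as a letter in its own right.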

> Don't you think that beh, teh, and theh deserve their
> own codepoint more than sequential fathatan would? Do
> you think that the difference between beh and teh is
> in the same category as the difference between
> fathatan and sequential fathatan?

  No, but Unicode has characters that differ only in
  pronunciation, and there are even characters that are ligatures.
  If you think they have stopped accepting ligatures, then look at the newly
  accepted Arabic characters here:
   http://www.unicode.org/alloc/Pipeline.html
  One of them, for example, is "ARABIC LETTER REH WITH HAMZA ABOVE" !!!
  and it was accepted on 2004-Feb-04.

>
> > If the unicode body did not accept quranic symbols
> > as codepoints, then what
> > is the sajda mark doing there for instance? They
> > could have just said that
> > we should use two consecutive sukoons or some other
> > magic code sequence.
>
> The sajda mark is a single character and it indicates
> the end of a Quranic part as you know. 

  By your logic, it could have been done by using the Rub Hezb symbol
  with a logical code, since they are both only indicators.

> There may be 
> slightly different shapes of the sajda mark in
> different Quran printings. Does that mean that we
> should have a different codepoint for each different
> variant sajda mark? So we only have one sajda mark and
> that's all we need to add to Unicode. 

 You forgot that a mushaf can only use ONE shape for the sajda
 mark, but it uses TWO shapes for fathatan, and they may be in the
 same verse. There is a great difference here.

> But sequential 
> fathatan, sequential dammatan, sequential kasratan are
> three characters and they are simply variants of
> fathatan, dammatan, and kasratan respectively.

  Unicode accepts variants of characters.
    See here: http://www.unicode.org/alloc/Pipeline.html
    At the bottom you should see a table of accepted
    Variation Sequences.

> Imagine 
> how it would be if the users of the 20+ other
> languages which use the Arabic script came and
> requested to add all kinds of little script-specific
> nuances found in their languages to Unicode. Then
> there would be no more space left in the Arabic code
> block. Arabic Unicode has largely been stabilized. It
> is already really hard to add new codepoints. Adding
> three more new codes for variants of characters that
> are already part of the Arabic code block may be met
> with great resistance. These codepoints are valuable
> since there is a limited number of them. So Unicode
> will probably be very conservative in this regard.
>

  Why, then, is it so generous in accepting characters such as:
    ARABIC TRIPLE DOT PUNCTUATION MARK
    ARABIC VOWEL SIGN DOT BELOW
    ARABIC FATHA WITH TWO DOTS (a variation of fatha which, by your logic,
                                could be done with a logical code)
    ARABIC LETTER LAM WITH BAR
    ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE 
    ARABIC LETTER REH WITH HAMZA ABOVE
    ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE

 While it cannot add the very important tanween types because of the 'limited
 number of codepoints'?
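
 Anyone who wants to check which of these names is present in a given
 Unicode character database can use a small Python sketch like the one
 below. The helper name lookup_codepoint is hypothetical. A name that is
 not found only means that this particular database does not list it
 under exactly that name; names in the pipeline can still change before
 publication.

```python
import unicodedata

def lookup_codepoint(name):
    """Return 'U+XXXX' for a character name, or None if this database lacks it."""
    try:
        ch = unicodedata.lookup(name)
    except KeyError:
        # lookup() raises KeyError for names the bundled database does not know
        return None
    return f"U+{ord(ch):04X}"

# Names taken from the list above, plus the existing fathatan for comparison
names = [
    "ARABIC FATHATAN",
    "ARABIC TRIPLE DOT PUNCTUATION MARK",
    "ARABIC VOWEL SIGN DOT BELOW",
    "ARABIC FATHA WITH TWO DOTS",
    "ARABIC LETTER LAM WITH BAR",
    "ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE",
    "ARABIC LETTER REH WITH HAMZA ABOVE",
    "ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE",
]

for name in names:
    print(name, "->", lookup_codepoint(name) or "not in this database")
```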


-- 
Mohammed Yousif
Egypt