
Re: Quranic Proposal



On Sunday 13 June 2004 10:35, Thomas Milo wrote:
> Dear Abdulhaq,
>
> > This would mean the acceptance of items 1 - 6, 9, 10, 11 and 12. If you
> > don't want them in the unicode standard, then they will have to be
> > dynamically inserted by the rendering algorithm, which as I am saying is
> > very prone to error.
>
> Dynamically inserting tajweed is theoretically very well possible, but it
> is not part of our present discussion.
>
> To deal with tajweed as plain text, all the necessary characters _are_
> already in the Unicode standard. Please note that the sample text that Metin
> posted to illustrate that is already encoded in Unicode, complete with
> typos :-), in the following way:
>
> 1-3 "Tangweeng" was encoded by repeating the single vowel character: 1.
> fatha fatha 2. damma damma 3. kasra kasra.
>

  How can we expect the rendering engine to differentiate it from the regular
  tanween given that it's encoded in the same way "fatha fatha"?
  If you mean that the rendering engine should determine that from the context
  then this rendering engine will only be suitable for displaying the Qur'an
  and it will break non-Qur'anic Arabic text.
  Even if the rendering engine is yet more intelligent and it knows how to
  identify the Qur'an text and deal with it in a different way than regular 
  Arabic text, it wouldn't be possible to convince the people behind the
  currently used rendering engine to add something like that.
  And even if they added this very intelligent algorithm (very unlikely),
  how can a teacher write a tajweed exam for his/her students that
  consists of wrongly inserted tajweed signs which the student has to
  correct?
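
  For reference, here is a minimal sketch of what such text looks like at the
  code point level. It assumes the encoding described above (two U+064E fatha
  characters for the sequential form) and uses only Python's standard
  unicodedata module; the helper name describe is made up for illustration.
  It shows only that the doubled fatha and the single U+064B fathatan are
  different strings that normalization never merges. It says nothing about
  how a given rendering engine draws either of them.

```python
import unicodedata

FATHA = "\u064E"     # ARABIC FATHA
FATHATAN = "\u064B"  # ARABIC FATHATAN (a single character)

# "fatha fatha" as described in the quoted message: two separate code points.
doubled = FATHA + FATHA


def describe(text):
    """Return one 'U+XXXX NAME' string per code point (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# None of the four normalization forms changes the doubled fatha,
# so it never becomes a fathatan behind the writer's back.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")
unchanged = all(unicodedata.normalize(form, doubled) == doubled for form in FORMS)

if __name__ == "__main__":
    print(describe(doubled))
    print(describe(FATHATAN))
    print("normalization leaves 'fatha fatha' alone:", unchanged)
```

  Whatever distinction is drawn on screen therefore has to come either from
  which of these two sequences the writer typed or from the renderer's own
  contextual rules.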


> 4-6 "Tamweem" is encoded by adding 06E2 arabic small meem isolated form to
> the single vowel character.
>
> The necessary miniature meem is available in Unicode as 06E2 arabic small
> high meem isolated form. Beware that this is a misnomer, since a single
> 06E2 arabic small meem isolated form would have sufficed. Together with the
> corresponding 06ED arabic small low meem isolated form, these characters
> were specifically included for dealing with this kind of tajweed; 06ED
> small low meem is a contextual variant of 06E2 small high meem triggered by
> kasra, and should therefore be ignored.
>

  But that contradicts the standard behavior, which requires a small high
  meem to be drawn on top of the previous character and which requires a
  fatha, for example, to be stacked on top of both.
  How can one convince rendering engine vendors to allow for this behavior?
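
  The character properties behind this can be inspected directly. The sketch
  below is only an illustration using Python's standard unicodedata module;
  the dictionary and variable names are my own. It prints the canonical
  combining class of each mark and shows that normalization reorders a
  "beh, small high meem, fatha" sequence so that the fatha comes first,
  regardless of the order in which the marks were typed.

```python
import unicodedata

# Marks discussed in this thread (labels are informal, not official names).
MARKS = {
    "fatha": "\u064E",
    "kasra": "\u0650",
    "small high meem": "\u06E2",
    "small low meem": "\u06ED",
}

# Canonical combining class of each mark, as reported by the Unicode database.
combining_classes = {label: unicodedata.combining(ch) for label, ch in MARKS.items()}

# A base letter followed by the meem and then the vowel, in typing order.
BEH = "\u0628"
typed = BEH + MARKS["small high meem"] + MARKS["fatha"]

# Canonical ordering sorts marks by combining class, lowest first.
normalized = unicodedata.normalize("NFC", typed)

if __name__ == "__main__":
    for label, ccc in combining_classes.items():
        print(f"{label}: ccc={ccc}")
    print([f"U+{ord(ch):04X}" for ch in normalized])
```

  So a renderer that receives normalized text sees the vowel before the small
  meem; how it then positions the two marks is up to the font and the engine.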

> The fact that nobody knows that Unicode included these small meems
> specifically for rendering tajweed is the real weak point here.

  I know of no one who uses the meem outside of Qur'an related texts,
  it's really clear that it's only used for encoding the Qur'an text.

> If a 
> standard includes "obscure" specialist characters, it should leave a trail
> of documentation so that later implementors like you know how to use them.
> Creating clear documentation and instructions should be part of our present
> effort.
>
> Regular tanween, the phonetically neutral nunation, is written with the
> inverted dammatan ligature and vertically stacked fathatan/kasratan similar
> to those used for MSA. A request from Unicode to change the - IMHO
> erroneous - practice to use a single ligature to represent final vowel
> reduplication would meet with massive resistance from the industry.
>

  There are cases where this is really needed; dammatan, for example, cannot
  be encoded as "damma damma".
  As for sequential fathatan versus regular fathatan: if we have to use the
  characters that exist today, they would *both* be encoded as "fatha fatha".
  That returns us to your recommendation of using text rendering
  technologies, but as I explained above, this will not be really useful,
  since it will only be implemented in special rendering engines made
  specifically for the Qur'an, which is something all vendors will refuse to
  add to the rendering engines in current use. That's why I think that any
  special rendering engine for the Qur'an will not be useful.

> 9 -- 12 are redundant. They follow the erroneous precedent of encoding
> glyphs (which is discouraged by the Unicode Consortium):
>
> U+FD3C Arabic Ligature Alef with Fathatan Final Form
> U+FD3D Arabic Ligature Alef with Fathatan Isolated Form
>
> Both these ligatures are described in the standard as the contextual
> Presentation Form for rendering U+0627 Alif and U+064B Fathatan. The latter
> two characters are the real Unicodes to be used in plain text. The proposed
> ligatures FD40-43 would also have to be defined in terms of plain text
> equivalents. 

 Without FD3D, how can one encode the word (35 "Fatir" verse 27):  مختلفاً
 if we used "feh fatha fatha alef", it would be:
   + Not correct, the fathatan belongs to the alif not the feh.
   + Different from the real word as seen on the
      hard-copy of the Qur'an.
 if we used "feh alef fatha fatha", it would be:
   + Very different from the real word as seen on the
      hard-copy of the Qur'an as the fathatan is shifted
      a bit to the right.
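
 The options can be compared as plain code point sequences. The following is
 a rough sketch with Python's standard unicodedata module; the variable names
 are illustrative, and for brevity it writes the tanween as the single
 U+064B fathatan rather than as "fatha fatha", which is an assumption on my
 part. It confirms that FD3C and FD3D have a compatibility mapping to alef
 followed by fathatan, and that the two candidate orders after the feh stay
 distinct strings under normalization.

```python
import unicodedata

ALEF = "\u0627"
FATHATAN = "\u064B"
FEH = "\u0641"
LIGATURE_FINAL = "\uFD3C"     # ARABIC LIGATURE ALEF WITH FATHATAN FINAL FORM
LIGATURE_ISOLATED = "\uFD3D"  # ARABIC LIGATURE ALEF WITH FATHATAN ISOLATED FORM

# Compatibility decomposition of the two presentation forms.
final_decomposed = unicodedata.normalize("NFKD", LIGATURE_FINAL)
isolated_decomposed = unicodedata.normalize("NFKD", LIGATURE_ISOLATED)

# The two plain-text orders discussed above for the end of the word.
mark_then_alef = FEH + FATHATAN + ALEF   # fathatan attaches to the feh
alef_then_mark = FEH + ALEF + FATHATAN   # fathatan attaches to the alef

# Canonical normalization keeps the two orders apart and leaves the
# ligature itself untouched; only the compatibility forms unfold it.
orders_stay_distinct = (
    unicodedata.normalize("NFC", mark_then_alef)
    != unicodedata.normalize("NFC", alef_then_mark)
)
ligature_survives_nfc = unicodedata.normalize("NFC", LIGATURE_ISOLATED) == LIGATURE_ISOLATED

if __name__ == "__main__":
    print([f"U+{ord(ch):04X}" for ch in isolated_decomposed])
    print("orders distinct:", orders_stay_distinct)
    print("FD3D unchanged by NFC:", ligature_survives_nfc)
```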

> However, ligatures do not belong in the Unicode Standard, 
> their inclusion was a mistake and no more ligatures will be added in the
> future.
>

  Does that mean that we should stop discussing them and move on to
  something else?

> It is remarkable that nobody ever notices the blatant mistake in the
> prescribed equivalence for these ligatures as U+0627 Alif and U+064B
> Fathatan. This is the wrong order for encoding tanween in Classical and
> Qur'anic Arabic - and IMHO contemporary Arabic as well. It should have been
> U+064B Fathatan followed by U+0627 Alif - a TRAILING ALIF. Including
> typographic errors like this in an Industry Standard is the hallmark of
> engineers doing script encoding instead of scholars.
>

  I'm not sure whether you are asking for a change in the Qur'an itself,
  rather than in Unicode, because it has tanween in the wrong order, but
  don't expect Muslims to change the Qur'an on the grounds that it's wrong.
  If either the Qur'an or Unicode has to be changed, then it's certainly
  Unicode, because it was created to encode text (including the Qur'an), not
  the inverse (i.e. the Qur'an wasn't there to comply with Unicode; Unicode
  was there to comply with the Qur'an and other texts).
  Unicode should allow people to encode the Qur'an exactly as they see it in
  the hard copy.

  As for contemporary Arabic, it's Unicode's duty to allow people to type text
  that conforms to the rules of Arabic, which in turn are specified and
  maintained by Arabic (the noun, not the adjective) organizations like
  "Mogammah Allogha Al-Arabia".
  It's not Unicode that dictates the rules of Arabic, and if it did, it would
  not be usable at all in any Arabic country (you cannot write a formal
  document, for example, that follows rules differing from the currently
  used Arabic and expect it to be accepted).

  This is not to say that I'm against the need to be able to use old
  Arabic, but I'm merely saying that contemporary Arabic is much
  more important and if one has to choose between them, it is not
  appropriate to choose old Arabic.
  

> As a general remark I would like to point out that, due to the by
> definition conservative character (sic!) of Industry Standards, there is no
> hope in the world of getting a structurally clean solution for Qur'anic
> Arabic - or even for Arabic at large.

  But we have to try to do that and not just say that it is not
  possible and there is nothing we can do.
  I'm sure you agree with me here.

> The reason is, that none of the 
> Arabic encoding patterns or font designs were researched by and for
> scholars and calligraphers, but by employees of engineering companies with
> the short term commercial objective of arabizing as cheaply and as fast as
> possible whatever product they had that was originally made on the
> assumption that Latin characters rule the world. It was from the junk yard
> of trashed legacy code patterns that Unicode picked its Arabic code.
>

  This is changing now. A lot of scholars have some kind of computer
  knowledge these days and they can do a lot in this area.
  But even if there weren't any, we could still request their help.
  Actually, this proposal is based on their suggestions and help, obtained
  either through their writings or by asking them directly.

> The only possibility to accomplish a robust solution for encoding the
> Qur'an - or any Classical Arabic for that matter - in the Unicode format
> would be designing a code set from scratch and applying for its inclusion in
> the second plane as Historic or Diachronic Arabic. 

  But today, contemporary Arabic is used, not historic Arabic.

> This is exactly what I 
> am working on, including the conversion schemes to upgrade the Arabic
> industrial rubbish in Unicode or interchange with it.
>

  I'm afraid your goal (which I really appreciate, btw) is different from mine.

  My goal is to make Unicode able to render the hard-copy of the Qur'an
  certified by all related Islamic organizations today.
  My goal is to make Unicode compliant with the current Arabic so that we
  can use it.

  Take the small letters issue, for example. I can send you a scan of a
  snippet from the last pages of most Qur'an printings that are popular today.
  You will notice that the first small letter noted is small alif, followed by
  the rest of those small letters.
  You will also notice that the small yeh which is encoded in Unicode as
  U+06E6 is not only used as a trailing yeh but also can be used in
  the middle of words.
    106 "Quraish" verse number 2:   إِۦلَـٰفِهِمۡ
  As you can see, it's not a word-final only character.
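
  This can be checked mechanically. In the sketch below the word is spelled
  out as escapes; the sequence is my own transcription of the word quoted
  above and should be treated as an assumption, as should the variable names.
  It uses Python's standard unicodedata module to locate U+06E6 and to
  confirm that ordinary letters still follow it.

```python
import unicodedata

SMALL_YEH = "\u06E6"  # ARABIC SMALL YEH

# Assumed transcription of the word from "Quraish" verse 2, one escape per
# code point: alef-hamza-below, kasra, small yeh, lam, fatha, tatweel,
# superscript alef, feh, kasra, heh, kasra, meem, small high dotless head
# of khah.
WORD = (
    "\u0625\u0650\u06E6\u0644\u064E\u0640\u0670"
    "\u0641\u0650\u0647\u0650\u0645\u06E1"
)

position = WORD.index(SMALL_YEH)

# Base letters (general category "Lo") that come after the small yeh.
letters_after = [ch for ch in WORD[position + 1:] if unicodedata.category(ch) == "Lo"]

# The small yeh is word-final only if no base letter follows it.
is_word_final = not letters_after

if __name__ == "__main__":
    print(f"small yeh at index {position} of {len(WORD)} code points")
    print("letters after it:", [unicodedata.name(ch) for ch in letters_after])
    print("word-final:", is_word_final)
```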

  But I can understand now that you do not consider the hard copy used
  today to be the standard, and as such, your goal is different from mine.


-- 
Mohammed Yousif
Egypt