Re: Quranic Proposal

To: General Arabization Discussion <general at arabeyes dot org>
Subject: Re: Quranic Proposal
From: abdulhaq <al-arabeyes at alinsyria dot fsnet dot co dot uk>
Date: Sun, 13 Jun 2004 11:57:21 +0100
User-agent: KMail/1.6.2

Dear Thomas

Thanks for your interesting reply. I've broken my reply into four points.

1- Characters vs glyphs

> To deal with tajweed as plain text, all the necessary characters _are_
> already in the Unicode standard. Please note that the sample text that
> Metin posted to illustrate that is already encoded in Unicode, complete
> with typos :-), in the following way:

I want to elaborate on the view that although the glyphs can be found there,
the characters cannot.

> The fact that nobody knows that Unicode included these small meems
> specifically for rendering tajweed is the real weak point here. If a
> standard includes "obscure" specialist characters, it should leave a trail
> of documentation so that later implementors like you know how to use them.
> Creating clear documentation and instructions should be part of our
> present effort.

This is obviously vital.

Nevertheless, the different styles of dammataan and the إقلاب tanweens (with 
the meem) for instance _do_ carry real meaning. They seem to me to be _real 
characters_ (and therefore should be in the Unicode standard) and not merely 
combinations of glyphs. If a renderer wants (for whatever reason) to 
represent a dammataan that indicates إخفاء, why should it _have_ to use two 
adjacent dammas? No, the dammataan that carries one meaning should have one 
code, and the other dammataan another. Then the _font_ will provide the 
appropriate (single) glyph. 
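
To make the distinction concrete, here is a minimal Python sketch that lists
the code points behind three possible spellings. The two-damma spelling is the
one discussed above; the damma-plus-small-meem spelling for the إقلاب case is
only my assumption about how such a mark would be composed, and the labels are
purely illustrative.

```python
import unicodedata

DAMMA = "\u064F"            # ARABIC DAMMA
DAMMATAN = "\u064C"         # ARABIC DAMMATAN (the single tanween code Unicode has)
SMALL_HIGH_MEEM = "\u06E2"  # one of the small meems already in Unicode


def describe(sequence):
    """Return 'U+XXXX NAME' for every code point in the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


# Assumed spellings, for illustration only; not taken from any standard.
candidates = {
    "dammataan as one code": DAMMATAN,
    "dammataan as two dammas": DAMMA + DAMMA,
    "iqlaab as damma + small meem": DAMMA + SMALL_HIGH_MEEM,
}

for label, sequence in candidates.items():
    print(label, "->", describe(sequence))
```

Only the first spelling stores the mark as a single code; the other two store
a recipe for drawing it, and the meaning has to be inferred from the sequence.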

As another example, I've read your PDFs on the internet and I think I agree
with every single word, including your point that ligatures do not belong
in Unicode. I agree entirely. However, you yourself mention that even basic
letters like baa' and taa' are ligatures in a certain way, in that they are
composed of the basal stroke with the nuqaaT added later. Does this
mean that you are proposing that the baa' character should be removed and 
that we should have one code for the single-tooth stroke, one for the single 
dot, and that from now on to render a baa' the text must contain the tooth 
character followed by a nuqTa? Of course you don't mean that. I think the 
situation with the tajweed marks is exactly the same. They are generally 
composed of common glyphs but each tajweed mark individually has a meaning 
of its own and should have its own code point.

2 - Technological considerations

What you are proposing means representing one 'semantic load' as two other,
quite different characters (e.g. two dammas) that then need to be replaced
with a combined character to be rendered/placed by the renderer.
Where and how will this be done? Will the Arabic shaping code look out for
two sequential dammas and then insert a new Unicode-type (but unofficial) 
code value known only to that implementation? This new character code would 
then be taken in the usual manner by the font renderer and output as, say, 
dammataan bi-ikhfaa'. Or, will the shaping code leave the sequential dammas 
where they are and leave it to the font renderer to work it out? 
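
A minimal Python sketch of the first approach, to show what it would involve.
The private-use value U+E000 and the function names are hypothetical, chosen
only for illustration; a real implementation would pick its own unofficial
code, which is exactly the problem.

```python
DAMMA = "\u064F"  # ARABIC DAMMA

# Hypothetical private-use code standing for "dammataan bi-ikhfaa'".
# It is known only to this sketch and to a font built to match it.
PUA_DAMMATAN_IKHFAA = "\uE000"


def preshape(text):
    """Replace each pair of sequential dammas with the private code."""
    return text.replace(DAMMA + DAMMA, PUA_DAMMATAN_IKHFAA)


def unshape(text):
    """Undo preshape() so that stored text stays standard Unicode."""
    return text.replace(PUA_DAMMATAN_IKHFAA, DAMMA + DAMMA)


BEH = "\u0628"  # ARABIC LETTER BEH
sample = BEH + DAMMA + DAMMA
shaped = preshape(sample)
print([f"U+{ord(ch):04X}" for ch in shaped])
```

The substitution itself is trivial. The difficulty is that the private code
means nothing outside the one implementation and font that agree on it, so it
must never leak into saved or exchanged text.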

I'm fairly sure from your previous emails that you intend implementations to
adopt the second approach. However, as Muhammad implied, this means adding
significant code to numerous code bases over which we have no control and
of which we have no knowledge either. On Linux alone (never mind the Mac,
Solaris, PalmOS, BSD, etc.) there are a number of font renderers in current
use. Each of these is under the control of groups that will not be inclined
to make these changes themselves and may not even let others change things
(Apple and also the X consortium with their xfs spring to mind). Hence we
all (everyone except Arabic Windows users) would be forced into waiting for
the diachronic page you mentioned, and who knows when that will arrive?

> The only possibility to accomplish a robust solution for encoding the
> Qur'an - or any Classical Arabic for that matter - in the Unicode format
> would be designing a code set from scratch and applying for its inclusion
> in the second plane as Historic or Diachronic Arabic. This is exactly what
> I am working on, including the conversion schemes to upgrade the Arabic
> industrial rubbish in Unicode or interchange with it.

This will be a good step forward, but these characters are not just historic 
but in widespread current use. Thinking about it, they are in one of the 
most common books in the world!

Is the Unicode standard ultimately some sort of Platonic ideal that cannot be 
violated by everyday pragmatism, or is it a tool to help people communicate 
and share precise information albeit in a non-perfect way?

3 - Phased approach

I concur with Nadim about correcting the situation in two phases, the first
of which allows all operating systems (not just Windows) to output these
extra characters without changes to core services such as font renderers.
OK, it may in your view be a bit of a kludge, but by your own understanding
the whole thing is already a kludge. The second phase can get it _right_ in
the ways you are proposing. Even then, though, I believe that the tajweed
markers are characters with genuine meaning and not just glyph combinations.

4 - Industry View

> As a general remark I would like to point out that, due to the by
> definition conservative character (sic!) of Industry Standards, there is
> no hope in the world of getting a structurally clean solution for Qur'anic
> Arabic - or even for Arabic at large. The reason is, that none of the
> Arabic encoding patterns or font designs were researched by and for
> scholars and calligraphers, but by employees of engineering companies with
> the short term commercial objective of arabizing as cheaply and as fast as
> possible whatever product they had that was originally made on the
> assumption that Latin characters rule the world. It was from the junk yard
> of trashed legacy code patterns that Unicode picked its Arabic code.

I take on board totally what you say about this being an industry body that
has purely commercial interests. These character/glyph combinations,
whatever you want to call them, relate only to the Qur'aan and Islamic books
(of which there are a huge number), which are almost entirely produced in
the Arab world, which does not have a loud voice in these commercial global
organisations.

Nevertheless, it is in their interests to facilitate the production of these
books, particularly bearing in mind the way the IT world is changing in the
Arab world. Copyright law is now being taken much more seriously and the
time will come when people have to actually pay for products such as Windows
and PageMaker. While they perhaps cannot do without the DTP software,
they would like to use Open Source (read: Free) operating systems. If the OS
cannot support the output of these characters then it will hobble the
take-up of paid-for software products. If these characters are not added
then there is a strong chance that these OSes will never support the tajweed
marks, for the technical reasons I mentioned above, and so cannot be used
for preparing a large body of books.

Although the chances of these changes may, as you say, be slim, we have a far
greater hope with your support. Please consider it.

wassalaam
abdulhaq


On Sunday 13 June 2004 08:35, Thomas Milo wrote:
> Dear Abdulhaq,
>
>