Re: Quranic Proposal
- To: General Arabization Discussion <general at arabeyes dot org>
- Subject: Re: Quranic Proposal
- From: abdulhaq <al-arabeyes at alinsyria dot fsnet dot co dot uk>
- Date: Sun, 13 Jun 2004 11:57:21 +0100
- User-agent: KMail/1.6.2
Dear Thomas
Thanks for your interesting reply. I've broken my reply into four points.
1- Characters vs glyphs
> To deal with tajweed as plain text, all the necessary characters _are_
> already in the Unicode standard. Please note that the sample text that
> Metin posted to illustrate that is already encoded in Unicode, complete
> with typo's :-), in the following way:
I want to elaborate the view that although the glyphs can be found there, the
characters cannot.
> The fact that nobody knows that Unicode included these small meems
> specifically for rendering tajweed is the real weak point here. If a
> standard includes "obscure" specialist characters, it should leave a trail
> of documentation so that later implementors like you know how to use them.
> Creating clear documentation and instructions should be part of our
> present effort.
This is obviously vital.
Nevertheless, the different styles of dammataan and the إقلاب tanweens (with
the meem) for instance _do_ carry real meaning. They seem to me to be _real
characters_ (and therefore should be in the Unicode standard) and not merely
combinations of glyphs. If a renderer wants (for whatever reason) to
represent a dammataan that indicates إخفاء, why should it _have_ to use two
adjacent dammas? No, the dammataan that carries one meaning should have one
code, and the other dammataan another. Then the _font_ will provide the
appropriate (single) glyph.
As another example, I've read your PDFs on the internet and I think I agree
with every single word, including your point that ligatures do not belong
in Unicode. I agree entirely. However, you yourself mention that even basic
letters like baa' and taa' are ligatures in a certain way in that they are
composed of the basal stroke and then the nuqaaT are added later. Does this
mean that you are proposing that the baa' character should be removed and
that we should have one code for the single-tooth stroke, one for the single
dot, and that from now on to render a baa' the text must contain the tooth
character followed by a nuqTa? Of course you don't mean that. I think the
situation with the tajweed marks is exactly the same. They are generally
composed of common glyphs but each tajweed mark individually has a meaning
of its own and should have its own code point.
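To make the comparison concrete, here is a minimal Python sketch that lists the code points behind each piece of text. The sequences for the tajweed marks (damma followed by the small high meem, and two plain dammas) are only my assumption of how such marks would be spelled out under the "glyphs are already there" view; they are not taken from any agreed encoding.

```python
import unicodedata

# The letter baa' is one code point, dot included.
BAA = "\u0628"

# Assumed spelling of an iqlaab mark: damma followed by the small high meem.
IQLAAB_DAMMA = "\u064F\u06E2"

# Assumed spelling of a 'sequential' dammataan: two plain dammas.
TWO_DAMMAS = "\u064F\u064F"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for label, text in [("baa'", BAA),
                    ("iqlaab mark", IQLAAB_DAMMA),
                    ("two dammas", TWO_DAMMAS)]:
    print(label, describe(text))
```

The baa' comes out as a single named character, while each tajweed mark comes out as two unrelated characters whose combined meaning is stated nowhere in the text itself.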
2 - Technological considerations
What you are proposing means the representation of one 'semantic load' as two
other quite different characters (e.g. two dammas) that then need to be
replaced with a combined character to be rendered/placed by the renderer.
Where and how will this be done? Will the Arabic shaping code look out for
two sequential dammas and then insert a new Unicode-type (but unofficial)
code value known only to that implementation? This new character code would
then be taken in the usual manner by the font renderer and output as, say,
dammataan bi-ikhfaa'. Or, will the shaping code leave the sequential dammas
where they are and leave it to the font renderer to work it out?
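The two options can be sketched in a few lines of Python. This is only an illustration under stated assumptions: U+E000 is a hypothetical private-use value picked for the sketch, the function names are invented, and no existing shaping library works this way as far as I know.

```python
DAMMA = "\u064F"

# Hypothetical private-use code point standing in for "dammataan bi-ikhfaa'".
# Unicode assigns it no meaning; only this sketch (and a matching font)
# would know what it is supposed to be.
PUA_DAMMATAAN_IKHFAA = "\uE000"


def shape_first_approach(text: str) -> str:
    """Shaping code swaps two sequential dammas for one unofficial code."""
    return text.replace(DAMMA + DAMMA, PUA_DAMMATAAN_IKHFAA)


def shape_second_approach(text: str) -> str:
    """Shaping code leaves the dammas alone; the font renderer must cope."""
    return text


word = "\u0628" + DAMMA + DAMMA  # baa' carrying two sequential dammas
print([f"U+{ord(c):04X}" for c in shape_first_approach(word)])
print([f"U+{ord(c):04X}" for c in shape_second_approach(word)])
```

In the first case the text handed to the font contains a value that no other implementation understands. In the second case the text is unchanged, so all the work moves into the font renderer.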
I'm fairly sure from your previous emails that you intend implementations to
adopt the second approach. However, as Muhammad implied, this means adding
significant code to numerous code bodies over which we have no control and
no knowledge either. On Linux alone (never mind the Mac, Solaris, PalmOS,
BSD, etc.) there are a number of font renderers in current use. Each of
these is under the control of groups that will not be inclined to make these
changes themselves and may not even let others change things (Apple and also
the X consortium with their xfs spring to mind). Hence we all (everyone
except Arabic Windows users) would be forced into waiting for the
diachronic page you mentioned, and who knows when that will arrive?
> The only possibility to accomplish a robust solution for encoding the
> Qur'an - or any Classical Arabic for that matter - in the Unicode format
> would be designing a code set from scratch and apply for its inclusion in
> the second plane as Historic or Diachronic Arabic. This is exactly what I
> am working on, including the conversion schemes to upgrade the Arabic
> industrial rubbish in Unicode or interchange with it.
This will be a good step forward, but these characters are not just historic
but in widespread current use. Thinking about it, they are in one of the
most common books in the world!
Is the Unicode standard ultimately some sort of Platonic ideal that cannot be
violated by everyday pragmatism, or is it a tool to help people communicate
and share precise information albeit in a non-perfect way?
3 - Phased approach
I concur with Nadim about correcting the situation in two phases, the
first of which allows all operating systems (not just Windows) to output these
extra characters without changes to core services such as font renderers. OK,
it may in your view be a bit of a kludge, but by your own understanding the
whole thing is already a kludge. The second phase can get it _right_ in the
ways you are proposing. Even then though I believe that the tajweed markers
are characters with genuine meaning and not just glyph combinations.
4 - Industry View
> As a general remark I would like to point out that, due to the by
> definition conservative character (sic!) of Industry Standards, there is
> no hope in the world of getting a structurally clean solution for Qur'anic
> Arabic - or even for Arabic at large. The reason is, that none of the
> Arabic encoding patterns or font designs were researched by and for
> scholars and calligraphers, but by employees of engineering companies with
> the short term commercial objective of arabizing as cheaply and as fast as
> possible whatever product they had that was originally made on the
> assumption that Latin characters rule the world. It was from the junk yard
> of trashed legacy code patterns that Unicode picked its Arabic code.
I take on board totally what you say about this being an industry body that
has purely commercial interests. These character/glyph combinations,
whatever you want to call them, relate only to the Qur'aan and Islamic books
(of which there are a huge number), which are almost entirely produced in the
Arab world, which does not have a loud voice in these commercial global
organisations.
Nevertheless it is in their interests to facilitate the production of these
books, particularly bearing in mind the way the IT world is changing in the
Arab world. Copyright law is now being taken much more seriously and the
time will come when people have to actually pay for products such as Windows
and PageMaker. While perhaps they cannot get away without the DTP software,
they would like to use Open Source (read: Free) operating systems. If the OS
cannot support the output of these characters then it will hobble the take-up of
paid-for software products. If these characters are not added then there is
a strong chance that these OSes will never support the tajweed marks for the
technical reasons I mentioned above and will not be able to be used for
preparing a large body of books.
Although the chances of these changes may, as you say, be slim, we have a far
greater hope with your support. Please consider it.
wassalaam
abdulhaq
On Sunday 13 June 2004 08:35, Thomas Milo wrote:
> Dear Abdulhaq,
>
>