Re: Quranic Proposal
- To: "General Arabization Discussion" <general at arabeyes dot org>
- Subject: Re: Quranic Proposal
- From: "Thomas Milo" <t dot milo at chello dot nl>
- Date: Sun, 13 Jun 2004 14:27:53 +0200
Dear Abdulhaq,
1- Characters vs glyphs
> Nevertheless, the different styles of dammataan and the إقلاب tanweens (with
> the meem) for instance _do_ carry real meaning. They seem to me to be _real
> characters_ (and therefore should be in the Unicode standard) and not merely
> combinations of glyphs.
In my analysis the different styles of tanween should not be considered
separate characters, but contextually conditioned modulations of the basic
tanween. I have two reasons for this.
The first is that the visual variations correspond to phonetic, i.e.
linguistically meaningless, variation, as opposed to phonemic, i.e.
linguistically meaningful, variation. After all, the affected words do not
change in meaning; they merely anticipate the next consonant phonetically.
The second reason is that I am in favour of looking at the larger picture of
Islam as a great civilization and community, within which traditions coexist
that are minimally different. For instance, Mamluk, Persian, Indian and
Ottoman Qur'ans only use regular tanween, even where the contemporary Arabic
Qur'an uses the specialized tanween variation presently under discussion.
We would do well to define qur'anic use of Unicode in such a way that the
essential singularity of the text is maintained. The way to do this is to
use mark-up or specialized characters like SMALL MEEM to follow regular
tanween to indicate a phonetic modulation.
(Please note that my preceding email describes a solution for tamweem etc.
that does not yet implement this idea. Seen in this light, my present
solution has a trait in common with your approach: it is pragmatic, using
what is available in the way of code points and what is feasible in the way
of font technology. After all, it works.)
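To make the idea of "regular tanween followed by a modulation mark" concrete, here is a minimal Python sketch. It assumes that "SMALL MEEM" means U+06E2 (ARABIC SMALL HIGH MEEM ISOLATED FORM) or U+06ED (ARABIC SMALL LOW MEEM); that choice, the helper name strip_modulation and the sample string are illustrative assumptions, not an agreed convention. It shows that once the modulation mark is dropped, the text is identical to one written with plain tanween only.

```python
# Base tanween marks from the Unicode Arabic block.
FATHATAN = "\u064B"
DAMMATAN = "\u064C"
KASRATAN = "\u064D"
TANWEEN = {FATHATAN, DAMMATAN, KASRATAN}

# Assumption: these two code points play the role of "SMALL MEEM".
SMALL_HIGH_MEEM = "\u06E2"
SMALL_LOW_MEEM = "\u06ED"
MODULATION = {SMALL_HIGH_MEEM, SMALL_LOW_MEEM}


def strip_modulation(text):
    """Drop a small meem that directly follows a tanween mark."""
    out = []
    for ch in text:
        if ch in MODULATION and out and out[-1] in TANWEEN:
            continue  # phonetic annotation only, not part of the base text
        out.append(ch)
    return "".join(out)


# Illustrative fragment only: beh + dammatan, with and without the small meem.
plain = "\u0628" + DAMMATAN
modulated = plain + SMALL_HIGH_MEEM

assert strip_modulation(modulated) == plain
assert strip_modulation(plain) == plain
```

Under this convention a search or comparison tool can treat editions with and without the specialized tanween as the same underlying text.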
> If a renderer wants (for whatever reason) to
> represent a dammataan that indicates إخفاء, why should it _have_ to use two
> adjacent dammas? No, the dammataan that carries one meaning should have one
> code, and the other dammataan another. Then the _font_ will provide the
> appropriate (single) glyph.
The use of repeated damma/fatha/kasra is just an example of how it could be
done. For modern font technology internal glyph substitution is a trivial
matter. Internally two code points can be made a single glyph (= ligature), or
one code point can produce many glyphs (for instance multiple pen strokes to
build one letter).
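As a rough illustration of the two kinds of substitution just mentioned, the toy Python pass below maps a sequence of glyph names to another. The glyph names and both rule tables are hypothetical, and a real font engine works on glyph IDs through its own lookup tables, so this is only a sketch of the principle.

```python
# Hypothetical glyph names; a real font defines its own.
MANY_TO_ONE = {("damma", "damma"): ["dammatan_stacked"]}  # ligature-style rule
ONE_TO_MANY = {"beh": ["tooth_stroke", "dot_below"]}      # decomposition rule


def substitute(glyphs):
    """Apply pair rules first, then single-glyph rules, left to right."""
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in MANY_TO_ONE:
            out.extend(MANY_TO_ONE[pair])   # two inputs become one glyph
            i += 2
        elif glyphs[i] in ONE_TO_MANY:
            out.extend(ONE_TO_MANY[glyphs[i]])  # one input becomes several
            i += 1
        else:
            out.append(glyphs[i])           # no rule applies, pass through
            i += 1
    return out


result = substitute(["beh", "damma", "damma"])
assert result == ["tooth_stroke", "dot_below", "dammatan_stacked"]
```

The plain text stays the same in both cases; only the internal glyph sequence changes.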
As I indicated above, I believe the cleanest way to handle tanween variation
is to use one and the same solution for basic tanween, followed by a
modulation character, one for tamween (already available and not yet
unambiguously defined in Unicode) and another, new code for rendering them
sequentially.
> As another example, I've read your pdfs on the internet and I think I agree
> with every single word, amongst which you say that ligatures do not belong
> in Unicode. I agree entirely. However, you yourself mention that even basic
> letters like baa' and taa' are ligatures in a certain way in that they are
> composed of the basal stroke and then the nuqaaT are added later. Does this
> mean that you are proposing that the baa' character should be removed and
> that we should have one code for the single-tooth stroke, one for the single
> dot, and that from now on to render a baa' the text must contain the tooth
> character followed by a nuqTa?
This is in fact exactly how I analyse Arabic script and why I consider the
existing legacy code industrial trash. However, in our present discussion we
are looking for ways to make the best of the existing Arabic block in
Unicode.
2 - Technological considerations
> What you are proposing means the representation of one 'semantic load' as two
> other quite different characters (e.g. two dammas) that then need to be
> replaced with a combined character to be rendered/placed by the renderer.
Having agreed on the plain text coding format and conventions, the renderer
can apply any substitution it deems necessary. This can be internal
character substitution and internal glyph substitution. This all happens
inside a black box and is of no relevance to our present discussion: I have
been arguing on the assumption that the topic is the plain text coding
format and conventions.
> I'm fairly sure from your previous emails that you intend implementations to
> adopt the second approach. However, as Muhammad implied, this means adding
> significant code to numerous code bodies over which we have no control and
> no knowledge either. On linux alone (never mind the Mac, Solaris, PalmOS,
> BSD, etc.) there are a number of font renderers in current use. Each of
> these is under the control of groups that will not be inclined to make these
> changes themselves and may not even let others change things (Apple and also
> the X consortium with their xfs springs to mind). Hence we all (everyone
> except arabic windows users) would be forced into waiting for the
> diachronic page you mentioned, and who knows when that will arrive?
Unicode must be supplemented by adequate font technology. If these entities
do not have it, they cannot handle Unicode.
> > The only possibility to accomplish a robust solution for encoding the
> > Qur'an - or any Classical Arabic for that matter - in the Unicode format
> > would be designing a code set from scratch and applying for its inclusion
> > in the second plane as Historic or Diachronic Arabic. This is exactly what
> > I am working on, including the conversion schemes to upgrade the Arabic
> > industrial rubbish in Unicode or interchange with it.
> This will be a good step forward, but these characters are not just historic
> but in widespread current use. Thinking about it, they are in one of the
> most common books in the world!
Diachronic would be the better term: not just going back to the past, but
also looking towards the future.
> Is the Unicode standard ultimately some sort of Platonic ideal that cannot be
> violated by everyday pragmatism, or is it a tool to help people communicate
> and share precise information albeit in a non-perfect way?
No. It suffers from the same type of bureaucratic and practical limitations
and obstructions as the platforms you enumerated above.
3 - Phased approach
> I concur with Nadim about correcting of the situation in two phases, the
> first which allows all operating systems (not just Windows) to output these
> extra characters without changes to core services such as font renderers. OK
> it may in your view be a bit of a kludge, but by your own understanding the
> whole thing is already a kludge. The second phase can get it _right_ in the
> ways you are proposing. Even then though I believe that the tajweed markers
> are characters with genuine meaning and not just glyph combinations.
I agree with the phased approach, but we need to agree on the phases. If
your operating systems cannot handle fonts the way it is assumed by the
Unicode Standard, then I would agree with you that you need to opt for
kludges.
Just count me out.
4 - Industry View
> > As a general remark I would like to point out that, due to the by
> > definition conservative character (sic!) of Industry Standards, there is
> > no hope in the world of getting a structurally clean solution for Qur'anic
> > Arabic - or even for Arabic at large. The reason is, that none of the
> > Arabic encoding patterns or font designs were researched by and for
> > scholars and calligraphers, but by employees of engineering companies with
> > the short term commercial objective of arabizing as cheaply and as fast as
> > possible whatever product they had that was originally made on the
> > assumption that Latin characters rule the world. It was from the junk yard
> > of trashed legacy code patterns that Unicode picked its Arabic code.
> I take on board totally what you say about this being an industry body that
> has purely commercial interests. These characters/glyphs combinations,
> whatever you want to call them, relate only to qur'aan and islamic books
> (of which there are a huge number) which are almost entirely produced in the
> arab world which does not have a loud voice in these commercial global
> organisations.
The real discussions are not based on loud voices but on competent ones.
Unfortunately the really competent people in the field of Qur'an and Islamic
books have either been too modest to participate in these discussions, or
their hearts are elsewhere.
> Nevertheless it is in their interests to facilitate the production of these
> books, particularly bearing in mind the way the IT world is changing in the
> arab world. Copyright law is now being taken much more seriously and the
> time will come when people have to actually pay for products such as Windows
> and PageMaker. While perhaps they cannot get away without the DTP software,
> they would like to use Open Source (read: Free) operating systems. If the OS
> cannot support the output of these characters then it will hobble take up of
> paid-for software products. If these characters are not added then there is
> a strong chance that these OSes will never support the tajweed marks for the
> technical reasons I mentioned above and will not be able to be used for
> preparing a large body of books.
This is of prime importance and should be given the highest possible
priority.
By far the majority of the competent people in the industry - most likely
including you yourself - want to be remunerated for their work and skills,
or they won't be concerned. I do not yet understand how Open Source - without
some kind of funding - can do anything but push Arabic technology into the
farthest possible margin of the computer globalization drive.
> Although the chances of these changes as you say may be slim, we have a far
> greater hope with your support. Please consider it.
I sympathize with the idea, in fact I am actively looking for ways to get
this off the ground myself. But without a generous party funding it, I would
just shoot myself in the foot.
Regards,
t