Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts
- To: General Arabization Discussion <general at arabeyes dot org>
- Subject: Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts
- From: Abdulhaq Lynch <al-arabeyes at alinsyria dot fsnet dot co dot uk>
- Date: Tue, 21 Jun 2005 10:35:06 +0100
- User-agent: KMail/1.8.1
> Although I think we should first battle the difficulties with encoding the
> Quran in the current Unicode Arabic block as it stands and once we have a
> solution for that designing a grammar-aware Arabic encoding model for
> representing the Quran could be a possible next project. Besides this
> grammar-aware encoding model there is also a need to have another layered
> Arabic encoding model for those who want to do manuscript research. Tom has
> some ideas about this.
>
Hi Mete and thanks for the feedback,
I don't see why we should battle with an encoding that was invented when there
was no clear separation between semantic characters and glyphs, and by
someone who, even under those circumstances, didn't think the whole thing
through. I also suspect that they did not understand the science of tajweed
but simply had a look at a couple of mushafs and made certain incorrect
assumptions about the glyphs they saw.
Adding some new codepoints has the great benefit of totally separating tajweed
marks (which have nothing to do with grammar, by the way) from the actual text,
making searching trivial. It also allows the rendering application to apply
whatever local convention exists for each rule (a meem here, a circle there,
two staggered or horizontal fathas, etc.).
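A minimal sketch of the separation described above, in Python. The two
private-use codepoints (U+E000, U+E001) and all function names are
hypothetical and chosen only for illustration; nothing here has been assigned
by Unicode or agreed on this list. The per-mushaf glyph tables are likewise
assumptions about how a renderer might map a rule to a local mark.

```python
# Hypothetical private-use codepoints for tajweed rules (illustration only).
TAJWEED_IQLAAB = "\uE000"
TAJWEED_ZIYAADA = "\uE001"
TAJWEED_MARKS = {TAJWEED_IQLAAB, TAJWEED_ZIYAADA}


def strip_tajweed(text: str) -> str:
    """Return the text with every tajweed rule codepoint removed."""
    return "".join(ch for ch in text if ch not in TAJWEED_MARKS)


def find_plain(haystack: str, needle: str) -> int:
    """Search while ignoring tajweed marks; index is into the stripped text."""
    return strip_tajweed(haystack).find(strip_tajweed(needle))


def render_marks(text: str, local_glyphs: dict) -> str:
    """Replace each rule codepoint with the mark a given mushaf style uses."""
    return "".join(local_glyphs.get(ch, ch) for ch in text)


# Assumed style tables: one draws a small high meem (U+06E2) for iqlaab and a
# small high rounded zero (U+06DF) for ziyaada; the other draws nothing for
# ziyaada.
STYLE_WITH_CIRCLE = {TAJWEED_IQLAAB: "\u06E2", TAJWEED_ZIYAADA: "\u06DF"}
STYLE_WITHOUT_CIRCLE = {TAJWEED_IQLAAB: "\u06E2", TAJWEED_ZIYAADA: ""}

marked = "من" + TAJWEED_IQLAAB + " بعد"
print(find_plain(marked, "من بعد"))
print(render_marks(marked, STYLE_WITH_CIRCLE))
```

The stored text carries only the rule; which glyph appears (or whether one
appears at all) is decided by the table handed to the renderer.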
The current ideas I am hearing here amount to inventing new code points
through the back door by tacking together two glyphs. This has the very
obvious problem that we can then no longer distinguish between two
sequential glyphs that are meant as such and the new codepoint. It's an ugly
hack.
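The ambiguity can be shown with a small Python sketch. The pair of marks used
here (fatha followed by small high meem) is arbitrary and not taken from any
actual proposal, and U+E000 is again a hypothetical private-use codepoint; the
point is only that two different intents collapse into one string under the
sequence scheme.

```python
FATHA = "\u064E"            # ARABIC FATHA
SMALL_HIGH_MEEM = "\u06E2"  # ARABIC SMALL HIGH MEEM ISOLATED FORM
NEW_MARK = "\uE000"         # hypothetical dedicated codepoint


def encode_with_sequence(intent: str) -> str:
    """Scheme where the new mark is spelled as two existing marks."""
    if intent == "two_marks":
        return FATHA + SMALL_HIGH_MEEM
    if intent == "new_mark":
        return FATHA + SMALL_HIGH_MEEM  # same pair reused for the new mark
    raise ValueError(intent)


def encode_with_codepoint(intent: str) -> str:
    """Scheme where the new mark has its own codepoint."""
    if intent == "two_marks":
        return FATHA + SMALL_HIGH_MEEM
    if intent == "new_mark":
        return NEW_MARK
    raise ValueError(intent)


# Under the sequence scheme a reader of the stream cannot recover the intent.
ambiguous = encode_with_sequence("two_marks") == encode_with_sequence("new_mark")
distinct = encode_with_codepoint("two_marks") != encode_with_codepoint("new_mark")
print(ambiguous, distinct)
```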
I agree that Unicode won't do this, but I say now that as far as an
open-licensed Quran document is concerned, we should define our own
codepoints in the private use area, and Unicode can catch up if and when
they feel inclined.
I agree about the XML too, in fact that was my first thought, but the other
great benefit of the new codepoints is that the text stream can be passed
directly to an OpenType renderer without processing XML.
We are currently in the process of getting an open-licensed document created
and then verified. Let's get that done properly, starting from now. Then
we won't have to do it again a few years down the line.
wassalaam
abdulhaq
> An alternative is to use XML to encode high-level grammatical constructs
> rather than a character encoding scheme. I think this will probably be better.
> Since XML is getting really popular it is getting progressively easier to
> work with XML. The Ziyaada, Iqlaab, and other grammatical situations you
> mention could be encoded with XML elements that surround the letters to
> which these situations apply. I know that would look really cryptic in plain
> text but an XML aware WYSIWYG editor that is designed for editing this kind
> of encoded Quran XML data could be used to conveniently do the encoding.
> But anyways, as I say I think we should now first focus on battling the
> Unicode Arabic block issues, but your proposal for a grammar-aware encoding
> scheme is worthy.
>
> Regards,
> Mete
>
> ---------- Original Message ----------------------------------
> From: Abdulhaq Lynch <al-arabeyes at alinsyria dot fsnet dot co dot uk>
> Reply-To: General Arabization Discussion <general at arabeyes dot org>
> Date: Tue, 21 Jun 2005 09:52:47 +0100
>
> >Further to the list of new code points I would add:
> >
> >Superfluous Letters الزيادة
> >This codepoint, called Ziyaada, would be represented in the font as a
> > glyph that is not intended to be rendered such as a rectangle with the
> > word ziyaada in it.
> >It indicates that the letter is superfluous and should not be enunciated.
> >In the Saudi mushaf this is rendered as a circle الصّفر
> > المستدير. In the South African mushaf it is not rendered.
> >
> >Signs of Stopping علامات الوقف
> >It seems to me that the current codepoints are sufficient for now but I
> > think that this scheme would be more future-proof and applicable to local
> > variations in rendering if we were to add codepoints for these too.
> >
> >wassalaam
> >abdulhaq
>
> --
> Mete Kural
> Touchtone Corporation
> 714-755-2810
> --