
Re: Quran data and issues in encoding the Quran in unicode



Hi Mete, Meor,

Just a quick reaction: U+0640 TATWEEL does not represent a character (or
rather, grapheme) but a unit of typography (i.e., a glyph). It should never
have been part of the Arabic code block in the first place. If you prefer
a sequence like Fatha-SmallAlifAbove (in regular Unicode) to print with a
supporting Tatweel, consider building a substitution in your OTF.
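
For text that already carries the tatweel, here is a minimal sketch in
Python of cleaning it up at the text level: it drops a U+0640 that serves
only as a carrier for U+0670 SUPERSCRIPT ALEF. The function name is made up
for illustration, and the assumption that the tatweel sits immediately
before the superscript alef may not hold for every text.

```python
import re

TATWEEL = "\u0640"           # ARABIC TATWEEL
SUPERSCRIPT_ALEF = "\u0670"  # ARABIC LETTER SUPERSCRIPT ALEF

# Assumption: the carrier tatweel comes directly before the superscript alef.
_CARRIER = re.compile(TATWEEL + "(?=" + SUPERSCRIPT_ALEF + ")")

def strip_carrier_tatweel(text):
    """Remove a tatweel only where it directly precedes a superscript alef."""
    return _CARRIER.sub("", text)

# REH + FATHA + TATWEEL + SUPERSCRIPT ALEF becomes REH + FATHA + SUPERSCRIPT ALEF
cleaned = strip_carrier_tatweel("\u0631\u064E\u0640\u0670")
```

A tatweel used for ordinary elongation (not followed by U+0670) is left
alone, so the glyph-level decision stays with the font.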

A second point is the use of tanween followed by SmallMeemAbove and
SmallMeemBelow. This is non-standard use of the small meems, plain and
simple. The fact that the obvious encoding with sequential single harakat
and single harakat+small meem is not supported correctly by Microsoft's
Uniscribe does not justify the use of illegal encoding. It would be better
to report a bug to MS typography or, if you don't like to be the prisoner of
a third party's proprietary solutions, develop your own OTF parser (which
is what we do).

BTW, Mete did not mention Decotype's Naskh as a font that handles Qur'anic
Arabic, because it is not yet published. In this project we consider the
Unicode points SmallMeemAbove and SmallMeemBelow a mistake: they are
contextual variants of SmallMeem (where a single kasra pulls it below the
script line). Therefore we treat them as identical. However, for
compatibility's sake we could add a few front-end substitutions to convert
your private encoding to our (private?) encoding (which I believe you could
have done to bypass the MS Uniscribe constraints).
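
To make the idea of a front-end substitution concrete, here is a rough
sketch in Python. It rewrites tanween followed by a small meem as the
corresponding single haraka followed by the same small meem. This is only
my reading of the conversion, written for illustration: the function name
and the mapping table are assumptions, not the substitutions we actually
ship.

```python
import re

# Tanween -> single haraka (assumed mapping for this sketch)
TANWEEN_TO_HARAKA = {
    "\u064B": "\u064E",  # FATHATAN -> FATHA
    "\u064C": "\u064F",  # DAMMATAN -> DAMMA
    "\u064D": "\u0650",  # KASRATAN -> KASRA
}

# U+06E2 SMALL HIGH MEEM ISOLATED FORM, U+06ED SMALL LOW MEEM
_TANWEEN_MEEM = re.compile("([\u064B-\u064D])([\u06E2\u06ED])")

def tanween_meem_to_haraka_meem(text):
    """Replace tanween + small meem with single haraka + small meem."""
    return _TANWEEN_MEEM.sub(
        lambda m: TANWEEN_TO_HARAKA[m.group(1)] + m.group(2), text
    )

# AIN + FATHATAN + SMALL HIGH MEEM becomes AIN + FATHA + SMALL HIGH MEEM
converted = tanween_meem_to_haraka_meem("\u0639\u064B\u06E2")
```

Tanween that is not followed by a small meem is left untouched.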

Regards,

t

Mete Kural wrote:
> Hello Meor,
>
> Please find my suggested encodings and explanations below.
>
>> About the small alef, personally I would like to encode it using a
>> tatweel + superscript alef for medial position, and a space +
>> superscript alef for isolated position. The reason is that the
>> sequence will work on most, if not all, existing fonts. You might argue
>> that we don't need a tatweel for medial position, but without it, you
>> will encounter another problem under Windows. The same goes for small
>> noon and yeh, which I think are better encoded with a tatweel. For small
>> waw, I agree with Mr Milo.
>
> First of all, I would suggest that you not steer the project in a
> way to accommodate the variety of Arabic fonts that are available
> today which do not implement the Unicode Arabic spec adequately. More
> than 90% of the Arabic fonts out there ignore implementing
> considerable sections of the Unicode Arabic specs. Only a handful of
> fonts come close to rendering the Quran correctly (at least rendering
> what can be encoded of the Quran with the current Unicode spec,
> excepting the missing needed Quranic characters). Take notice that I
> say come close to rendering the Quran correctly. I wouldn't be
> surprised if there are only a handful fonts in the world today that
> in fact do render correctly what can be encoded of the Quran with the
> current Unicode spec. In fact the only four that I know are
> Microsoft's Arabic Typesetting, your Arabeyes.org Meor font, and
> SIL's Scheherazade (even these two have problems with small alef I
> think) and Lateef fonts
> (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=ArabicFonts).
> There may be other solutions that are not yet delivered to the market
> or that I haven't heard of. So I think trying to accommodate the encoding
> of the text to render the small alef correctly with other fonts that
> aren't suitable for rendering the Quran is unnecessary. The
> compromise made on the consistency of the encoding is not worth it to
> try to accommodate these unsuitable fonts. You already have a
> challenge to accommodate the Gnome and Uniscribe rendering
> engines; trying to accommodate some incomplete fonts in addition
> to that would leave you with a less than desirable encoding quality. This
> is why I would recommend that you not use a tatweel before the small
> alef.
>