
Re: Arabic Unicode fonts



On Thu, 9 Aug 2001 07:44:14 +0100
 David Starner wrote:
> 
> I actually know little of Arabic; I'm just here as a general i18n know-it-all
> and hopefully, to help the Arabization of Unix be done in such a way that
> all the wheels don't have to be rebuilt

Let me be the first to welcome you aboard :-)  Your expertise will be of
great benefit to our effort and we hope (and aspire) to return the favor
in the future - feel free to invite others that might be interested in
helping and participating.

> Arabic Presentation Forms A and B shouldn't be used in files; use characters
> in the 0600-06FF block and the application should take the responsibility
> for using glyphs from Presentation Forms A & B if necessary.

Well, it _always_ will be necessary and that's my point (it's not even almost
always, it's "always" :-).  0600-06FF presents one flavor of the entire Arabic
alphabet (each letter is represented in _a_ single form, rather than in each
of its initial, medial, final and isolated forms); it also includes all the
Arabic numbers and punctuation.  But 0600-06FF is by no means complete, since
it doesn't include all the various character permutations (forms).  With that
said, let me rephrase what you've noted (sorry if I'm being dense): the idea
here is to use 0600-06FF and simply plop characters down (irrespective of
form), at which point the application (or underlying library) would go about
transforming the characters into their appropriate visual glyphs (based on
location, etc.), right ?
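
To check that I have the mechanics right, here is a minimal sketch in Python
of that kind of shaping step.  It is a simplification under stated
assumptions, not how any particular library does it: the table is derived
from the compatibility decompositions that Python's unicodedata module
reports for FE70-FEFF (Form-B), the joining behaviour is guessed from which
forms a letter happens to have there, and diacritics and lam-alef ligatures
are ignored.  The names build_form_table and shape are made up for
illustration.

```python
import unicodedata

FORMS = ("isolated", "final", "initial", "medial")

def build_form_table(first=0xFE70, last=0xFEFF):
    """Map each base letter to its Form-B glyphs, keyed by form name."""
    table = {}
    for cp in range(first, last + 1):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter entries such as "<initial> 062A";
        # ligatures and spacing diacritics decompose to two code points.
        if len(parts) != 2:
            continue
        tag = parts[0].strip("<>")
        if tag in FORMS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[tag] = chr(cp)
    return table

TABLE = build_form_table()

def joins_to_previous(ch):
    # Assumption: a letter can attach to the one before it if it has a final form.
    return "final" in TABLE.get(ch, {})

def joins_to_next(ch):
    # Assumption: a letter can attach to the one after it if it has an initial form.
    return "initial" in TABLE.get(ch, {})

def shape(text):
    """Replace 0600-06FF letters with positional Form-B glyphs (simplified)."""
    out = []
    for i, ch in enumerate(text):
        forms = TABLE.get(ch)
        if not forms:
            out.append(ch)  # not an Arabic letter we know how to shape
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        link_prev = joins_to_previous(ch) and joins_to_next(prev_ch)
        link_next = joins_to_next(ch) and joins_to_previous(next_ch)
        if link_prev and link_next:
            form = "medial"
        elif link_prev:
            form = "final"
        elif link_next:
            form = "initial"
        else:
            form = "isolated"
        out.append(forms.get(form, ch))
    return "".join(out)

# kaf + teh + beh -> kaf initial, teh medial, beh final
print([hex(ord(c)) for c in shape("\u0643\u062A\u0628")])
```

The file on disk would hold only the three 0600-06FF characters; the three
Form-B code points exist only in the output of shape().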

OK, here are a couple more questions :-)

Why do it this way :-D ?  Is there some hidden advantage that I'm not
thinking of (besides saving font space) ?

It would seem more logical to simply store all those visual ("correct")
glyphs with their appropriate encodings instead of redoing the visual
re-mapping every time the file is opened -- and I'm not talking about
saving visual hints and/or control characters.  Here's a scenario -- let's
assume I write a really long/large document in Arabic, all the while the
application is doing these conversions as I type (or maybe it post-processes
on a per-paragraph basis or whatever).  I then save my document, and it is
stored on disk using only 0600-06FF encodings (so all that visual conversion
is lost, right ?).  Why not preserve all these conversions so
that if someone wanted to read my 15MB :-) file they wouldn't have to wait
for any more conversions to take place (it's a waste of time and processor
throughput) ?  You see what I'm saying ?  With that in mind, I was thinking
that Form-B is an integral part of any Unicode "Arabic" font, since it needs
to be known (and used) by everyone (well, the converter has to get these
glyphs from somewhere, right ?).
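
To make the trade-off concrete, here is a small Python sketch comparing the
same word stored both ways.  It only illustrates two measurable differences
(comparison/search and size in UTF-8) and is not meant to settle the
question; fold_to_logical and utf8_size are names I made up, and the folding
relies on Unicode NFKC normalization as exposed by Python's unicodedata
module.

```python
import unicodedata

logical = "\u0643\u062A\u0628"  # kaf, teh, beh from the 0600-06FF block
visual = "\uFEDB\uFE98\uFE90"   # kaf initial, teh medial, beh final (Form-B)

def fold_to_logical(text):
    """Map presentation-form code points back to their base letters."""
    return unicodedata.normalize("NFKC", text)

def utf8_size(text):
    """Number of bytes the text occupies when stored as UTF-8."""
    return len(text.encode("utf-8"))

print(logical == visual)                     # False: same word, different code points
print("\u062A" in visual)                    # False: searching for teh misses the medial glyph
print(fold_to_logical(visual) == logical)    # True
print(utf8_size(logical), utf8_size(visual)) # 6 9
```

So a file saved in visual form would need every one of a letter's forms
checked on search, and in UTF-8 it would be about half again as large.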

> To fully support Unicode, a font format like OpenType is needed. An OpenType
> font can take a character, like U+062A, realize it's in the medial form,
> and display the appropriate glyph, without needing a Unicode character.

I think I understand the concept of how this was supposed to work - but my
comments/questions above still stand.

It just seems odd to go this way - it's certainly cleaner to include all the
characters and their various permutations and give the user the ability to
decide what he wants to type and how he wants it to look, ensuring that what
he typed would be saved in exact-mode (what-you-see-is-what-you-store --
WYSIWYS :-).  Granted, the application would still have to do this conversion
(or shaping), but it's only done once -- upon creation.  Moreover, this
conversion library would be universal given universal fonts and encodings (no
optional anything).  If this were to happen, it would give any application,
given the right set of fonts, the ability to display Arabic characters, no ?
The person would be able to display (or read) a document, but wouldn't be
able to modify it unless he had Bidi support and shaping.

> Under Unix, OpenType is supported by FreeType 2. Since OpenType fonts are
> currently almost impossible to make under Unix, what about BDF fonts? Arabic
> Presentation Forms A & B is made for stuff like BDF fonts, and an argument
> can be made that every Arabic BDF font should include them.

Didn't follow - sorry (I'm new to all this Unicode stuff).  How does OpenType
relate to Unicode (or does it) ?  You seem to imply that OpenType does the
conversions itself (which encodings is it using ? which standard is it
adhering to ?); it sounds like a library to me.

> You had another question about how you were going to encode that many glyphs
> in an 8-bit font. UTF-8 is irrelevant here. If you have XFree86 4.0, it
> includes fixed fonts encoded in ISO10646-1, which is the encoding for
> Unicode fonts under X.

What if I'm not using XFree86 - what if I'm on a Sun Solaris system and want
to augment an application to support Arabic :-) ?  Is there an issue with
sharing these documents across different hardware platforms (that's my
biggest fear and concern) ?  I'm somewhat leery of this magic that needs to
happen to transform characters into glyphs every time a file is opened;
ensuring consistent conversion over different applications and systems seems
nightmarish and error-prone.

> When you use UTF-8 (Unix's normal encoding for Unicode), U+FE70 will be
> encoded as 0xE08080, but it uses the U+FE70 to display under X.

I think there is a thread on "font encoding", so please paste this there :-)

U+FE70 -> 0xE08080 how ?  Why not 0xEF8080 or 0xE08081 ? What are the rules ?
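
As far as I can tell from the UTF-8 definition, the rules are purely
mechanical: the code point's bits are split into groups of six and packed
behind fixed marker bits, with the number of bytes depending on the size of
the code point.  Below is a hand-rolled Python sketch of that layout (the
function name utf8_encode is mine, and it follows the current definition,
which stops at U+10FFFF).  By these rules U+FE70 comes out as EF B9 B0, so
neither 0xE08080 nor my two guesses look right; the value quoted above may
simply have been a slip.

```python
def utf8_encode(cp):
    """Encode one Unicode code point by hand, following the UTF-8 bit layout."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

# U+FE70 = 1111 111001 110000 -> 1110_1111 10_111001 10_110000
print(utf8_encode(0xFE70).hex())                          # efb9b0
print(utf8_encode(0xFE70) == "\uFE70".encode("utf-8"))    # True
```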

Thanks...

 - Nadim

