Re: glyphs, characters and fonts
- To: general at arabeyes dot org
- Subject: Re: glyphs, characters and fonts
- From: "Chahine M. Hamila" <mch at chaham dot com>
- Date: Wed, 09 Jan 2002 18:35:09 +0100
Salaam,
A character encoding is the way characters are represented internally
as numbers: ISO-8859-1, ISO-8859-6, CP-1256, the Unicode encodings, etc.
For example, let's suppose we are using an encoding called AE where s is
in position 50, l in position 55 and m in position 60.
The string slm (salaam) would be:
unsigned char str[] = {50, 55, 60, 0}; /* the 0 marks the end of the string in C */
i.e. str[0]==50, str[1]==55, etc.
The glyph is the drawing that appears on a screen.
For example, if we decide we associate the drawing "s" to the position
50 in the encoding AE, "l" to the position 55 and "m" to the position
60, the string slm would appear on the screen as "slm".
But if we decide we associate the drawing "a" to the position 50 in the
encoding AE, "c" to the position 55 and "e" to the position 60, the
string slm would appear on the screen as "ace" BUT WOULD STILL MEAN slm
TO THE COMPUTER!!!
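
To make this concrete, here is a minimal sketch in C. AE is the made-up
encoding from above; the names ae_font, ae_font_init and ae_render are
only illustrative, and the "drawings" are plain ASCII characters
standing in for real glyphs.

```c
#include <stddef.h>
#include <string.h>

/* Codes in the hypothetical encoding AE from the message above. */
enum { AE_S = 50, AE_L = 55, AE_M = 60, AE_SIZE = 128 };

/* A toy "font": one drawing (here just an ASCII char) per AE position. */
typedef struct {
    char glyph[AE_SIZE];
} ae_font;

/* Fill the font with '?' and assign drawings to the three known codes. */
void ae_font_init(ae_font *f, char s, char l, char m)
{
    memset(f->glyph, '?', sizeof f->glyph);
    f->glyph[AE_S] = s;
    f->glyph[AE_L] = l;
    f->glyph[AE_M] = m;
}

/* Write what the screen would show for str into out.
   out must have room for one char per code plus the terminator. */
void ae_render(const ae_font *f, const unsigned char *str, char *out)
{
    size_t i;
    for (i = 0; str[i] != 0; i++)
        out[i] = (str[i] < AE_SIZE) ? f->glyph[str[i]] : '?';
    out[i] = '\0';
}
```

With a font set up by ae_font_init(&f, 'a', 'c', 'e'), rendering the
string {50, 55, 60, 0} produces "ace", while the stored codes are still
50, 55, 60. Only the font changed, not the text.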
A font file is, in general, a file where these drawings are stored at
given positions.
Where it's easy to match the encoding to the drawing's position (Latin,
Hebrew, Russian, etc.), the drawings, better known as glyphs, are
stored at the same position their encoding indicates, which makes the
mapping trivial. For example, the Latin glyph "A" will be stored at
position 65 in a font file, the same position where ISO-8859-1 and most
other encodings place the letter A.
In Arabic (unless you choose to use my experimental position-agnostic
glyphs "Square Arabic" in Akka ;)), we have shaping, which means that
the same code will not necessarily correspond to the same glyph, and we
have to go through more complex processing. For example, we could
decide that AE's s, i.e. the char with the code 50, would be associated
with the first-position glyph stored at element 10 in the glyph file,
the mid-position glyph at element 11, the end-position glyph at element
12, and the lone-position glyph at element 13, and the software would
choose the appropriate glyph to display according to the context. For
that reason, it is not important where glyphs are stored in a file as
long as the software knows where they are, but it is important to use
standard encodings, because files, string manipulation libraries and
other data-processing code rely on the encoding.
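
A rough sketch of that selection step in C, under loud assumptions: only
the 50 -> 10..13 assignment comes from the example above; the base
positions for l and m, the treatment of code 32 as a space, and the
function names are invented for illustration. It also pretends every
letter joins to both neighbours, which is not true of real Arabic.

```c
#include <stddef.h>

/* Order matches the glyph file layout: first, mid, end, lone. */
enum ae_form { AE_FIRST = 0, AE_MID = 1, AE_END = 2, AE_LONE = 3 };

/* Assumption: any code other than 0 (end of string) and 32 (space)
   is a letter that joins to its neighbours on both sides. */
static int ae_is_letter(unsigned char c)
{
    return c != 0 && c != 32;
}

/* Pick the form of the letter at str[i] from its neighbours.
   i must index a character before the terminating 0. */
enum ae_form ae_form_at(const unsigned char *str, size_t i)
{
    int prev = (i > 0) && ae_is_letter(str[i - 1]);
    int next = ae_is_letter(str[i + 1]);

    if (prev && next) return AE_MID;
    if (prev)         return AE_END;
    if (next)         return AE_FIRST;
    return AE_LONE;
}

/* Map a code plus its context to an element of the glyph file.
   Returns -1 for codes this toy table does not know. */
int ae_glyph_index(const unsigned char *str, size_t i)
{
    int base;

    switch (str[i]) {
    case 50: base = 10; break;  /* s: elements 10..13, as in the example */
    case 55: base = 14; break;  /* l: hypothetical placement */
    case 60: base = 18; break;  /* m: hypothetical placement */
    default: return -1;
    }
    return base + (int)ae_form_at(str, i);
}
```

The text itself only ever contains the code 50 for s. The element 10,
11, 12 or 13 is worked out at display time and never stored in the
string.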
We might wonder why we don't simply make an Arabic encoding with a
trivial code-to-glyph mapping (or use the trivial mapping of the
presentation forms in the upper part of Unicode's tables). Because that
would make the problem harder: it would move it into every layer of
data manipulation, where the same letter would be treated as different
characters depending on its position or context. Even a trivial
comparison of a mid-word s and an end-word s would become complex work.
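
As a small illustration in C, compare counting the letter s in AE with
counting it in a made-up glyph-based encoding, called GE here, where the
four shapes of s have the codes 10 to 13. GE and all the function names
are assumptions for the sake of the example.

```c
#include <stddef.h>

/* In AE, s is always the single code 50, whatever its position. */
enum { AE_CODE_S = 50 };

/* Hypothetical glyph-based encoding GE: one code per shape of s. */
enum { GE_S_FIRST = 10, GE_S_MID = 11, GE_S_END = 12, GE_S_LONE = 13 };

/* AE: one comparison is enough. */
int ae_is_s(unsigned char c)
{
    return c == AE_CODE_S;
}

/* GE: every piece of code that looks for s must know all four codes. */
int ge_is_s(unsigned char c)
{
    return c >= GE_S_FIRST && c <= GE_S_LONE;
}

/* Count the characters of a 0-terminated string accepted by pred. */
size_t count_matching(const unsigned char *str, int (*pred)(unsigned char))
{
    size_t n = 0;
    size_t i;

    for (i = 0; str[i] != 0; i++)
        if (pred(str[i]))
            n++;
    return n;
}
```

In GE a plain == between a mid-word s (11) and an end-word s (12) says
they differ, so searching, sorting and comparing would all need a
special case like ge_is_s for every letter.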
Hope that makes it clearer.
Salaam,
Chahine
"Haisam K. Ido" a *crit :
> Hi Guys:
>
> Is there a document which describes the meanings and relationships
> between glyphs, characters and fonts?