
Re: Arabic font encodings



Salam Nadim,

Nadim Shaikli wrote:

> Along with my adventures into font-land, I've accumulated these
> semi-related questions,
>
>  1. Encoding - is there a standard list of how glyphs are encoded for both
>     ISO 8859-6 "arabic" fonts and forms-B ?

I think you should find the mapping files on the Unicode site:
ftp://ftp.unicode.org/Public/MAPPINGS/
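
As a rough sketch of how such a mapping file can be read: the files are plain text, one entry per line, giving the byte value, the Unicode code point, and a comment with the character name. The two sample lines below are an illustrative excerpt written out by hand (not downloaded), and parse_mapping is a hypothetical helper name.

```python
# Illustrative excerpt in the mapping-file layout (assumed, typed by hand):
# byte value, Unicode code point, then a comment with the character name.
SAMPLE = """\
# ISO 8859-6 style entries
0xC7\t0x0627\t#\tARABIC LETTER ALEF
0xC8\t0x0628\t#\tARABIC LETTER BEH
"""


def parse_mapping(text):
    """Return a dict of byte value -> Unicode code point (hypothetical helper)."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment and skip blank or comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # byte position with no assigned character
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


mapping = parse_mapping(SAMPLE)

# Cross-check against Python's built-in ISO 8859-6 codec.
alef = bytes([0xC7]).decode("iso8859_6")
```

Python's own iso8859_6 codec gives the same answer for these bytes, so it can serve as a quick check on whatever table you build.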

>  I've figured that, for whatever
>     reason, the encoding can not exceed 2-bytes.

It depends on which encoding you are considering.

>  ASCII utilizes 0x20-0x7F
>     (that's hex of course).  Leaving us with (0x80-0xFF plus 0x00-0x1F),
>     roughly 160 encoding positions to muck with.
>
>     With my rough calculations looking at the code-tables at unicode,
>
>      "Arabic  (0600..06FF)" requires 222 encodings (excluding empties)
>      "Forms-B (FE70..FEFE)" requires 140 encodings (excluding empties)

<snip>
You don't need Forms-B to be stored; see the related email.
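
One way to see why storage does not need Forms-B: every presentation form carries a compatibility decomposition back to its base letter in the 0600 block, so the base letter is enough to store and the shape can be chosen at display time. The sketch below uses Unicode NFKC normalization as an illustration only; the related email may describe a different approach, and to_base_letters is a hypothetical helper name.

```python
import unicodedata

alef_isolated = "\uFE8D"  # ARABIC LETTER ALEF ISOLATED FORM (Forms-B)
beh_initial = "\uFE91"    # ARABIC LETTER BEH INITIAL FORM (Forms-B)


def to_base_letters(text: str) -> str:
    """Fold presentation forms back to base letters (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)


stored = to_base_letters(alef_isolated + beh_initial)

# The character database records the form and the base letter it stands for.
beh_info = unicodedata.decomposition(beh_initial)
```

Here stored holds U+0627 U+0628, the plain ALEF and BEH from the basic Arabic block.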

>        If I'm not hallucinating, let me answer my own questions above and
>     note that this is exactly what UTF-8 is for, right ?

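UTF-8 stores Unicode characters as variable-length sequences of 8-bit bytes
(one byte for ASCII, two for the basic Arabic block, three for Forms-B).
It's okay for file storage and data exchange, but it sucks big time for internal
use: because characters vary in width, many string algorithms (indexing, length,
slicing) have to scan the string instead of jumping straight to the n-th
character. Fixed-width UCS encodings are much better in the latter case.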
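
A small sketch of the difference, using three sample characters chosen for illustration (an ASCII letter, a basic Arabic letter, and a Forms-B glyph). utf8_offset is a hypothetical helper showing the scan that a variable-width encoding forces on you; with a fixed-width encoding the offset is just a multiplication.

```python
text = "A\u0627\uFE8D"  # 'A', ARABIC LETTER ALEF, ALEF ISOLATED FORM

# Bytes needed per character in each encoding.
utf8_lengths = [len(ch.encode("utf-8")) for ch in text]
ucs4_lengths = [len(ch.encode("utf-32-be")) for ch in text]


def utf8_offset(data: bytes, index: int) -> int:
    """Byte offset of character number `index`, found by scanning the data."""
    count = 0
    for pos, byte in enumerate(data):
        # Continuation bytes look like 10xxxxxx; anything else starts a character.
        if byte & 0xC0 != 0x80:
            if count == index:
                return pos
            count += 1
    raise IndexError(index)


def ucs4_offset(index: int) -> int:
    """Byte offset of character number `index` in a 4-byte fixed-width encoding."""
    return 4 * index


third_char_utf8 = utf8_offset(text.encode("utf-8"), 2)
third_char_ucs4 = ucs4_offset(2)
```

The UTF-8 lengths come out as 1, 2 and 3 bytes, while the fixed-width form uses 4 bytes for each character.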

>  2. Windows, of course, uses a different code-table (or codepage).  That
>     codepage is better known as "CP-1256".  I've had a terrible time finding
>     CP-1256 fonts and encodings - could someone shed some light on this
>     for me in terms of links/doc/whatever...  I don't need links to micro$oft;
>     I need links to where I can actually download bdf (since they are ascii)
>     fonts (pcf would be OK as well since I can convert them to bdf).

You mean CP-1256 fonts usable on Linux? I have no idea whether any exist.