
Arabic font encodings



Along with my adventures into font-land, I've accumulated these
semi-related questions:

 1. Encoding - is there a standard list of how glyphs are encoded for both
    ISO 8859-6 "arabic" fonts and Forms-B?  I've figured that, for whatever
    reason, the encoding cannot exceed one byte (two hex digits).  ASCII
    utilizes 0x20-0x7F (that's hex of course), leaving us with 0x80-0xFF
    plus 0x00-0x1F, roughly 160 encoding positions to muck with.

    With my rough calculations looking at the code-tables at unicode,

     "Arabic  (0600..06FF)" requires 222 encodings (excluding empties)
     "Forms-B (FE70..FEFE)" requires 140 encodings (excluding empties)

    I think you know where I'm going with this -- so we have about 160
    encodings to encode 362 characters?  I take it the applications
    will have to accept encodings that are more than one byte
    (i.e. more than two hex digits, such as 0x000 - 0xFFF)?  Or am I
    missing something?

    If I'm not hallucinating, let me answer my own questions above and
    note that this is exactly what UTF-8 is for, right ?

    So looking into UTF-8 now in more detail,

      http://www.cl.cam.ac.uk/~mgk25/unicode.html

    I see,

          Unicode code point range |    UTF-8 encoding
      ---------------------------+-------------------------------
       a. U-00000000 - U-0000007F: 0xxxxxxx
       b. U-00000080 - U-000007FF: 110xxxxx 10xxxxxx
       c. U-00000800 - U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx

    Which means, for example (I'm just picking values here), I can encode
    glyph 0xFE70 (from Forms-B) as, say, 0xE08080 (using line 'c' above).

    So my main question is whether there is a standard that specifies what
    these encodings are (what the rules are for how one encodes).  If
    everyone were to take an arbitrary guess at what he/she thinks are good
    encodings, we'd end up with chaos (files couldn't be shared).
    I just haven't been able to find where the rules are spelled out.

 2. Windows, of course, uses a different code-table (or codepage).  That
    codepage is better known as "CP-1256".  I've had a terrible time finding
    CP-1256 fonts and encodings - could someone shed some light on this
    for me in terms of links/doc/whatever...  I don't need links to micro$oft;
    I need links to where I can actually download BDF fonts (since they are
    ASCII); PCF would be OK as well since I can convert those to BDF.
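
For question 1, here is a minimal Python sketch of how I read rows a-c of
the UTF-8 table.  The function name utf8_encode is made up for illustration,
and it is not a full encoder (no 4-byte sequences, no surrogate checks).
If this reading is right, the x positions are simply filled with the bits of
the code point, so the bytes would follow from the code point rather than
being picked freely:

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point using rows a-c of the table (up to U+FFFF).

    Illustrative only: does not reject surrogates (U+D800-U+DFFF) and
    does not handle code points above U+FFFF.
    """
    if cp < 0:
        raise ValueError("code point must be non-negative")
    if cp < 0x80:
        # Row a: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # Row b: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # Row c: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("outside the three rows shown in the table")


if __name__ == "__main__":
    # One code point from each block mentioned in question 1.
    for cp in (0x0627, 0xFE70):
        print(f"U+{cp:04X} -> {utf8_encode(cp).hex()}")
```

With this, U+FE70 comes out as EF B9 B0 and U+0627 as D8 A7, which matches
what Python's built-in str.encode("utf-8") gives for those two code points.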

Again, keep in mind that I'm after what's STANDARD (if such a thing exists)
in terms of Arabic encoding maps.

Thanks..

 - Nadim
