
Re: my Bidi implementation




On Sat, 13 Mar 2004, ahmad khalifa wrote:

> just for now, as I don't want the source file to get big, and haven't
> made up my mind on how to load character types, I guess there are
> 2 ways...
> 1- Fribidi's way, which I think is to parse the unicodedata.txt file on
>    initialization.
> 2- a static array holding everything from 0x0 to 0xFFFF, which is
>    a total of 0x10000 (65536) values, each a 2-byte value, and that
>    comes to 65536*2 = 131072 bytes...

FriBidi indeed uses the second way, but compresses the table.
It supports the full Unicode 4+ dataset (0x110000 entries) while
consuming as little as 2 KB when optimized for space.
Optimized for speed, it compresses them into 20 KB.

You mean FriBidi does not load the data on initialization? It's hardcoded in the source..? I didn't know that... I know very little about compression, but 128 KB of memory does not seem like much on today's PCs... maybe I'll look into this sometime... So what compression algorithm does FriBidi use..? Or which do you suggest for this implementation?
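
For reference, one common way to shrink a static character-type table is a two-stage lookup: split the code space into fixed-size blocks, store each distinct block only once, and keep a small index from block number to stored block. The sketch below illustrates that idea only. It is not taken from the FriBidi source, and the names (toy_type, build_tables, lookup_type), the 256-entry block size, and the three-value type set are all assumptions made for illustration. The data is a toy stand-in, not real Unicode data, and the tables are built at startup just to keep the example short, whereas a real implementation would more likely generate them offline as const arrays.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_BITS     8
#define BLOCK_SIZE     (1u << BLOCK_BITS)
#define NUM_CODEPOINTS 0x10000u                 /* 0x0..0xFFFF, as proposed above */
#define NUM_BLOCKS     (NUM_CODEPOINTS >> BLOCK_BITS)

enum { TYPE_L = 0, TYPE_R = 1, TYPE_AL = 2 };   /* toy subset of bidi types */

/* Hypothetical stand-in for the real per-character data. */
static uint8_t toy_type(uint32_t c)
{
    if (c >= 0x0590 && c <= 0x05FF) return TYPE_R;   /* Hebrew range */
    if (c >= 0x0600 && c <= 0x06FF) return TYPE_AL;  /* Arabic range */
    return TYPE_L;
}

/* stage1 maps a block number to the index of its (deduplicated) block. */
static uint8_t  stage1[NUM_BLOCKS];
/* Sized for the worst case here; only the first num_unique rows are used. */
static uint8_t  stage2[NUM_BLOCKS][BLOCK_SIZE];
static unsigned num_unique;

static void build_tables(void)
{
    uint8_t block[BLOCK_SIZE];
    unsigned b, i, u;

    num_unique = 0;
    for (b = 0; b < NUM_BLOCKS; b++) {
        /* Fill in the types for this block of 256 code points. */
        for (i = 0; i < BLOCK_SIZE; i++)
            block[i] = toy_type((b << BLOCK_BITS) | i);

        /* Reuse an identical block if one has already been stored. */
        for (u = 0; u < num_unique; u++)
            if (memcmp(stage2[u], block, BLOCK_SIZE) == 0)
                break;
        if (u == num_unique) {
            memcpy(stage2[num_unique], block, BLOCK_SIZE);
            num_unique++;
        }
        stage1[b] = (uint8_t)u;
    }
}

/* Two array reads per lookup; out-of-range input falls back to TYPE_L. */
static uint8_t lookup_type(uint32_t c)
{
    if (c >= NUM_CODEPOINTS)
        return TYPE_L;
    return stage2[stage1[c >> BLOCK_BITS]][c & (BLOCK_SIZE - 1)];
}
```

After calling build_tables() once, lookup_type(0x05D0) gives TYPE_R and lookup_type('A') gives TYPE_L. With this toy data only three distinct blocks exist, so a generator that emitted just those would need 256 bytes of index plus 3*256 bytes of block data, instead of one entry per code point. How well real Unicode data compresses depends on the block size chosen.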

> any other suggestions..? I'm leaning towards the second method,
> as the reason to implement this anyway is so it can easily be
> plugged into PuTTY, without any external files needed....
>
> >that, well, the code pretty simply implements the bidi algorithm,
> >ignoring the details and hard parts ;).  But it's a good starting
> >point if you try the tests on it.
> >About tests, there are a bunch of them in the FriBidi source
> >code, with expected output, but be aware that the CapRTL charset is
> >an invented character set solely for testing purposes...
>
> the minor details, i should hopefully get to those soon...
> if you noticed any, please point me to them..

There are rules you have not implemented yet.

If you're referring to the first ones, P1 and P2, these are not needed in PuTTY, as you said below...

> hard parts like what..?

Well, I can only tell you just by quoting the fribidi source ;).
There is a bound on the maximum level one can get, which is 61,
and handling that thing is rather tricky.

PuTTY is a terminal emulator, which mainly uses an 80*24 screen, so lines won't grow past 80 chars (that can change on resize), but it would be close to impossible for a line to have more than 61 embedding levels in an 80-char line...
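
Even when deep nesting is unlikely, the bound is cheap to guard against, so a stray run of embedding codes cannot overflow a fixed-size stack. The sketch below is a simplified illustration and not the FriBidi code: the names (LevelState, level_init, level_push, level_pop) are hypothetical, and the single overflow counter is an assumed simplification, so the exact bookkeeping the algorithm requires for ignored codes may differ.

```c
#define MAX_LEVEL 61   /* the bound on embedding levels mentioned above */

typedef struct {
    int stack[MAX_LEVEL + 1]; /* saved levels, one per accepted embedding */
    int depth;                /* number of saved levels */
    int level;                /* current embedding level */
    int overflow;             /* embeddings ignored because of the bound */
} LevelState;

/* Least odd level greater than `level` (right-to-left embedding). */
static int next_odd_level(int level)  { return (level + 1) | 1; }

/* Least even level greater than `level` (left-to-right embedding). */
static int next_even_level(int level) { return (level + 2) & ~1; }

static void level_init(LevelState *s, int base_level)
{
    s->depth = 0;
    s->level = base_level;
    s->overflow = 0;
}

/* Handle an embedding or override code; rtl is nonzero for RLE/RLO. */
static void level_push(LevelState *s, int rtl)
{
    int new_level = rtl ? next_odd_level(s->level)
                        : next_even_level(s->level);

    if (new_level <= MAX_LEVEL && s->overflow == 0) {
        s->stack[s->depth++] = s->level;
        s->level = new_level;
    } else {
        /* Assumed simplification: count the ignored code so that its
           matching terminator can be ignored as well. */
        s->overflow++;
    }
}

/* Handle a PDF (pop directional formatting) code. */
static void level_pop(LevelState *s)
{
    if (s->overflow > 0)
        s->overflow--;
    else if (s->depth > 0)
        s->level = s->stack[--s->depth];
}
```

Since every accepted push raises the level by at least one, the stack can never hold more than MAX_LEVEL entries, so no bounds check on depth is needed in level_push.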

Other than that, not-yet-implemented parts, and simple bugs you
may have, I can't see any other big flaws, as you have simply
implemented the algorithm with any optimization.

Do you mean without any optimization? As that's the case...


Well, there's this issue of you not distinguishing lines from
paragraphs (fribidi does not do that at the moment either), but for
PuTTY that does not really make much sense.

correct...


ak.
