Re: yudit + bidi



On Wed, 29 May 2002, Nadim Shaikli wrote:
[...]
>  1. could you please add a '-noshape' option to uniprint to disable
>     its current shaping (shaping without Bidi does more harm than good).
>     With the '-noshape' option we can use uniprint through the pipe'ing
>     method to get reasonable results in processing arabic.

I would prefer to implement a conformant bidi in Yudit (without
sacrificing reversible algorithms, of course). As a hack it would be
possible to add such an option... As an interim solution you just
need to remove <prefix>/share/yudit/data/shape.mys.

>  2. Yudit's need for the algorithm to be reversible is there
>     for sanity-checks or is it integrated into its core somehow ?
>     In other words, what are its benefits (sans the potential
>     security concern you note on yudit.org's website).  I'm just
>     trying to make a case for why people should consider it.

The sanity checks are not there yet, but they are planned.
One kind of sanity check will consist of comparing the view
with the file without saving:

 file->decoding->bidi reordering->shaping->combining->view

 file<-encoding<-bidi ordering<-reshaping<-combining<-view
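
A minimal sketch of that round trip, in Python rather than Yudit's
own code. Every stage here is a hypothetical toy stand-in (flip,
drop_marks and the stage lists are made up for illustration); the
only point is that each forward step is paired with an inverse and
that the inverses run in the opposite order:

```python
# Toy (forward, inverse) stage pairs; none of this is Yudit's real code.
def decode(data):
    return data.decode("utf-8")

def encode(text):
    return text.encode("utf-8")

def flip(text):
    # Stand-in for a reversible reordering: it is its own inverse.
    return text[::-1]

def drop_marks(text):
    # Stand-in for a lossy step: removes U+200F RIGHT-TO-LEFT MARK.
    return text.replace("\u200f", "")

def keep(text):
    # The best "inverse" a lossy step can offer: nothing to restore.
    return text

REVERSIBLE = [(decode, encode), (flip, flip)]
LOSSY = [(decode, encode), (drop_marks, keep), (flip, flip)]

def to_view(data, stages):
    # file -> view: apply every forward step in order.
    for forward, _ in stages:
        data = forward(data)
    return data

def to_file(view, stages):
    # view -> file: apply every inverse step in reverse order.
    for _, inverse in reversed(stages):
        view = inverse(view)
    return view

def round_trips(data, stages):
    # The sanity check: does the view map back to the same file bytes?
    return to_file(to_view(data, stages), stages) == data

sample = "ab\u200fcd".encode("utf-8")
print(round_trips(sample, REVERSIBLE))  # True
print(round_trips(sample, LOSSY))       # False
```

One lossy step anywhere in the chain is enough to make the
comparison fail, and that failure is the warning the check exists
to give.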

In Yudit bidi reordering is the odd man out: if I implement
the Unicode bidi as is, it will not be reversible. It was not
easy to make the other things reversible either. Every time the
Unicode Standard introduced some algorithm, it was
non-reversible. I don't want to give up on reversibility just for
one single algorithm.

If we could make a reverse algorithm, we could also
auto-test the BiDi program during operation. I don't
think testing a complicated algorithm like Unicode BiDi can
be done with just a small set of test cases. Even a small
set of test cases reveals that the current implementations
of conformant BiDi programs all differ.
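
To illustrate the auto-testing idea, here is a rough Python sketch
that feeds random strings through a forward/reverse pair and
reports the first input that does not survive the round trip. The
reordering used (toy_reorder, which reverses each run of
right-to-left characters) is an assumption made up for this
example; it is neither the Unicode BiDi algorithm nor what Yudit
does:

```python
import random
import unicodedata

def is_rtl(ch):
    # Strong right-to-left classes in the Unicode character database.
    return unicodedata.bidirectional(ch) in ("R", "AL")

def toy_reorder(text):
    # Reverse each maximal run of RTL characters, leave the rest alone.
    out, run = [], []
    for ch in text:
        if is_rtl(ch):
            run.append(ch)
        else:
            out.extend(reversed(run))
            run = []
            out.append(ch)
    out.extend(reversed(run))
    return "".join(out)

# Runs keep their positions, so applying the toy twice restores the input.
toy_unreorder = toy_reorder

def self_test(rounds=1000, seed=0):
    # Return the first random string that fails the round trip, or None.
    rng = random.Random(seed)
    alphabet = "abc \u05d0\u05d1\u05d2"  # Latin, space, Hebrew letters
    for _ in range(rounds):
        length = rng.randint(0, 20)
        s = "".join(rng.choice(alphabet) for _ in range(length))
        if toy_unreorder(toy_reorder(s)) != s:
            return s
    return None

print(self_test())  # None
```

The same loop could in principle run on live text during editing
instead of random strings, which is closer to testing during
operation.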

I have tried to explain why Yudit does not use any of the
algorithms given by Unicode and works purely on the
view instead of a back-store buffer.

So how should I proceed?

I could easily add a flag to each line about the initial
directionality, and add a mark to each combining cluster
to help reordering.
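
One possible shape for such a line record, sketched in Python. The
names (Line, split_clusters, visual) and the layout are only
assumptions for illustration, and the cluster rule (a base
character plus any following characters with a non-zero combining
class) is much simpler than real grapheme clustering:

```python
import unicodedata
from dataclasses import dataclass

@dataclass
class Line:
    rtl: bool        # flag: initial directionality of the line
    clusters: list   # each entry: base character + its combining marks

def split_clusters(text):
    # Attach every combining mark to the character before it.
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def make_line(text, rtl):
    return Line(rtl=rtl, clusters=split_clusters(text))

def visual(line):
    # Toy reordering: reverse whole clusters, never the marks inside one.
    ordered = line.clusters[::-1] if line.rtl else line.clusters
    return "".join(ordered)

line = make_line("e\u0301a", rtl=True)
print(line.clusters)            # ['é', 'a'] (the first entry is e + U+0301)
print(visual(line) == "ae\u0301")  # True
```

Because the reordering moves whole clusters, a combining mark
always stays behind its base, and the rtl flag says which way to
undo the reordering.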

Also, the initial directionality would somehow need to be
passed along when cutting and pasting, for smooth operation.

I don't think we can change Unicode algorithms but there
must be a way to hack them.

As I have mentioned on other mailing lists before, I believe
reverse algorithms can provide some protection against
intentional and unintentional back-doors/rendering errors.

I think nothing is more dangerous than tampering with
the algorithm that projects the bits from the logical
buffer to the screen in non-linear order. And nothing is
easier to sanity check - just apply the reverse algorithm.

Happy hacking :)
Gaspar