
Re: yudit + bidi



On Thu, 30 May 2002 17:21:14 +0900 (JST),
  "Gaspar Sinai" <gsinai yudit org> wrote:
>
> On Wed, 29 May 2002, Nadim Shaikli wrote:
>
[snip snip]
>
> >  2. Yudit's need for the algorithm to be reversible is there
> >     for sanity-checks or is it integrated into its core somehow ?
> >     In other words, what are its benefits (sans the potential
> >     security concern you note on yudit.org's website).  I'm just
> >     trying to make a case for why people should consider it.
> 
> The sanity checks are not there but they are planned.
> One sort of a sanity check will consist of comparing the view
> with the file without saving:
> 
>  file->decoding->bidi reordering->shaping->combining->view
> 
>  file<-encoding<-bidi ordering<-reshaping<-combining<-view
> 
> In Yudit bidi reordering is the odd-man-out: if I implement
> the Unicode bidi as is, it will not be reversible. It was not
> easy to make other things reversible either. Every time the
> Unicode Standard introduced some algorithm it was
> non-reversible. I don't want to give up on it just for
> one single algorithm.

If I may rephrase - Yudit likes to implement reversible algorithms
and Unicode's Bidi isn't reversible so Yudit will not go down that
route unless it _absolutely has to_, right ? :-)

And that is all fair and even-handed.  Yudit is your baby and it's
obviously your call.  Now back to this reversibility thing :-)

> If we could make a reverse algorithm, we could also
> auto-test the BiDi program during operation. I don't
> think testing a complicated algorithm like Unicode BiDi can
> be done with just a small set of test-cases. Even a small
> set of test cases reveals that the current implementations
> of conformant BiDi programs all differ.

I don't think anyone questions that it would be a neat/useful
option to have; it's now more a question of whether it's doable.
And if it's not doable, then what...

> I tried to explain the reasons why Yudit is not using any
> algorithms given by Unicode and it purely works on the
> view instead of a back-store buffer.
> 
> So how should I proceed?
> 
> I could easily add a flag to each line about the initial
> directionality, and add a mark to each combining cluster
> to help reordering.
> 
> Also the initial directionality somehow would need to be
> passed when cutting and pasting for smooth operation.
> 
> I don't think we can change Unicode algorithms but there
> must be a way to hack them.

Well, if this is all within Yudit then any solution you come up
with might work (i.e. passing variables, etc.), but one thing to
keep in mind is that Yudit should play well with the other kids
on the playground :-)  In other words, if I cut-n-paste from my
Arabic-enabled terminal emulator, Yudit should accept the text
and display it correctly.  Another thing to note is that what's
in the cut-buffer should not really be visual characters to begin
with, so this problem reduces to the initial stored-file issue.

> As I mentioned in other mailing lists before, I believe
> reverse algorithms can provide some protection against
> intentional and unintentional back-doors/rendering errors.
> 
> I think nothing is more dangerous than tampering with
> the algorithm that projects you the bits from logical
> buffer to the screen in non-linear order. And nothing is
> easier to sanity check - just apply the reverse algorithm.

Well, I don't want to get into the reasons, etc.  I'm looking
to "convince" you of the need to include "proper" bidi into
Yudit.  So here's a bit of talk about this reversibility issue.

After many hours, I'm now convinced that bidi is indeed not
reversible as it stands now (all the heuristics I was able to
come up with were not fool-proof -- some worked better than
others, but all were breakable).

The issue really amounts to this.  Assume f() is a transformation
function (bidi in this case); assume A, B to be data points, then

  f(A) = X
  f(B) = Y

As long as X and Y differ whenever A and B differ, one can reverse
f().  But there are cases (and they are rather easy to generate too)
where A != B and yet X == Y  -- ouch !!!

Whether this is a security breach is debatable and I, again :-),
don't want to go there since there is context involved.

So now what ?  Well, I could think of one solution - include hidden
visual directives (those hidden directives will NOT show up in the
file and would simply serve to note the initial character start per
line).  Since this hidden visual char is not part of the Unicode
standard, Yudit should implement it internally and it should not
affect any of its external interfaces (cut-n-paste should skip those
hidden chars, they should not be stored, etc.) until such time as the
Unicode folk accept such an idea (if ever).  This would serve Yudit's
purposes in hacking reversibility.

In short - bidi is not reversible, a hack could be worked out
which could/would involve a visual hidden character (not transferable
and not to be stored).

For those not following what's being said - here's a statement
of the problem.  The reversibility of Bidi involves taking a visual
representation (i.e. what's on screen) and, without knowing anything
about how it came about, regenerating the contents of the file on
disk.

Visual example (capitals are Arabic letters),

## Sample file contents
MILK & cookies
fish & CHIPS

I run fribidi; the display output I'll get,

+---------------screen width------------------+
                                 cookies & KLIM
fish & SPIHC
+---------------------------------------------+

if I note fribidi's l2v (logical-to-visual) and
v2l (visual-to-logical) info using fribidi's
"--verbose" flag, I see this,

for the 'cookies' line it's -
  l2v: 13 12 11 10 9 8 7 0 1 2 3 4 5 6
  v2l: 7 8 9 10 11 12 13 6 5 4 3 2 1 0

for the 'fish' line it's -
  l2v: 0 1 2 3 4 5 6 11 10 9 8 7 
  v2l: 0 1 2 3 4 5 6 11 10 9 8 7 

So the "reversibility" question then becomes "how do I revert
what is shown on screen to what is inside the stored file".
That would be simple if the v2l map were available, but in a truly
reversible system, that info would not be around.
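
To show how easy the inversion is when v2l is on hand, here is a
minimal Python sketch.  It is not fribidi's code, and the function
name visual_to_logical is made up for this example.  It assumes that
v2l[i] holds the logical index of the character shown at visual
position i, which is consistent with the numbers listed above.

```python
def visual_to_logical(visual, v2l):
    """Rebuild the stored (logical) line from the displayed line.

    Assumption: v2l[i] is the logical index of the character that
    appears at visual position i.
    """
    logical = [""] * len(visual)
    for visual_pos, logical_pos in enumerate(v2l):
        logical[logical_pos] = visual[visual_pos]
    return "".join(logical)


# The two display lines and v2l maps quoted from fribidi above.
cookies_visual = "cookies & KLIM"
cookies_v2l = [7, 8, 9, 10, 11, 12, 13, 6, 5, 4, 3, 2, 1, 0]

fish_visual = "fish & SPIHC"
fish_v2l = [0, 1, 2, 3, 4, 5, 6, 11, 10, 9, 8, 7]

print(visual_to_logical(cookies_visual, cookies_v2l))  # MILK & cookies
print(visual_to_logical(fish_visual, fish_v2l))        # fish & CHIPS
```

The catch is that this needs the v2l list as an input, and that list
is exactly what a viewer holding only the screen contents lacks.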

Simple, right ? Not quite - consider this scenario.

## Sample file contents (note equal spaces on both ends)
             hello there FISH MILK
             FISH MILK hello there

again run fribidi; the display output you'll get,

+---------------screen width------------------+
             hello there KLIM HSIF
             hello there KLIM HSIF
+---------------------------------------------+

:-) and therein lies the problem.  So now with no info on the
file and no v2l data - how do I figure out whether the line started
with an English word or with an Arabic one ?
The only way I could figure is to add a visual hidden character
next to the first character of the line.  So I would end up with

+---------------screen width------------------+
            ~hello there KLIM HSIF
             hello there KLIM HSIF~
+---------------------------------------------+

where the '~' is that hidden character (again ONLY in visual mode).
I don't think this visual hidden char should be used outside of
Yudit unless it becomes part of the Unicode Bidi algorithm.  This
guarantees interoperability between Yudit and other applications
while fulfilling Yudit's reversibility requirement.  As for what
happens when someone pastes something into Yudit, I'm guessing that
pasted characters are captured in the order the user moves over
the characters (i.e. right-to-left vs. left-to-right) and as such
Yudit could insert the visually hidden char prior to the first paste
if it were on a new line (or else deal with it when it runs Bidi
on the buffer, which happens with every change).  Granted it's easier
said than done, but I think you get the general idea.

I don't see the need for any other info (like for combining
clusters, etc).  Given the info noted above, you can run a forced
directional bidi on the line to get its correct initial state.
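
A rough Python sketch of the idea, using a toy model and NOT the real
Unicode Bidi algorithm or fribidi: uppercase stands for Arabic (RTL),
lowercase for Latin (LTR), everything else is neutral, and there are no
digits, embeddings or overrides.  All names (toy_bidi, to_view,
from_view, MARK) are invented for this illustration, and placing the
'~' at the left or right edge of the visible text is my assumption
about how the marker could encode the initial direction.

```python
MARK = "~"  # hypothetical hidden marker, present only in the view


def char_dir(ch):
    """Toy classes: uppercase = RTL letter, lowercase = LTR letter."""
    if ch.isalpha():
        return "R" if ch.isupper() else "L"
    return "N"


def toy_bidi(text, base=None):
    """Reorder text with a simplified bidi; base forces the direction.

    Returns (reordered_text, base_direction_used).
    """
    types = [char_dir(c) for c in text]
    if base is None:
        # Base direction comes from the first strong character.
        base = next((t for t in types if t != "N"), "L")
    n = len(text)
    resolved = types[:]
    i = 0
    while i < n:
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < n and types[j] == "N":
            j += 1
        # Neutrals take the direction of matching neighbours, else base.
        before = types[i - 1] if i > 0 else base
        after = types[j] if j < n else base
        resolved[i:j] = [before if before == after else base] * (j - i)
        i = j
    if base == "L":
        levels = [0 if d == "L" else 1 for d in resolved]
    else:
        levels = [1 if d == "R" else 2 for d in resolved]
    order = list(range(n))
    # Reverse runs at each level, highest level first.
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < n:
            if levels[order[i]] < level:
                i += 1
                continue
            j = i
            while j < n and levels[order[j]] >= level:
                j += 1
            order[i:j] = order[i:j][::-1]
            i = j
    return "".join(text[k] for k in order), base


def to_view(logical):
    """Logical -> visual, with the marker on the side the line starts."""
    visual, base = toy_bidi(logical)
    if base == "L":
        pos = len(visual) - len(visual.lstrip(" "))
    else:
        pos = len(visual.rstrip(" "))
    return visual[:pos] + MARK + visual[pos:]


def from_view(view):
    """Visual -> logical: read the marker, then run a forced-direction pass."""
    pos = view.index(MARK)
    base = "L" if view[:pos].strip(" ") == "" else "R"
    logical, _ = toy_bidi(view[:pos] + view[pos + 1:], base)
    return logical


a = "   hello there FISH MILK   "
b = "   FISH MILK hello there   "
print(toy_bidi(a)[0] == toy_bidi(b)[0])  # True: same view, different files
print(from_view(to_view(a)) == a, from_view(to_view(b)) == b)  # True True
```

In this toy model the forced-direction pass over the view gives back
the stored line for both samples; whether that carries over to the
full algorithm is exactly what would need checking.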

That's my $0.02's worth.

Sorry for the long email - thoughts ?

Regards,

 - Nadim

