Re: Unicode Bidi Proposal

To: general at arabeyes dot org, Tzafrir Cohen <tzafrir at technion dot ac dot il>
Subject: Re: Unicode Bidi Proposal
From: Chahine Hamila <mch at chaham dot com>
Date: Thu, 02 Aug 2001 18:46:46 +0200

Tzafrir Cohen wrote:

> mirroring will give you an approximation of the bidi algorithm.

More or less. The point is to make them coincide for "simple texts" (as
defined in the proposal), which make up the overwhelming majority of texts.

> A better
> approximation will be to try to do with the linux console something
> similar to the existing bidi patch of xterm (which works rather fine for
> hebrew).

I don't know the bidi patch to xterm. I do know that terminals are certainly
NOT suited to bidi treatment, and for a very good reason: I have been working
on acon/Akka for a while and have tried quite a few things with it, bidi
implementations included, even using fribidi at one point. The Unicode
recommendation is based on word-level treatment, not character-level
treatment, and any device/viewport/display/etc. that is less capable than a
word processor won't work. There too, line breaks are a real problem: they
make texts untransferable and dependent on terminal width. The best one can
do with a terminal is what I suggested in the revised proposal (the one that
takes your criticisms into account), really. Granted, it's not perfect, but
you can't do better. (Want a mathematical proof? ;) If the answer is yes,
you'll have to wait because I'm rather busy now, but think of a word that
falls on a screen boundary and things like that.)
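
To make the line-break point a bit more concrete, here is a rough sketch in
Python. It is only an illustration under simplifying assumptions, not the
Unicode algorithm and not what acon/Akka does: uppercase ASCII words stand in
for right-to-left text, the whole text runs in one direction, and the helper
name to_visual_lines is made up for this example.

```python
import textwrap
from typing import List


def to_visual_lines(logical: str, width: int) -> List[str]:
    """Wrap pure right-to-left text given in reading order, then reverse
    each wrapped line so it reads correctly on a left-to-right display.

    Assumption: one character per cell, no combining marks, no mixed
    direction. This is a toy stand-in for terminal-side reordering.
    """
    return [line[::-1] for line in textwrap.wrap(logical, width)]


# Uppercase words are placeholders for right-to-left words.
logical = "ONE TWO THREE FOUR"

narrow = to_visual_lines(logical, 10)  # laid out for a 10-column terminal
wide = to_visual_lines(logical, 20)    # laid out for a 20-column terminal

# Text reordered for the narrow terminal, then copied as-is to the wide one.
narrow_copied_to_wide = " ".join(narrow)
```

Here the narrow layout comes out as "OWT ENO" and "RUOF EERHT", while the
wide layout is the single line "RUOF EERHT OWT ENO". Joining the narrow lines
gives "OWT ENO RUOF EERHT", which puts the words in the wrong order on the
wide terminal. So text reordered for one width does not survive being moved
to another, and that is before any word is split across a screen boundary.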

> There is no need to change _all_ the software: just one basic
> level of libraries has to be adjusted. The good thing about those
> adjustments is that their overhead is quite negligible in the case of
> all-LTR text, and thus they can easily be added to the core library.

Well, with the approach I suggest, there isn't any change in any library
whatsoever. This is why I think it's important: all it takes is a driver that
mirrors the display and implements an automatic "BC" mode. It's pretty
trivial and almost done in Akka. Then you are able to use *every* existing
piece of English software for "simple texts". For more complex texts, i.e.
those which contain at least two long (>=2 words) sentences in both
directions, Unicode-like treatment *is* necessary, and you have to use
high-level software able to deal with it. The point is, such texts are
comparatively far fewer (<<) than "simple texts", and we would therefore
1) save much time by making all existing English software directly usable and
2) benefit from all existing software developed for English that doesn't
contain bidi capabilities.
The only drawback is that the *current* Unicode spec is not compatible with
this approach, whereas a very small change would make it compatible without
removing any of its complex capabilities, and without adding any overhead or
complexity in either implementation or use. IOW, you can use all existing
English software for most Arabic texts without waiting, and leave the little
(mixed text) you can't handle to more complex application layers when they
are developed, without flushing all previously written text down the toilet,
*if* the small change in the Unicode spec is made.

> Keep in mind to avoid creating visually-laid documents, because there is
> no such thing as a document that is only read, and processing visual text
> requires a change of the software.

I think there's a misunderstanding here. What I am suggesting now is "visual"
in the sense that English text is stored in "visual" order because it is to
be displayed on a left-to-right display. If English were to be displayed on a
right-to-left display, your point about line breaks would be valid for
English too. When I advocate storing Arabic text in "visual" order, it is
with the assumption that it will be displayed on a right-to-left display
(hence, the only requirement is the inversion of the X coordinate).
I hope this clears things up enough for you to grok the revised proposal. If
some points are unclear, point them out and I'll detail them, because it's
important.
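
For what it's worth, the "inversion of the X coordinate" can be sketched in a
few lines of Python. This is a toy model and not Akka's driver code: the
names mirror_column and mirror_row are invented for the example, and it
assumes one character per cell with no combining marks or wide glyphs.

```python
def mirror_column(x: int, width: int) -> int:
    """Map a logical column to its physical column on a mirrored display."""
    return width - 1 - x


def mirror_row(row: str, width: int) -> str:
    """Return one screen row as it would appear with the X axis inverted.

    The row is clipped or padded to the display width first, so that the
    character written at logical column 0 lands at the right-hand edge.
    Assumption: one character per cell.
    """
    padded = row[:width].ljust(width)
    return padded[::-1]


# Uppercase letters are placeholders for right-to-left characters, written
# by an unmodified program in plain reading order.
shown = mirror_row("ABC", 5)
```

The program writing "ABC" knows nothing about direction; only the display
side changes. With a width of 5, shown is two spaces followed by "CBA", i.e.
the text starts at the right edge and runs leftwards, and applying
mirror_row a second time gives back the padded original row.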

<3B695F98 dot 5D8F3482 at chaham dot com>
