
Re: BIDI tags Unicode values



On Fri, Aug 30, 2002 at 11:41:18PM +0300, Alaa The Great wrote:

> I know it isn't exactly a new feature but the possibilities it adds
> are worth thinking about.
>  

<nod>

> yes you are correct, in fact it is not yet possible to edit it in
> katoob, it is however a common way of making Arabic webpages (I just
> discovered that we are required to use it at my college's student
> projects for instance) so its a good idea to have it.
> so the sensible thing to do is always keep a normally encoded version
> and export the HTML style one when needed.

I simply wanted to clarify my understanding (and everyone else's ;)

> > Oh no! I implore you not to go that route. This is how Yudit does
> > bidi support, which is the most primitive type of support possible.
> > This is discouraged by the Unicode folks.. and by Arabeyes I might
> > add ;)
> 
> I didn't know this, it was my request really, why is this considered
> a bad idea??
> I mean what is the proper solution for a document that has both RTL
> and LTR paragraphs??

The correct way is to use a bidi implementation (like fribidi). If you are
using Pango, the relevant parts of fribidi are (if I'm not mistaken)
incorporated into it.

What you are talking about here is what UAX#9 calls "Implicit Directional Marks"
(RLM, LRM). To quote UAX#9,

  "There is no special mention of the implicit directional marks in the following
   algorithm. That is because their effect on bidirectional ordering is exactly
   the same as a corresponding strong directional character; the only difference
   is that they do not appear in the display."

In other words, if you are using a bidi algorithm, they serve no purpose.
This would be fine most of the time, but some editors _will_ display those
supposedly non-printable characters. For example, VIM will show their codes (of
course, that is because Nadim's patch never acknowledges their existence), and
you should expect the same from many other editors. Another example is
'gettext', which chokes on them and flags the PO file with syntax errors. That
is why we stopped translators from using Yudit (pre-KBabel days).
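
To make the UAX#9 quote concrete, here is a minimal sketch using Python's
standard unicodedata module (my choice for illustration, nothing katoob or
fribidi actually uses). It only looks up the character properties; describe()
is a helper name I made up. The marks carry a strong bidi class (L or R) but
are format characters (category Cf), which is why they should not be drawn.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def describe(ch):
    """Return (name, bidi class, general category) for one character.

    Hypothetical helper, for illustration only.
    """
    return (
        unicodedata.name(ch),
        unicodedata.bidirectional(ch),
        unicodedata.category(ch),
    )


lrm_info = describe(LRM)        # strong L, category Cf (format, invisible)
rlm_info = describe(RLM)        # strong R, category Cf (format, invisible)
alef_info = describe("\u0627")  # an ordinary Arabic letter, for comparison

for info in (lrm_info, rlm_info, alef_info):
    print(info)
```

An editor that does not treat Cf characters specially has nothing sensible to
draw for the first two, so it falls back to showing their codes.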

So, for the sake of transferability, it is not a desirable feature.
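
If a file already contains these marks and has to go through a tool that does
not cope with them, one option is to locate or strip them first. The following
is only a sketch under my own assumptions: the function names and the sample
string are made up, and it handles just LRM and RLM, not the explicit embedding
or override codes. Keep in mind that stripping can change how neighbouring
neutral characters (punctuation, spaces) get ordered, so check the result.

```python
# The two implicit directional marks discussed above.
IMPLICIT_MARKS = {"\u200e", "\u200f"}  # LRM, RLM


def find_implicit_marks(text):
    """Return (index, code point) pairs for every LRM/RLM in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in IMPLICIT_MARKS
    ]


def strip_implicit_marks(text):
    """Return text with every LRM/RLM removed; other characters untouched."""
    return "".join(ch for ch in text if ch not in IMPLICIT_MARKS)


# Hypothetical sample: Latin text, an RLM, a space, then an Arabic letter.
sample = "abc\u200f \u0627"
found = find_implicit_marks(sample)
cleaned = strip_implicit_marks(sample)

print(found)
print(len(sample), len(cleaned))
```

Keeping the "find" step separate lets you report where the marks are before
deciding whether removing them is safe for that particular string.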

> 
> so far katoob can work with only one direction at a time and the
> direction is not hard coded at all in the document.
> 

And it probably shouldn't be (except for special cases). I would get familiar
with UAX#9 (http://www.unicode.org/reports/tr9/) before getting any more
involved in this ;)

> anyway the purpose of my post was to encourage feedback so keep it
> coming.
> 

Careful what you wish for ;) 

later
-- 
-------------------------------------------------------
| Mohammed Elzubeir    | Visit us at:                 |
|                      |  http://www.arabeyes.org/    |
| Arabeyes Project     | Homepage:                    |
| Unix the 'right' way |  http://fakkir.net/~elzubeir/|
-------------------------------------------------------