Re: BIDI tags Unicode values

On Fri, 30 Aug 2002 19:09:47 -0500
Mohammed Elzubeir wrote:

> I simply wanted to clarify my understanding (and everyone else's ;)

I think it's clear enough now.

> The correct way is to use a bidi implementation (like fribidi).. if
> using Pango (if I'm not mistaken), the relevant parts of fribidi are
> incorporated into it.

I have not studied the GTK and Pango APIs carefully, so I'm not sure
how to solve this; you are the experts here.

> What you are talking about here is what UAX#9 calls "Implicit
> Directional Marks" (RLM, LRM). To quote UAX#9:
>   "There is no special mention of the implicit directional marks in
>   the following algorithm. That is because their effect on
>   bidirectional ordering is exactly the same as a corresponding
>   strong directional character; the only difference is that they do
>   not appear in the display."

And what does "strong directional character" mean?
Does it mean that the editor should guess the direction according to
the first character, or according to the dominant language in the
paragraph? That would make sense, I suppose.
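
A minimal sketch of the "first character" reading, in Python. It
assumes that "strong" means the Unicode bidi classes L, R and AL, and
that the paragraph direction follows the first such character, which
is what rules P2 and P3 of UAX#9 describe. The function name and the
default direction are hypothetical, and embeddings are ignored; a real
implementation such as fribidi does far more than this.

```python
import unicodedata

# Bidi classes treated as "strong" in this sketch (an assumption based on UAX#9).
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, default="ltr"):
    """Guess a paragraph's base direction from its first strong character.

    Hypothetical helper: returns "ltr" or "rtl", or `default` when the
    text contains no strong character at all (digits and spaces are weak
    or neutral, so they are skipped).
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    return default
```

Under this reading, a paragraph that starts with digits only would
fall back to the default, while the same paragraph with an RLM
(U+200F) in front of it would come out as "rtl", since RLM carries the
strong class R even though it is never displayed.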

However, it seems the Pango API supports bidi control characters; just
try right-clicking in the Katoob text box.

> In other words, if you are using a bidi algorithm, they serve no
> purpose. This would be fine most of the time, but some editors
> _will_ display those supposedly non-printable characters. For
> example, VIM will show their codes (of course that is because
> Nadim's patch never acknowledges their existence). But that is
> something you would expect from many other editors. Another example
> is that 'gettext' chokes on it, and flags the PO file with syntax
> errors. That is why we stopped translators from using Yudit
> (pre-KBabel days).
> So, for the sake of transferability, it is not a desirable feature.

Well, the solution then is to have a way to save without these
sometimes problematic characters, and Katoob has just that feature:
you can choose to save in UTF-8 encoding (which will include the
control characters) or in text-only UTF-8, which will not save the
control characters.
I think this is a reasonable solution to the problem.
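
A rough sketch of what such a "text only" save could do, in Python. It
assumes the characters in question are the UAX#9 marks and the
embedding/override controls (LRM, RLM, LRE, RLE, PDF, LRO, RLO). This
is not Katoob's actual code, and the function name is hypothetical.

```python
# Bidi control characters assumed to be the "problematic" ones.
BIDI_CONTROLS = {
    "\u200e",  # LRM, left-to-right mark
    "\u200f",  # RLM, right-to-left mark
    "\u202a",  # LRE, left-to-right embedding
    "\u202b",  # RLE, right-to-left embedding
    "\u202c",  # PDF, pop directional formatting
    "\u202d",  # LRO, left-to-right override
    "\u202e",  # RLO, right-to-left override
}


def strip_bidi_controls(text):
    """Return `text` with every character in BIDI_CONTROLS removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def save_text_only(path, text):
    """Write `text` as UTF-8 after dropping the bidi control characters."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(strip_bidi_controls(text))
```

The obvious cost is that anything the user expressed with those
characters is lost in the saved file, so the text may display in a
different order when it is reopened.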

What remains is to decide whether we need to alter the Katoob
interface to make the characters easily accessible and editable, or
whether they are to be considered harmful and are thus best left
hidden where only those who really need them can find them.

To me, being able to choose the direction of the paragraph and the
order of words regardless of the direction of display is a fundamental
feature. I still don't see how it is not needed; maybe I'm missing

> And it probably shouldn't be (except for special cases). I would get
> familiar with the UAX#9 (http://www.unicode.org/reports/tr9/) before
> getting any more involved in this ;)

Will do.

> > anyway the purpose of my post was to encourage feedback so keep it
> > coming.
> Careful what you wish for ;) 

It's not a problem for me; it's Mohamed we should worry about.

Sorry about the SPAM thing.

-- 
Perilous to all of us are the devices of an art deeper than we
                -- Gandalf the Grey [J.R.R. Tolkien, "Lord of the