Re: [putty]Re: PuTTY Bidi - final points



Nadim Shaikli <shaikli at yahoo dot com> writes:

> --- Owen Dunn <owend at chiark dot greenend dot org dot uk> wrote:

> > This isn't surprising at all.  You're replacing two single-width
> > glyphs in positions n and n+1 with just one single-width glyph in
> > position n, but as far as PuTTY's internals are concerned the next
> > character is still at position n+2.  You need to work out a way of
> > dealing with this, moving the following word so that it starts to
> > display at n+1 even though the terminal emulator should still believe
> > it starts at n+2.
> > 
> > It would be good if you then preserved the word's visual length by
> > sticking some tatweel in to make up the space there.  (Note for
> > non-Arabic-speaking folk: tatweel is a meaningless mark that can be
> > used to stretch Arabic words.)
> 
> Adding tatweel (or anything for that matter that is not part of the
> document one is viewing) is not an option.  What needs to happen is
> the visual buffer needs to be adjusted and shifted by one to account
> for this "absorption" or "combining" - the question was really how
> best to do this from an implementation point of view.

Wouldn't just removing the space break the alignment of the rest of
the line?  If you're viewing tabular data, for example, this would be
very important.  Under your model I have to introduce extra padding in
the actual data (e.g. a file) to make things line up properly.
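
To make the alignment concern concrete, here is a minimal sketch in
Python.  It is not PuTTY code: `render_cells` and its `filler`
parameter are hypothetical names, the line is modelled as a plain list
of one-character cells, and real shaping (choice of isolated or final
form, visual reordering) is ignored.  It only shows where a later
column marker lands when a lam-alif pair collapses into one cell with
and without a filler cell.

```python
LAM, ALEF = "\u0644", "\u0627"
LAM_ALEF = "\ufefb"  # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM


def render_cells(cells, filler=None):
    """Return the displayed cells for a line of logical cells.

    Each LAM followed by ALEF becomes a single ligature cell. If
    `filler` is given, one filler cell is emitted after the ligature
    so the displayed line keeps its original width; otherwise the
    rest of the line shifts by one cell.
    """
    out = []
    i = 0
    while i < len(cells):
        if cells[i] == LAM and i + 1 < len(cells) and cells[i + 1] == ALEF:
            out.append(LAM_ALEF)
            if filler is not None:
                out.append(filler)
            i += 2  # both logical cells are consumed by the ligature
        else:
            out.append(cells[i])
            i += 1
    return out


# A toy "table row": a lam-alif pair, a space, then a column separator.
row = [LAM, ALEF, " ", "|", "4", "2"]

collapsed = render_cells(row)            # separator shifts one cell
padded = render_cells(row, filler=" ")   # separator stays in its column
```

In the collapsed case the separator is displayed one column earlier
than the terminal's internal model says it is, which is exactly what
would break tabular output.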

> NOTE: this is how many other applications (including mlterm) deal with
>       it and it is the prim-n-proper method.  Adding our own "filler"
>       characters is a recipe for disaster and file non-integrity.

Why would it have any effect at all on file integrity?  This would be
purely a visual thing in the terminal emulator front end.

However, if mlterm does it that way -- do other terminal emulators do
it that way too? -- we might have to do it that way ourselves.  Does
lam-alif occupy a single character cell in mlterm?  Does that mean
that in an 80-column terminal window I can have eighty lam-alif
characters on one line?

> > Yes.  PuTTY doesn't currently support Unicode combining characters.
> > See: 
> >
> http://www.chiark.greenend.org.uk/~sgtatham/putty/wishlist/unicode-combining.html
> 
> Ouch !!  I would guess we can simply ask for that "wishlist" item to
> be elevated from a medium priority to maximum (unless someone is
> willing to shed some light on what is needed to make it happen).

You can ask, but I can't guarantee any of us can make it happen soon.

> A quick note, from what I've seen other applications do, the number
> of composing characters that most allow is 2.  I believe that is the
> maximum number that all languages use/require (so I'm unsure of the
> statement I read on the link above that notes, "PuTTY should support
> an arbitrary sequence of diacritics in any character cell").

PuTTY is not other terminal emulators.  Quite a lot of the time we
value `working properly' much more highly than `just about working, as
far as anyone will be likely to notice, probably'.

Two diacritical marks is not a maximum even in just the languages I
know.  For example, in the Qur'an, Arabic can use a base letter,
shadda, a vowel, and a Qur'anic annotation mark.  Greek can require a
base letter and three diacritical marks.
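
As a rough illustration of why a fixed limit of two is too small, the
sketch below groups a string into character cells using Python's
standard `unicodedata` module.  The function name `group_cells` and
the two sample strings are assumptions chosen for the example, not
anything taken from PuTTY; the grouping rule (attach every
nonspacing or enclosing mark to the preceding base) is a
simplification of what a terminal would need.

```python
import unicodedata


def group_cells(text):
    """Split text into cells: one base character plus any number of
    following combining marks (general category Mn or Me)."""
    cells = []
    for ch in text:
        if cells and unicodedata.category(ch) in ("Mn", "Me"):
            cells[-1] += ch  # attach the mark to the current cell
        else:
            cells.append(ch)  # start a new cell with a base character
    return cells


# Greek: alpha + psili + oxia + ypogegrammeni (three marks).
greek = "\u03b1\u0313\u0301\u0345"

# Arabic: beh + shadda + fatha + a Qur'anic annotation mark
# (U+06E2 is used here purely as an example of such a mark).
arabic = "\u0628\u0651\u064e\u06e2"

for sample in (greek, arabic):
    cell = group_cells(sample)[0]
    print(len(cell) - 1, "combining marks in one cell")
```

Both samples end up as a single cell holding three marks, so a
per-cell limit of two would already drop data in these cases.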

> The topic of Bidi still looms large overhead.  Which code should we
> use ?

I think ICU's licence is probably OK, but you will need to get an OK
from Simon for that (and find out whether we're happy to embed ICU in
PuTTY).
The licence looks like MIT:

http://oss.software.ibm.com/cvs/icu/~checkout~/icu/license.html

(S)