
Re: PuTTY & arabic



On Tue, 4 Jun 2002 09:16:25 -0500,
  Simon Tatham <anakin pobox com> wrote:
> 
> > On Mon, Jun 03, 2002 at 11:17:01PM -0700, Kamal Dalal wrote:
> >>
> >> I am looking into patching PuTTY, and make it shape arabic letters
> >> correctly. From what I see (using a Win2k Pro), a simple command
> >> like "cat <utf-8 file>" displays all disconnected arabic letters.
> >> I heard from Mohammad Elzubair that he has seen PuTTY display
> >> correct shapes after the window refreshes itself. I could not see
> >> that behaviour. Can anybody on the list confirm or deny ?
> 
> I think I can explain this phenomenon.
> 
> PuTTY does not know anything about right-to-left scripts. So if you
> send a sequence of characters ABCDE, it places them in its model of
> the terminal screen in the order ABCDE. If you sent them one by one,
> then its display code will be called in five separate calls to
> output first an A, then a B, then a C, a D and an E. So you will see
> the characters in that order on the screen as well.
> 
> However, if you subsequently hide the PuTTY window and then make it
> visible all at once, the display code will be called with the whole
> string in one go - `ABCDE' - and at that point Windows's own Unicode
> handling will notice that those characters are in a right-to-left
> script and it will helpfully reverse them for you - so you will see
> EDCBA at this point. However, if you then move the PuTTY cursor over
> those letters they will be redrawn one by one in the original order
> again, since PuTTY still _thinks_ they're on the screen in the order
> `ABCDE'.
> 
> Clearly PuTTY needs to do something about this. At a minimum it must
> become able to predict what the Windows text display call will do
> with the characters it sends, and hence not have this discrepancy
> between what it thinks it displayed and what it actually displayed.

I don't know much about PuTTY's internals, but I can share what
I've seen a number of other applications do to deal with this
problem, and it's rather straightforward.  Instead of a single
character array per line (the logical view, 'ABCDE' in your
example), keep two arrays: one logical (i.e. as it currently is)
and one visual, derived from the logical one by reordering and
shaping (i.e. 'EDCBA'), with a mapping from one to the other (or
simply capture the output of Windows' Unicode processing).  The
second array is the one that gets displayed.  For further details,
check out fribidi.sf.net (for mapping examples, etc).  With this
method, when the cursor moves over a visual character (part of
the second array), redrawing that cell shows the same character
as before and does not revert it to any prior state.
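
To make the two-array idea concrete, here is a rough sketch in
Python.  It is not PuTTY code and it is not the full Unicode
bidirectional algorithm (fribidi implements that); it assumes a
left-to-right base line and only reverses each unbroken run of
strong right-to-left characters.  The names build_visual and
log2vis are made up for illustration.

```python
import unicodedata

# Strong right-to-left bidi classes: Hebrew-style (R) and Arabic (AL).
RTL_CLASSES = {"R", "AL"}

def is_rtl(ch):
    """True if ch is a strong right-to-left character."""
    return unicodedata.bidirectional(ch) in RTL_CLASSES

def build_visual(logical):
    """Return (visual, log2vis) for one line of text.

    logical  -- list of characters in the order they were received
    visual   -- list of characters in the order they are drawn
    log2vis  -- log2vis[i] is the visual column of logical[i]

    Simplification: each contiguous run of RTL characters is reversed
    in place; spaces, digits and nested levels are not handled.
    """
    n = len(logical)
    order = []  # order[v] = logical index drawn at visual column v
    i = 0
    while i < n:
        if is_rtl(logical[i]):
            j = i
            while j < n and is_rtl(logical[j]):
                j += 1
            # Reverse the run logical[i:j].
            order.extend(range(j - 1, i - 1, -1))
            i = j
        else:
            order.append(i)
            i += 1
    visual = [logical[k] for k in order]
    log2vis = [0] * n
    for v, k in enumerate(order):
        log2vis[k] = v
    return visual, log2vis

if __name__ == "__main__":
    # "ab", then ALEF BEH TEH, then "cd"
    line = list("ab\u0627\u0628\u062acd")
    visual, log2vis = build_visual(line)
    print(log2vis)
```

The display code would always draw from visual, whether it repaints
the whole line or a single cell under the cursor, so both paths
show the same thing.  The mapping is what lets the cursor position
(a logical index) be translated to a screen column.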

> Beyond that, the question of what to do about sending text in the
> order ABCDE and having it entered into PuTTY's terminal handler in
> the order EDCBA is a harder one to solve. There are of course
> published algorithms and implementations of code which will take a
> piece of mixed-language text and arrange it correctly, but I think
> it will be somewhat harder to implement this in a terminal emulator
> than in (say) a word processor, because it must be done _character
> by character_. You don't just have to work out what happens if you
> have English and Arabic text side by side on the same line of text;
> you need to be able to _send_ that text to the terminal, one
> character at a time, and know what happens to the screen as you send
> each character - in a way that's implementable algorithmically and
> simply, with a minimum of stored state.
> 
> I'm unaware of any existing specifications for how this should work.
> As far as I know, the recent UTF-8 modifications to xterm don't even
> attempt to solve this problem, but merely print characters left to
> right in the order they're given - so that it's the job of `cat' or
> an equivalent utility to do the bidi calculations and re-order the
> characters before sending them to the terminal. It isn't clear to me
> whether this is how things need to stay - some input from people who
> actually use these characters would be welcome.

You might want to look at mlterm.sf.net (a terminal emulator with
full Arabic support, including shaping and bidi).

Hope that helps.

 - Nadim

