Re: mined/mlterm display problems
- To: Nadim Shaikli <shaikli at yahoo dot com>
- Subject: Re: mined/mlterm display problems
- From: thomas at towo dot net
- Date: Tue, 04 Feb 2003 01:06:08 +0100
- Cc: developer at arabeyes dot org
- Cc: mined at towo dot net, thomas at towo dot net
Hello Nadim,
> enter U+0645 (MEEM) then U+064E (FATHA) - the cursor now should stand
> right after the U+0645 (the U+064E is a "composing" character and gets
> folded into whatever preceded it), now enter U+0646 (NOON) -- see the
> extra space ?
No, this looks perfect here, no extra space. It's on SuSE Linux 8.1
aمَن
----------------------
The LAM/ALEF problem you pointed out is not related to
combining/combined characters, which mined handles well.
Rather, it is caused by the fact that mlterm applies LIGATURE
substitution.
I'm actually not sure whether a terminal should do that at all,
but maybe it is so essential to Arabic typesetting
that it is useful this way.
> enter U+0644 (LAM) followed by U+0627 (ALEF) followed by U+0631 (REH) -
The display problem already occurs after entering the ALEF;
the display is different after saving and reloading, but it is still wrong
(note that you can move the cursor beyond the end-of-line mark).
Actually, this is a big problem in general, because an application cannot
know whether the terminal has this feature. There will have to be an
additional explicit parameter to tell it whether to assume this behaviour.
> It's the same flavor of problem as was noted. Yes, mlterm does the visual
> combining, what mined needs to do is to track the previous character
> as well as current character and not account for an extra cursor/column
> increment iff certain combinations are present. In pseudo code,
> if ( /* have combined -
>         check various combining flavors,
>           U+0644 + U+0622
>           U+0644 + U+0623
>           U+0644 + U+0625
>           U+0644 + U+0627
>      */
>      ((prev_char == LAM) && (current_char == ALEF)) ||
>      ...
>      /* have composing -
>         check various composing flavors (all tanween),
>           U+064B -> U+0655
>      */
>      (current_char == FATHA) ||
>      ...
>    )
> {
>     /* skip cursor/column increment */
> }
> else
> {
>     /* do normal cursor increment */
> }
I have two remarks on this:
1. Apparently, the first part deals with ligatures. Only 4 actual
combinations are listed, and I could verify with a test file
(generated from Unicode data) that exactly these are displayed
incorrectly with mined on mlterm.
But the Unicode data list many further ARABIC LIGATURE
characters between U+FBEA and U+FDFB; there is just no glyph for them
in the font that my mlterm installation uses. If the font is extended,
would mlterm apply ligature substitution here as well? Or is this
planned for the future? This should be clarified in order to ensure
reliable behaviour.
2. The second part is about combining characters. As noted above,
I see no problem with mined. But there is a bug with mlterm
(on SuSE Linux 8.1) where no combining characters are displayed at
all; only the base characters are shown.
My test file xar contains the characters:
abcييdefييghi
abcي̊ي̀defيۖيۗghi
(abc U+064A U+030A U+064A U+0300 def U+064A U+06D6 U+064A U+06D7 ghi,
the first line without the accents)
Both lines look identical when I type "cat xar" on mlterm.
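For anyone who wants to reproduce this, here is a minimal sketch in C of how a file with the same content as xar could be generated from the code points listed above. It is only an illustration, not necessarily how the original file was produced; the function names utf8_encode, write_line and write_xar are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Encode one Unicode code point as UTF-8.
   Returns the number of bytes written (1..4), or 0 for an invalid value. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Write the code points as one UTF-8 line. Returns 0 on success, -1 on error. */
static int write_line(FILE *f, const uint32_t *cps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char buf[4];
        size_t len = utf8_encode(cps[i], buf);
        if (len == 0 || fwrite(buf, 1, len, f) != len)
            return -1;
    }
    return fputc('\n', f) == EOF ? -1 : 0;
}

/* Write both test lines: first without, then with the combining marks. */
static int write_xar(const char *path)
{
    static const uint32_t plain[] = {
        'a', 'b', 'c', 0x064A, 0x064A, 'd', 'e', 'f', 0x064A, 0x064A,
        'g', 'h', 'i'
    };
    static const uint32_t marked[] = {
        'a', 'b', 'c', 0x064A, 0x030A, 0x064A, 0x0300,
        'd', 'e', 'f', 0x064A, 0x06D6, 0x064A, 0x06D7, 'g', 'h', 'i'
    };
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    int rc = write_line(f, plain, sizeof plain / sizeof plain[0]);
    if (rc == 0)
        rc = write_line(f, marked, sizeof marked / sizeof marked[0]);
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

Calling write_xar("xar") and then "cat xar" should show whether the terminal renders the second line differently from the first.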
I have made a first quick and dirty patch for the LAM/ALEF ligatures.
I've uploaded it to http://towo.net/mined/mined-2000.6.tar.gz in case
you'd like to try it (it's not yet linked from the download page).
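Independent of that patch (this is not its code), the idea under discussion can be sketched in C roughly as follows. It is a sketch under assumptions: the names screen_columns and terminal_joins_lam_alef are hypothetical, the flag stands for the explicit parameter mentioned above, the mark range U+064B..U+0655 is taken from the quoted pseudocode, and the LAM check looks only at the immediately preceding code point, as the pseudocode does. A real implementation would rather consult Unicode data for combining characters, and whether a mark between LAM and ALEF still yields a ligature on mlterm is not settled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define U_LAM 0x0644u

/* The four ALEF forms listed in the quoted pseudocode. */
static bool is_alef_form(uint32_t c)
{
    return c == 0x0622u || c == 0x0623u || c == 0x0625u || c == 0x0627u;
}

/* Arabic marks in the range named in the quoted pseudocode (assumption:
   this range only; other combining characters are not covered). */
static bool is_arabic_mark(uint32_t c)
{
    return c >= 0x064Bu && c <= 0x0655u;
}

/* Count the screen columns an editor should assume for a run of code points.
   terminal_joins_lam_alef tells whether the terminal is assumed to apply
   LAM/ALEF ligature substitution; the application cannot detect this itself. */
static size_t screen_columns(const uint32_t *text, size_t len,
                             bool terminal_joins_lam_alef)
{
    assert(text != NULL || len == 0);
    size_t columns = 0;
    uint32_t prev = 0;                      /* immediately preceding code point */
    for (size_t i = 0; i < len; i++) {
        uint32_t c = text[i];
        bool ligature = terminal_joins_lam_alef
                        && prev == U_LAM && is_alef_form(c);
        /* Marks and the second half of a ligature take no extra column. */
        if (!ligature && !is_arabic_mark(c))
            columns++;
        prev = c;
    }
    return columns;
}
```

With the flag set, LAM ALEF REH counts as two columns; with it cleared, as three. MEEM FATHA NOON counts as two columns either way.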
Kind regards,
Thomas