Re: mined/mlterm ligature joining
- To: Thomas Wolff <mined at towo dot net>
- Subject: Re: mined/mlterm ligature joining
- From: Nadim Shaikli <shaikli at yahoo dot com>
- Date: Mon, 10 Feb 2003 16:05:58 -0800 (PST)
- Cc: developer at arabeyes dot org
--- Thomas Wolff <mined towo net> wrote:
> Hello Nadim,
>
> I am planning to release the next version of mined soon and I'd
> like to assure that the ligature problems will be solved then.
Sorry for the delay - been tangled up in things.
> I'd appreciate your comments on these two issues we had. The
> second is a general question about terminal handling of Arabic
> characters, not specific to mined.
>
>
> 1.
> You wrote:
> > enter U+0645 (MEEM) then U+064E (FATHA) - the cursor now should stand
> > right after the U+0645 (the U+064E is a "composing" character and gets
> > folded into whatever preceded it), now enter U+0646 (NOON) -- see the
> > extra space ?
> I wrote:
> > No, this looks perfect here, no extra space. It's on SuSE Linux 8.1
>
> Can you confirm that this text is handled well now?
I still see it on Solaris. Could someone else try this and report
back with findings? Download mined-2000.6 from
http://towo.net/mined/mined-2000.6.tar.gz
then compile it and start it with ``mined -UpoX arabic_utf8_file''.
You should now see Arabic without a problem. Next, switch the keyboard
mapping (Alt-K - note the capital K) and select "Arabic" with the
arrow keys. Now enter Arabic text and try out combining characters -
is there an extra space?
FWIW: mined-2000.6 behaves better. The extra space (denoted by
a paragraph symbol) is still there behind (to the left of)
joining/combining characters (LAM+ALEF), but it's gone once one
presses the RETURN key. If you simply move the cursor (up/down,
left/right) the space stays - I wish I had the time to dive
into the code, but I'm swamped. It looks like you recalculate the
line upon a carriage return instead of after the character that
follows a combining character. Thomas, I can send you a screenshot
if you like. NOTE again: this only happens when you enter characters;
if you open a stored file, everything is displayed properly.
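To make the expected behaviour concrete, here is a minimal sketch of
the column arithmetic for the MEEM, FATHA, NOON test above. It is not
mined's code; the helper names are made up for illustration, and it
assumes every non-combining character takes exactly one column (it
ignores wide characters and LAM+ALEF joining).

```python
import unicodedata


def display_columns(text):
    """Count terminal columns, treating combining marks as zero-width.

    Assumption: every non-combining character occupies one column.
    """
    return sum(0 if unicodedata.combining(ch) else 1 for ch in text)


def columns_after_each_key(text):
    """Return the occupied column count after each typed character."""
    return [display_columns(text[:i + 1]) for i in range(len(text))]


# MEEM (U+0645), FATHA (U+064E, a combining mark), NOON (U+0646)
typed = "\u0645\u064E\u0646"
widths = columns_after_each_key(typed)
```

If the column count grows when the FATHA is typed, the editor has
reserved a cell for the combining mark, which would show up as the
extra space described above.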
> 2.
> mlterm applies automatic ligature joining to the LAM/ALEF combinations
> you mentioned. These are only 4 actual letter pairs, in isolated and
> final form each, so resulting 8 character combinations that need
> special handling.
> I could confirm with a test file that I generated from Unicode data
> that exactly these 8 are joined into the ligature form by mlterm.
Yeah. BTW: this combining is so essential to Arabic that a terminal
emulator (like mlterm) must do it - I think you were wondering
about that previously.
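For reference, the 8 LAM/ALEF combinations can be listed straight from
the Unicode character database. The sketch below is not how mlterm
does it; it assumes the ligatures are the code points U+FEF5 through
U+FEFC (a range taken from the Unicode charts, not from this thread)
and the function name is invented for illustration.

```python
import unicodedata


def lam_alef_ligatures():
    """Map (base pair, form) to the matching LAM+ALEF ligature glyph.

    Assumption: the ligatures occupy U+FEF5..U+FEFC.
    """
    table = {}
    for cp in range(0xFEF5, 0xFEFD):
        # Decomposition looks like "<isolated> 0644 0622".
        tag, *parts = unicodedata.decomposition(chr(cp)).split()
        pair = "".join(chr(int(p, 16)) for p in parts)
        table[(pair, tag.strip("<>"))] = chr(cp)
    return table


ligatures = lam_alef_ligatures()
pairs = {pair for pair, _form in ligatures}
```

The table has one entry per (pair, form), so 4 letter pairs in
isolated and final form give the 8 combinations mentioned above.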
> There are, however, a number of other Unicode characters called
> ARABIC LIGATURE, listed with according base character pairs (or more
> than 2 base characters in some cases). The ligature glyphs are not
> contained in the font I have installed, so I'm not sure if mlterm
> would join them if the glyph was present.
> Please tell me about the supposed behaviour. Are all these to be
> automatically joined if a terminal supports ligature joining?
> I append an excerpt from Unicode data about the characters in
> question just to make sure it is clear what I am speaking about.
I think you're looking at Unicode's Arabic Presentation Forms-A, and
you shouldn't (www.unicode.org/charts). Although it's named
"Arabic Presentation Forms-A", that code chart is NOT needed.
The only code charts of concern with regard to Arabic are:
1. Arabic -- U+0600 - U+06FF
2. Arabic Presentation Forms-B -- U+FE70 - U+FEFF
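A quick way to check which chart a character falls in is sketched
below. The two ranges of concern are the ones listed above; the
Forms-A range (U+FB50 - U+FDFF) is added from the Unicode charts so
that such characters can be recognised, and the function name is
invented for illustration.

```python
# Inclusive block ranges as (first, last, name).
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),  # not in the list above
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
]


def arabic_block(ch):
    """Return the name of the Arabic chart containing ch, or None."""
    cp = ord(ch)
    for first, last, name in ARABIC_BLOCKS:
        if first <= cp <= last:
            return name
    return None


meem_block = arabic_block("\u0645")      # MEEM
ligature_block = arabic_block("\uFEFB")  # LAM WITH ALEF, isolated form
```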
Hope that helps.
- Nadim