Re: [Fwd: Arabic patch: Todo list?]



Nadim Shaikli <shaikli at yahoo dot com> wrote:
> --- Steve Hall <digitect mindspring com> wrote:
[...]
> > -------- Original Message --------
> > Subject: Arabic patch: Todo list?
> > Date: Mon, 27 Jan 2003 02:57:10 +0100
> > From: Antoine J. Mechelynck <antoine mechelynck belgacom net>
[...]
> > Just a few remarks:
> >      - The first day I had several crashes (protection violations, I
> > think; CrashGuard intercepted them). Maybe I had been trying to use too
> > big a font. Now I set the 'gfn' height to "only" 20, I save my work
> > frequently, and the problem seems to have disappeared.
>
> Don't have any idea on what's happening here - could he narrow it down to
> the failure and note the exact steps (I can't reproduce this).

I'll try -- at some future time. It seems to happen when doing :set
rightleft!
>
> >      - In vocalised text, some combining characters (e.g. sukuun IIRC)
> > cause a break in the script (the consonant at that point assumes its
> > "ending" presentation shape and the next one [if any] its "beginning"
> > one). (This is a minor detail: the text is correctly generated in
> > memory and on the disk, and it is still better than what NS7 does: last
> > I tried, NS7 placed the diacritics between consonants instead of over
> > them, producing a very ugly "chopped" script.)

This seems to be erratic -- I'm not sure what makes the script flow break or
not break.

> > Also (and this is common
> > with NS7, IE6 and WordPad) there must be no combining characters on the
> > laam of a laam-alif ligature, or it won't be displayed right.
> > (Temporary solution: omit them; or maybe in the case of long "a" place
> > the fat'ha on the alif of prolongation instead of on the laam. The
> > result displays more or less acceptably but is not the "received" way
> > to do it AFAIK.)
>
> I'm not sure I get the point - but if I read it correctly the user is
> noting that using harakat (sukuun, fatha, etc) on combined characters
> doesn't show properly ?  If so, then that's ok since the combined
> characters (Lam+Alef) usually aren't supposed to be followed by harakat
> characters.

The point is: in fully vocalised text (be it the Qur'an or a beginner's
reader), laa ("not") should have fat'ha on the laam, al-ard ("the earth")
should have sukuun on the laam, etc. Putting them there prevents the alif
from combining. Temporary solution: omit sukuun on laam and use
alif-hamza-above instead of alif in al-ard; omit fat'ha or place it on the
alif in laa. The latter will make the fat'ha show above the laam-alif
ligature, only very slightly to the left of where it ought to be; but I
still believe that the "received" UTF-8 way of doing it would be to place
them on the laam, i.e. not = laam, fatha, alif; al-ard = alif (or
alif-wasla), laam, sukuun, alif-hamza-above, fatha, reh, sukuun, daad, any
vowel (case-ending). But doing it that way prevents, for the time being, alif
from combining with laam, not only in vim but also in IE, WordPad, etc. NS7
is even worse since it simply doesn't know about the "combining" property of
the harakat (i.e. short vowels, sukuun, shadda).
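
For illustration only, here is a small Python sketch of the two sequences
described above, in logical (typing) order. The names LAA, AL_ARD, describe
and omit_marks_on_laam are made up for this example and come from no tool
mentioned in this thread; the helper mimics the "omit them" workaround by
dropping any harakat that sit on a laam immediately before an alif, and it
assumes the harakat fall in the range U+064B to U+0652.

```python
import re
import unicodedata

# Sequences as described in the message, in logical order.
LAA = "\u0644\u064E\u0627"  # laam, fatha, alif
# alif, laam, sukuun, alif-hamza-above, fatha, reh, sukuun, daad
AL_ARD = "\u0627\u0644\u0652\u0623\u064E\u0631\u0652\u0636"


def describe(text):
    """Return (code point, Unicode name, is_combining) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch) != 0)
        for ch in text
    ]


def omit_marks_on_laam(text):
    """Hypothetical workaround: drop harakat on a laam directly before an alif.

    Assumes the harakat are in U+064B..U+0652 and that "alif" means one of
    U+0622, U+0623, U+0625 or U+0627.
    """
    return re.sub("\u0644[\u064B-\u0652]+(?=[\u0622\u0623\u0625\u0627])", "\u0644", text)


if __name__ == "__main__":
    for row in describe(LAA):
        print(row)
    print(describe(omit_marks_on_laam(AL_ARD)))
```

With the marks removed, laam and alif are adjacent again, which is the
condition under which the renderers discussed here form the ligature. The
cost is that the stored text is no longer fully vocalised.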

[...]
Yeah, I'm planning on adding the necessary docs and helpfile.

One thing maybe more urgently needed than the rest: what does that
newfangled 'arabic' option do? Steve noted that it seems to affect
reformatting, I noted that it seems to automagically set 'rightleft' -- what
else?

> >          * Mixing LTR (Latin text, Arabic numbers, etc.) and RTL (Arabic
> > words and sentences) in a single file (be it for text and digits, for
> > bilingual text, for RTL Arabic text embedded in LTR HTML tags, etc.)
>
> You really need Bidi support within Vim for this to work properly.  Last
> I talked to Bram about this, he had no intentions of even considering the
> thought since it would mean major upheaval to how things are handled.  I
> would, on the other hand suggest the following - start vim within a Bidi
> capable terminal emulator (like mlterm).

Once the crashing bug associated with :set rightleft! is cured, LTR editing
(e.g., HTML tags) can be done in 'norightleft' mode, RTL (e.g. Arabic text)
in 'rightleft' mode. I don't see this lack of bidi-text-support in vim as a
big deal -- once it won't crash when toggling the 'rightleft' option.
>
> > Note: I haven't yet tried to make Vim understand M$'s "Arabic" option
> > of the International keyboard (my usual keyboard is fr_BE).
>
> We're only opting for UTF-8 Arabic support and are advocating
> nothing else (esp. not a proprietary encoding).
>
> Regards,
>
>  - Nadim

As I wrote in a followup to Steve, I found the UTF-8 digraphs quite adequate
for Arabic, even though some are a little weird (e.g. sn for sheh, tj for
[emphatic] taa, i+ for ghayn, etc.) The lack of one for alif-wasla is not a
big deal either, since I can either use Ctrl-V u 0671 or make my own
digraph, maybe aw or aW -- also the wasla is usually not written except in
fully vocalised text. (For instance, my Qur'an has all vowels, even sukuun
on letters of prolongation, but not a single wasla.) The lack of digraphs
for Arabic-Indic digits is a little more annoying, but not hugely so, since
their code points are easy to remember in hex: U+0660 to U+0669.
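
As a rough illustration of how regular that range is, here is a short Python
sketch that maps ASCII digits onto U+0660 to U+0669. The names
ARABIC_INDIC_ZERO, TO_ARABIC_INDIC and to_arabic_indic are invented for this
example and are not part of Vim or of the Arabic patch.

```python
# Arabic-Indic digits occupy U+0660..U+0669, in the same order as 0..9.
ARABIC_INDIC_ZERO = 0x0660

# Translation table: ordinal of each ASCII digit -> Arabic-Indic digit.
TO_ARABIC_INDIC = {ord("0") + i: chr(ARABIC_INDIC_ZERO + i) for i in range(10)}

# Alif-wasla, the code point entered with Ctrl-V u 0671.
ALEF_WASLA = "\u0671"


def to_arabic_indic(text):
    """Replace ASCII digits with Arabic-Indic digits; leave other characters alone."""
    return text.translate(TO_ARABIC_INDIC)


if __name__ == "__main__":
    print(to_arabic_indic("2003"))
```

The digit n is simply U+0660 + n, so a lookup table is not strictly needed;
it is used here only because str.translate makes the mapping explicit.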

Regards,
Tony.