Arabic issues (was Re: [Munzir Taha] [Cooker] [Bug] LANG=xx whatever doesn't work)
- To: Pablo Saratxaga <pablo at mandrakesoft dot com>
- Subject: Arabic issues (was Re: [Munzir Taha] [Cooker] [Bug] LANG=xx whatever doesn't work)
- From: Munzir Taha <munzirtaha at newhorizons dot com dot sa>
- Date: Fri, 19 Dec 2003 07:17:33 +0300
- Cc: cooker at linux-mandrake dot com, Thierry Vignaud <tvignaud at mandrakesoft dot com>, General Arabization Discussion <general at arabeyes dot org>
- Organization: New Horizons CLC
- User-agent: KMail/1.5.3
On Monday 15 December 2003 17:50, Pablo Saratxaga wrote:
> Kaixo!
Hi (though I don't know what Kaixo! means ;)
> Arabic is supported in UTF-8 only;
Generally speaking, it's also supported in ISO-8859-6, right?
> so by choosing Arabic you also choose
> UTF-8 encoding.
I now have some evidence that the ar_SA encoding may be ISO-8859-6, not
UTF-8:
1. kedit saves any Arabic document as ISO-8859-6, and I can't even choose
another encoding, as if it defaults to the system locale.
2. I can't see Arabic file/folder names from any GTK+ app. Gedit, for example,
can't see Arabic folder names unless UTF-8 encoding is chosen explicitly
during installation. It gives this error when launched from konsole:
Gtk-Message: [...] "\345\315\344\317" [...] UTF-8 ([...] G_BROKEN_FILENAMES):
Invalid byte sequence in conversion input
Gtk-Message: [...] "\347\317\352\311.doc" [...] UTF-8 ([...]
G_BROKEN_FILENAMES): Invalid byte sequence in conversion input
Gtk-Message: [...] "\322\310\317 \307\344\343\344\345\307\312!.doc" [...]
UTF-8 ([...] G_BROKEN_FILENAMES): Invalid byte sequence in conversion input
Gtk-Message: [...] "\331\321\310\352~" [...] UTF-8 ([...]
G_BROKEN_FILENAMES): Invalid byte sequence in conversion input
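The octal escapes in these messages can be checked by hand to see whether the names are really ISO-8859-6 rather than UTF-8. Here is a minimal Python sketch of that check; the helper name is_valid_utf8 is mine, and only the first filename from the output is used:

```python
# Bytes of the first filename reported by GTK+, copied from its octal escapes.
raw = b"\345\315\344\317"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same bytes interpreted as ISO-8859-6 give four Arabic letters.
as_iso = raw.decode("iso8859_6")

print("valid UTF-8:", is_valid_utf8(raw))
print("ISO-8859-6 code points:", [hex(ord(ch)) for ch in as_iso])
```

If the names on disk were UTF-8, the first check would succeed. Here it fails, which is consistent with the "Invalid byte sequence" error, while the ISO-8859-6 decoding yields only Arabic letters.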
3. Sending Arabic text via KMail with the encoding set to Auto-detect sends it
in the Arabic ISO-8859-6 encoding.
4. Displaying a UTF-8 Arabic file from a shell shows garbage, whereas
displaying an ISO-8859-6 file shows it correctly.
5. Other observations regarding urpmi output in the console.
All of this and more has nearly convinced me that ar_SA != ar_SA.UTF-8.
BTW: For languages that have both a UTF-8 and another encoding, how can one
revert to the non-UTF-8 encoding after UTF-8 was enabled during installation?
> 3. if Arabic (or any other language btw) is installed after the install
> of the system; you would also want to edit:
> /etc/rpm/macros to add it ("ar") to the %_install_langs macro; otherwise
> the translation files coming in the rpm packages won't get installed when
> you install/update an rpm package.
Mandrake is a distro that has always kept the newbie/desktop user in mind; has
this changed?
> 4.1.1 you can in fact either use KDE keyboard switcher; or instead use
> the X11 keyboard (configured through keyboarddrake).
> it's a matter of taste.
>
> 5.2 I haven't looked at akka again yet.
>
> 7. comma: that is a translation issue. tell me which packages are concerned
> and I'll try to correct them.
You can find this in many packages, such as locales-ar. It also appears when
launching rpmdrake, in the message "Please wait, finding available
packages..." (the comma after "wait"), and in the option "All packages,
alphabetical".
> keyboard: you either use KDE or keyboarddrake (X11 in fact) to manage
> the keyboard switching; you cannot use both.
>
> shortcuts and KDE: that issue should be common to all non-latin languages;
> maybe it has been discussed on KDE mailing lists; if not, it should.
> On Gtk the keyboard sends Ctrl(or Alt)-(arabic keysym); then, it determines
> from the keyboard map which latin letter corresponds to the arabic keysym,
> and converts the result as Ctrl(or Alt)-(that latin letter).
http://bugs.kde.org/show_bug.cgi?id=69458
> eg, arabic layout has:
>
> key <AD01> { [ Arabic_dad, Arabic_fatha ] };
>
> so if you press Ctrl-Arabic_dad, and if you have a keyboard
> configuration of "fr,ar", it will look at the value of <AD01> key for
> the latin layout ("fr" in this case):
>
> key <AD01> { [ a, A, ae, AE ] };
>
> So, pressing Ctrl-Arabic_dad with a kbd layout of "fr,ar" will produce
> a Ctrl-a.
> However, if you had "us,ar", it will be:
>
> key <AD01> { [ q, Q ] };
>
> so, Ctrl-Arabic_dad would be Ctrl-q.
>
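To make sure I follow the lookup described above, here is a small Python sketch of it. This is only my reading of the explanation, not GTK's actual code; the LAYOUTS table is a hypothetical, heavily trimmed stand-in for the real keyboard maps, and latin_shortcut is a name I made up:

```python
# Hypothetical, trimmed layout tables: key name -> keysyms per level.
LAYOUTS = {
    "fr": {"AD01": ["a", "A", "ae", "AE"]},
    "us": {"AD01": ["q", "Q"]},
    "ar": {"AD01": ["Arabic_dad", "Arabic_fatha"]},
}


def latin_shortcut(keysym, groups):
    """Find the physical key producing keysym in any configured group and
    return the first-level keysym of that key in the first (Latin) group."""
    latin = LAYOUTS[groups[0]]
    for name in groups:
        for key, syms in LAYOUTS[name].items():
            if keysym in syms:
                return latin[key][0]
    return None  # keysym not found in any configured group


print(latin_shortcut("Arabic_dad", ["fr", "ar"]))
print(latin_shortcut("Arabic_dad", ["us", "ar"]))
```

With "fr,ar" the sketch maps Ctrl-Arabic_dad to Ctrl-a, and with "us,ar" to Ctrl-q, matching the two cases in the quoted explanation.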
$ xev
KeyPress event, serial 27, synthetic NO, window 0x2000001,
root 0x48, subw 0x0, time 12484896, (335,219), root:(340,245),
state 0x2010, keycode 24 (keysym 0x5d6, Arabic_dad), same_screen YES,
XLookupString gives 2 bytes: "ض"
$ xmodmap -pke |grep dad
keycode 24 = q Q Arabic_dad Arabic_fatha
whereas running showkey from a VC gives
keycode 16 press
Why are they different?
> The yes/no locale problem should be fixed, isn't it?
Not yet:
$ locale -c yesexpr noexpr
LC_MESSAGES
^(ن|نyYعم)
LC_MESSAGES
^(ل|لnNا)
Maybe there is a problem in the syntax, and it should use square brackets
instead of parentheses? I expected a regex like this instead:
LC_MESSAGES
^[yYن].*
LC_MESSAGES
^[nNل].*
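The difference between the two forms can be tried out quickly. The sketch below uses Python's re module rather than the POSIX regex functions the C library uses, on the assumption that patterns this simple behave the same in both; the accepts helper and the list of sample answers are mine:

```python
import re

# Patterns copied from the locale output and from what I expected.
current_yes = "^(ن|نyYعم)"
expected_yes = "^[yYن].*"


def accepts(pattern, answer):
    """Return True if the answer matches the pattern from its start."""
    return re.match(pattern, answer) is not None


for answer in ["y", "Y", "yes", "ن", "نعم"]:
    print(repr(answer), accepts(current_yes, answer), accepts(expected_yes, answer))
```

With the parenthesized form, only answers starting with the Arabic letter match, so a plain "y" or "Y" is rejected. The bracketed form accepts the Latin and the Arabic answers alike.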
Regarding bug:
http://qa.mandrakesoft.com/show_bug.cgi?id=5181
you said before:
"There are severe size limitations, and the problem is that the size of the
Arabic fonts were a bit too big. To fix it the installation stage (not DrakX,
but the way the running Linux is put on memory) has to be completely
modified. It's way too late for this version. For version after 9.2 the
install method will be rethought to overcome that problem (that also affect
several other languages)."
Now, I can see the problem is solved for the Hebrew language
http://qa.mandrakesoft.com/show_bug.cgi?id=4659
What's the status of Arabic in cooker?
Please, Mr. Pablo: I know you are busy, but we need to help each other to
improve Mandrake's localization status.
--
Munzir Taha
PGP Key available:
gpg --recv-keys --keyserver www.mandrakesecure.net F0671821
Telecommunications and Electronics Engineer
Linux Registered User #279362 at http://counter.li.org
Mandrake Club member
Maintainer of Mandrake Arabization Project Status (MAPS)
http://www.arabeyes.org/download/documents/distro/mdkarabicsupport.html
CIW Designer, ICDL, MOUS
New Horizons CLC
Riyadh, SA