Re: Correct sorting?
- To: Documentation and Translation <doc at arabeyes dot org>
- Subject: Re: Correct sorting?
- From: Nadim Shaikli <shaikli at yahoo dot com>
- Date: Tue, 30 Nov 2004 20:19:13 -0800 (PST)
--- Abdulaziz Al-Arfaj <aalarfaj at gmail dot com> wrote:
> On Sun, 28 Nov 2004 07:14:56 -0800 (PST), Ossama Khayat <okhayat at yahoo dot com>
> [...]
> > Well, I don't understand that 0x stuff a lot ;-) but can you please
> > edit the corresponding file or point me to it?
>
> Neither do I ;-)
The 0x values are the Unicode code points (in hexadecimal) of the
characters mentioned, given in case some don't know what they look like.
Those code points are listed in the Unicode "code charts" [1] (search for
'Arabic' and 'Arabic Presentation Forms-B').
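For anyone who would rather not dig through the charts, here is a small
sketch that looks the same values up with Python's standard unicodedata
module. The describe() helper is my own illustrative name, not something
from any of our tools, and 0x0651 is the code point of SHADDA.

```python
import unicodedata

def describe(code_point):
    """Return (hex label, Unicode name, general category) for a code point."""
    ch = chr(code_point)
    return (f"U+{code_point:04X}", unicodedata.name(ch), unicodedata.category(ch))

# The three characters discussed in this thread: yeh, shadda, teh-marbuta.
for cp in (0x064A, 0x0651, 0x0629):
    print(*describe(cp))
```

A category of "Mn" (nonspacing mark) is what identifies a zero-width
combining character such as SHADDA, as opposed to "Lo" for the letters.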
> Wait, now I remember. I translated that file, and actually each of
> those words does not end in yeh then teh-marbuta. It's actually yeh
> (0x064A) then SHADDA (a diacritic/combining character of zero width)
> then teh-marbuta (0x0629). I think I am 80% certain that this SHADDA
> sitting in the middle between the two characters is the reason why
> they aren't being joined together.
Certainly sounds like it.
> I believe the file is console-data_debian_po.po (level 2), but I was
> not saying the file needs to be edited. Having a shadda there is
> proper. The shadda is a zero-width character that should not affect
> the shaping/joining of the two characters it sits between, but maybe
> (still just guessing here) that's not what's going on in D-I. Perhaps as
> a test, we can make a copy of the file without the shaddas, and
> Christian could test it out, and see if the problem goes away? I'm
> sorry I cannot do it myself. No Linux box within reach for the next
> week or so :-(
There are two possible solutions here:
1. We strip all the diacritics/harakat (via a script) before we ship these
files over to Debian; this is easily done. I think we should maintain the
"correct" contents of the files in CVS rather than strip 'em and commit
'em.
2. We have slang internally ignore all the diacritics/harakat, a hack that
other languages would also benefit from. In reality though, this needs to
be fixed properly, and I believe Steve (the slang champion) is aware of
this, yet I'm unsure whether he's planning on doing anything on this
front. It might be worthwhile to ping him on this...
So do we have time to resolve this (via point 1 above), or are we too late?
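For point 1, the stripping script could look roughly like the sketch below.
It is only a sketch under a few assumptions: that "harakat" means the range
U+064B (FATHATAN) through U+0652 (SUKUN), which includes SHADDA, that the
.po files are UTF-8, and that the function names are placeholders of mine
rather than an existing script.

```python
# Assumption: the marks to drop are U+064B..U+0652 (fathatan .. sukun,
# including shadda at U+0651). Widen the range if other marks matter.
HARAKAT = {cp: None for cp in range(0x064B, 0x0653)}

def strip_harakat(text):
    """Return text with the Arabic diacritics listed above removed."""
    return text.translate(HARAKAT)

def strip_file(src_path, dst_path, encoding="utf-8"):
    """Write a stripped copy of src_path to dst_path; src is left untouched."""
    with open(src_path, encoding=encoding) as src:
        stripped = strip_harakat(src.read())
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(stripped)
```

Writing to a separate copy keeps the CVS version intact, so only the file
handed to Debian loses its diacritics.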
[1] http://www.unicode.org/charts
Salam.
- Nadim