
Re: Arabic plural forms issues





On 7/26/06, Youssef Chahibi <chahibi at gmail dot com> wrote:
السلام عليكم (Peace be upon you)
   The current gettext plural form equation used in GNOME, KDE and probably in
other projects is wrong [5]. I know that QAC investigated this issue but
didn't decide on a definitive solution, so it is clear that the current
plural form expression is a temporary hack and of course confirmed as wrong.
In http://wiki.arabeyes.org/QacDecisions:

>  This GNU Plural Header will be used when we can find a way to script its
> functionality: nplurals = 7;
>  plurals = n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 :
> n%100==1 ? 5 : n%100==2 ? 6 : 4;


This is true. Although we could not all agree 100% on a new, elegant alternative, we all agreed that the one currently in use is incorrect. The biggest problems were the n%100==1 and n%100==2 cases: numbers such as 101, 102, 201 and 202 are particularly tricky in Arabic.
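
To make the quoted expression easier to check against test numbers, here is a minimal sketch of it as a Python function. The function name is made up for illustration, and it assumes n is a non-negative integer. It only transcribes the expression quoted above and is not an agreed rule.

```python
def quoted_plural_index(n: int) -> int:
    """Return the plural form index (0 to 6) for a non-negative count n."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    last_two = n % 100
    if 3 <= last_two <= 10:
        return 3
    if last_two == 1:   # 101, 201, ...
        return 5
    if last_two == 2:   # 102, 202, ...
        return 6
    return 4            # everything else: 11 to 99, 100, 111, ...
```

With this transcription, 101 and 201 map to form 5 and 102 and 202 map to form 6, which are the cases named above as the most problematic.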
 

   Arabic follows sophisticated rules to decide on the form of the "counted"
items. Moreover, there are two distinct rules, one if numbers are read from
right to left and another if they are read from left to
right, as the form of the counted items follows the last number read.
   Both rules are correct; studies confirm that in the past both rules were
allowed. However, reading from right to left - following the order of
the letters in Arabic - is the more respected rule, while nowadays the media
read from left to right.


Can you please cite some of these studies for reference?

[...]


   Plural forms 0, 1 and 2 don't require a variable, and here comes another
issue. If two variables are included in the string, say %s and %d, and
%d is omitted, the application crashes (segmentation fault)
[3]. A solution exists: variable shuffling, which displays a
correct result and doesn't crash [4]. This needs to be documented and tested for
implementations other than C. (Thank you Djihed for the idea)


If I understand correctly, we are asking developers to use variable shuffling to avoid the segmentation fault. But is Arabic the only language with plural forms that don't require a variable? If it is not, shouldn't variable shuffling already be in wide use among applications that support plural forms?
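
As a rough illustration of the idea outside C (my own sketch in Python, not the test case from [4]): with indexed placeholders, each form of the translated string names the argument it wants, so a form that spells the number out in words can simply leave the count unused. The function name and the English templates below are hypothetical stand-ins for translated strings.

```python
def render(template: str, count: int, user: str) -> str:
    """Fill a template whose fields are addressed by position.

    {0} is the count and {1} is the user name. str.format ignores
    positional arguments that the template never references, so a
    template without {0} is still safe to render.
    """
    return template.format(count, user)


# Hypothetical templates: the first form needs no number at all.
ONE_MESSAGE = "{1} has one new message"
MANY_MESSAGES = "{1} has {0} new messages"
```

In C, I assume the shuffling refers to the POSIX numbered conversion syntax (for example %2$s), which would need to be checked separately on each platform and language binding.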
 

   One other issue is what form to use for non-integer numbers.

   We may sometimes need to translate applications that, for the sake of
simplicity, don't support plural forms and use for example "user(s)" in English. I
suggest contacting the developer and asking him/her to support plural forms. If
for some reason this can't be done, we can use "من" ("of") as in "وصل ثلاثة من
الرجال" ("three of the men arrived"). (Thank you Munzir for the idea).


Clever. I remember that is exactly what we used to do to handle cases such as "user(s)".
 

   What can we do?
- Comment on this
- Mark all strings that contain plural forms as fuzzy, and replace the formula
in all files with the help of a script. Correct these fuzzy strings to add
missing cases.


Youssef, I am _100% on board_ with you on this. But first of all we need to agree once and for all on a definitive plural form. I am all for your right-to-left suggestion and, if no one has any objection, I suggest we make it official without delay. But we definitely need to hear a few other voices here.

After that we can go ahead and write a script and do whatever is needed, but first we need a decision, and then we need to document that decision in all relevant places (Translation Guide, QAC's Wiki pages, etc.).
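
Such a script could look roughly like the Python sketch below. It is only an outline under several assumptions: the Plural-Forms header sits on a single line of the PO file, entries are separated by blank lines, and the new header line is passed in by the caller once a rule has been agreed. The function names are invented for illustration.

```python
import re

# Matches a one-line header such as: "Plural-Forms: nplurals=...; plural=...;\n"
HEADER_RE = re.compile(r'^"Plural-Forms:.*\\n"$', re.MULTILINE)


def mark_fuzzy(entry: str) -> str:
    """Add the fuzzy flag to one PO entry, keeping any existing flags."""
    lines = entry.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("#,"):
            flags = [f.strip() for f in line[2:].split(",")]
            if "fuzzy" not in flags:
                lines[i] = "#, " + ", ".join(["fuzzy"] + flags)
            return "\n".join(lines)
    # No flag line yet: put one just before msgctxt/msgid.
    for i, line in enumerate(lines):
        if line.startswith(("msgctxt ", "msgid ")):
            lines.insert(i, "#, fuzzy")
            break
    return "\n".join(lines)


def update_po_text(po_text: str, new_header_line: str) -> str:
    """Swap the Plural-Forms header and flag every plural entry as fuzzy."""
    # A function replacement keeps the backslash in the header literal.
    po_text = HEADER_RE.sub(lambda _m: new_header_line, po_text, count=1)
    entries = po_text.split("\n\n")
    return "\n\n".join(
        mark_fuzzy(e) if "msgid_plural" in e else e for e in entries
    )
```

The fuzzy entries would then still have to be corrected by hand to add the missing forms.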

And once we have a strong basis to work on, something we are sure is linguistically correct and documented as such, we can go ahead and contact application developers with our findings, but only after we've done our homework over here :-)

Regards,

Abdulaziz,