
Re: Internal number storage



Hi,

Nadim Shaikli wrote:

> --- Mohammed Elzubeir wrote:
> >
> > > For example, in Arabic, some Arab countries use the Hindi
> > > number system, and others use the Arabic numerals (thankfully).
> > > When writing an application, should the numbers be stored in
> > > their ASCII encoding, and then rendered to either Hindi or
> > > Arabic at the option of the user, or should it be stored as
> > > Hindi numbers (internally) and then have the user choose between
> > > the Arabic numerals vs. Hindi numerals?

There is no standard AFAIK, but in practice the numbers are stored in their
ASCII (or other standard) encoding, and the user chooses how they are displayed.
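To make that practice concrete, here is a minimal sketch in Python. The helper
name `render_digits` and the style names are hypothetical, chosen only for
illustration. It assumes the "Hindi" digits discussed in this thread are the
ones Unicode calls Arabic-Indic digits (U+0660 to U+0669). The stored text keeps
ASCII digits, and the digit shape is picked only at display time.

```python
# Illustrative sketch only: names and style labels are assumptions,
# not part of any standard or existing application.
ASCII_DIGITS = "0123456789"
# Unicode Arabic-Indic digits U+0660..U+0669 (called "Hindi" in this thread).
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Translation table from ASCII digits to Arabic-Indic digits.
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)


def render_digits(stored: str, style: str = "ascii") -> str:
    """Return the display form of text whose digits are stored as ASCII."""
    if style == "ascii":
        return stored
    if style == "arabic-indic":
        return stored.translate(TO_ARABIC_INDIC)
    raise ValueError(f"unknown digit style: {style!r}")


stored = "Total: 2024"
print(render_digits(stored, "ascii"))
print(render_digits(stored, "arabic-indic"))
```

Because the stored form never changes, ordinary numeric code keeps working on
it and only the final rendering step depends on the user's preference.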

> (snip)
>
> Is this a stated "opinion" or fact ?  In other words, what does the
> world in large out there expect from an Arabic encoded document ?

What was described above is the way that has been used in practice so far.

> I can see two issue here,
>
>  1. Person write a document in which both "hindi" and "non-hindi"
>     numbers are used, how is the distinction made (if all numbers are
>     stored as ASCII) ?  if its based on context is this procedure
>     formalized somehow (to get consistency among the various
>     applications) ?

I am not aware of any standard (though I believe the easiest person to ask in
this case is Ahmed Abdel Hamid, Acon's author, who did some work on
context-dependent Hindi numbering). The practice I am aware of, though, is that
if we want a context-dependent number display, the digits are treated the same
way "neutral bidi" characters are treated. Note that here it's an
application-level job as well.
I am very much in favour of this practice of storing numbers and letting the
user decide, since storing numbers just as numbers, and not as how they should
be displayed, saves the developer much coding headache (such as rewriting
numeric data handling functions) and gives the user a lot more flexibility: the
user can choose to force them all into the style he wants, simply change them
through a trivial font change, or choose a context-dependent mode that is
computed in a bidi-like approach.
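As a rough illustration of such a context-dependent mode, the Python sketch
below lets each run of ASCII digits take its shape from the nearest preceding
strongly directional character, loosely in the spirit of how the bidi algorithm
resolves neutrals. This is an assumed rule for illustration only, not Acon's
method nor any formalized procedure. The function name `render_contextual` is
hypothetical, and so is the choice that text with no preceding strong character
keeps ASCII digits.

```python
import unicodedata

# Map ASCII digit code points to Arabic-Indic digits (U+0660..U+0669).
TO_ARABIC_INDIC = {ord("0") + i: 0x0660 + i for i in range(10)}


def render_contextual(stored: str) -> str:
    """Display ASCII digits as Arabic-Indic when they follow Arabic text."""
    out = []
    arabic_context = False  # assumption: start in a non-Arabic context
    for ch in stored:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "AL":          # strong Arabic letter
            arabic_context = True
        elif bidi_class in ("L", "R"):  # other strong characters
            arabic_context = False
        # Digits, spaces and punctuation leave the context unchanged.
        if arabic_context and "0" <= ch <= "9":
            out.append(ch.translate(TO_ARABIC_INDIC))
        else:
            out.append(ch)
    return "".join(out)


# Arabic word, a number, a Latin word, another number.
print(render_contextual("\u0639\u062f\u062f 12 abc 34"))
```

The stored string is identical in every mode; only this display pass differs,
which is why the choice can be left to the application or the user.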

>  2. I'm very inclined to the think the answer to #1 above is that
>     its up-in-the-air.  Reason being, ISO-8859-6 includes the glyphs
>     to those "hindi" numbers and considers them proper "Arabic" numbers
>     (they are NOT in form-B for so-called "shaping").  Which would lead
>     one to believe those encodings ought be used instead of ASCII
>     (excuse the conjecture on my part :-)

This is a Bad Thing. Use them only if you want to force the user to see Hindi
numbers and if you don't intend to use any of the zillions of existing numeric
functions that rely on string<->number conversion.

> What the user sees is NOT a problem (except in the case where both
> number systems are intermixed in a document), so let's no get into
> preferences and what people expect to see.  We are strictly talking
> about what ought to be stored on disk.

That is exactly the point. The app, or more simply the font used for display,
can decide how the digits appear as long as the storage is uniform. This has
been the practice so far, and the practice so far was Good.

(snip) (snip) the rest is already answered,

Salaam,
Chahine