Re: A bug in cut?
- To: Development Discussions <developer at arabeyes dot org>
- Subject: Re: A bug in cut?
- From: Behdad Esfahbod <behdad at cs dot toronto dot edu>
- Date: Sun, 23 May 2004 00:31:30 -0400
On Sun, 23 May 2004, Munzir Taha wrote:
> On Saturday 22 May 2004 09:46 am, Behdad Esfahbod wrote:
> > On Sat, 22 May 2004, Munzir Taha wrote:
> > > hexdump gives:
> > > 0000000 a0d9 d90a 0aa1 a2d9 d90a 0aa3 a4d9 000a
> > > 000000f
> > >
> > > Can you explain to me how to relate these numbers to
> > > 0660 ARABIC-INDIC DIGIT ZERO
> > > 0661 ARABIC-INDIC DIGIT ONE
> > > 0662 ARABIC-INDIC DIGIT TWO
> > > 0663 ARABIC-INDIC DIGIT THREE
> > > 0664 ARABIC-INDIC DIGIT FOUR
> >
> > Have a look at /usr/share/i18n/charmaps/UTF-8.gz
> > It's really simple. For example, \xd9\xa0 is 0660 and so on.
> > You better read the UTF-8 RFC once, it's lots of fun.
>
> Thanks for the info. Now, I can understand how 660=d9a0 but hexdump gives it
> as a0d9. Has this anything to do with big endianness?
It's little-endianness, in fact: on a little-endian machine the
two bytes \xd9\xa0 are read as the 16-bit integer \xa0d9, and
hexdump prints 16-bit words by default. The -b parameter to
hexdump will show one-byte octal instead.
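
To make the byte order concrete, here is a small Python sketch. It is only an illustration, not how hexdump is implemented: the helper names (utf8_two_byte, words_le, bytes_octal) are made up for this example, and the input is assumed to be the five digits U+0660..U+0664, each followed by a newline, which matches the 15 bytes in the dump above.

```python
import struct

def utf8_two_byte(cp):
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes (110xxxxx 10xxxxxx)."""
    assert 0x80 <= cp <= 0x7FF
    return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])

def words_le(data):
    """Group bytes into little-endian 16-bit words, like hexdump's default view."""
    if len(data) % 2:
        data += b"\x00"  # pad the odd trailing byte with zero
    count = len(data) // 2
    return ["%04x" % w for w in struct.unpack("<%dH" % count, data)]

def bytes_octal(data):
    """Show each byte as three octal digits, like the one-byte octal view."""
    return ["%03o" % b for b in data]

# Assumed input: ARABIC-INDIC DIGIT ZERO..FOUR, one per line.
data = "".join(chr(0x0660 + i) + "\n" for i in range(5)).encode("utf-8")

print(utf8_two_byte(0x0660).hex())   # d9a0
print(" ".join(words_le(data)))      # a0d9 d90a 0aa1 a2d9 d90a 0aa3 a4d9 000a
print(" ".join(bytes_octal(data[:3])))  # 331 240 012
```

The bytes in the file are still d9 a0 in that order; only the 16-bit grouping swaps each pair when it is printed.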
--behdad
behdad.org