Re: A bug in cut?



On Sun, 23 May 2004, Munzir Taha wrote:

> On Saturday 22 May 2004 09:46 am, Behdad Esfahbod wrote:
> > On Sat, 22 May 2004, Munzir Taha wrote:
> > > hexdump gives:
> > > 0000000 a0d9 d90a 0aa1 a2d9 d90a 0aa3 a4d9 000a
> > > 000000f
> > >
> > > Can you explain to me how to relate these numbers to
> > > 0660	ARABIC-INDIC DIGIT ZERO
> > > 0661	ARABIC-INDIC DIGIT ONE
> > > 0662	ARABIC-INDIC DIGIT TWO
> > > 0663	ARABIC-INDIC DIGIT THREE
> > > 0664	ARABIC-INDIC DIGIT FOUR
> >
> > Have a look at /usr/share/i18n/charmaps/UTF-8.gz
> > It's really simple.  For example, \xd9\xa0 is 0660 and so on.
> > You better read the UTF-8 RFC once, it's lots of fun.
>
> Thanks for the info. Now, I can understand how 660=d9a0 but hexdump gives it
> as a0d9. Has this any thing to do with big endianness

It's little-endianness, in fact.  By default hexdump groups the
bytes into 16-bit words, and on a little-endian machine the byte
pair \xd9\xa0 is read as the 16-bit integer 0xa0d9.  The -b
option to hexdump shows one-byte octal instead.
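
If it helps to see both steps worked out, here is a rough Python
sketch.  It is not how hexdump is implemented, and the names
utf8_two_byte and little_endian_words are made up for this example.
It assumes the file holds the five digits, one per line, and that
the machine is little-endian.

```python
import struct

# Assumed file contents: ARABIC-INDIC DIGITS ZERO..FOUR, one per line.
data = "\u0660\n\u0661\n\u0662\n\u0663\n\u0664\n".encode("utf-8")


def utf8_two_byte(code_point):
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    assert 0x80 <= code_point <= 0x7FF
    lead = 0xC0 | (code_point >> 6)    # 110xxxxx carries the top 5 bits
    cont = 0x80 | (code_point & 0x3F)  # 10xxxxxx carries the low 6 bits
    return bytes([lead, cont])


def little_endian_words(raw):
    """Group bytes into 16-bit little-endian words, shown as 4 hex digits."""
    padded = raw + b"\x00" * (len(raw) % 2)  # pad odd-length input with a zero
    return ["%04x" % word for (word,) in struct.iter_unpack("<H", padded)]


print(utf8_two_byte(0x0660).hex())          # d9a0
print(" ".join(little_endian_words(data)))  # a0d9 d90a 0aa1 a2d9 d90a 0aa3 a4d9 000a
```

The second line printed matches the hexdump output quoted above:
the bytes on disk are still d9 a0 0a ..., only the display swaps
each pair.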


--behdad
  behdad.org