Re: C++ Unicode for Arabic
- To: Development Discussions <developer at arabeyes dot org>
- Subject: Re: C++ Unicode for Arabic
- From: Mohammed Yousif <mhdyousif at gmx dot net>
- Date: Thu, 25 Nov 2004 06:13:09 +0200
- User-agent: KMail/1.6.2
On Wednesday 24 November 2004 09:37, Nadir Durrani wrote:
> >>Alsallam Alukum;
> >>
> >>How about output? Does it work with <fstream.h> or not? Please
> >> advise.
>
> Oh yes, it does. Here is code that writes the Arabic letter alif to a
> file:
>
>
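> fstream obj("file.txt", ios::out | ios::binary);
>
> int unicode  = 0x06;  /* high byte of alif (U+0627) */
> int unicode1 = 0x27;  /* low byte of alif (U+0627) */
> /* unicode and unicode1 combine to form alif; integer is 2 bytes in
>    Borland C, with VC you can declare them together */
>
> obj << (char)0xFF;      /* byte order mark, first byte */
> obj << (char)0xFE;      /* byte order mark, second byte */
> obj << (char)unicode1;  /* low byte first (little endian) */
> obj << (char)unicode;   /* then the high byte */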
>
>
> The bytes FF FE mark the file as Unicode stored in little-endian format;
> you can then store alif as 27 06, and other characters similarly...
>
>
Please don't add the Byte Order Mark (BOM). It causes a lot of problems on
POSIX systems (e.g. Perl scripts that stop working), and the output file is
no longer a plain UTF-8 file (the practice has been pushed by Microsoft, which
is why Notepad can't save UTF-8 files without one).
The code points U+FFFE and U+FFFF must not occur in a UTF-8 file, and the
bytes 0xFE and 0xFF never appear in valid UTF-8 at all. If found, they should
be interpreted as a malformed sequence, not as a Unicode file identifier.
See:
http://www.cl.cam.ac.uk/~mgk25/ucs/ISO-10646-UTF-8.html
and:
http://www.cl.cam.ac.uk/~mgk25/unicode.html
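
For comparison, here is a minimal sketch of writing alif as UTF-8 with no BOM. The names encode_utf8 and write_utf8_file are made up for this example (they are not from any library), and the code assumes a C++11 compiler. Note that the bytes quoted above (FF FE 27 06) are what a UTF-16 little-endian file would contain; in UTF-8, alif (U+0627) is the two bytes D8 A7, and the encoder below can never emit 0xFE or 0xFF.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 (1 to 4 bytes).
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp <= 0x7F) {
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogates are not encodable
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    assert(out.size() <= 4);  // a UTF-8 sequence is never longer than 4 bytes
    return out;
}

// Hypothetical helper: write code points to `path` as UTF-8, without a BOM.
// Binary mode keeps the runtime from translating any byte on the way out.
bool write_utf8_file(const std::string& path, const std::u32string& text) {
    std::ofstream file(path, std::ios::out | std::ios::binary);
    if (!file) return false;
    for (char32_t cp : text) {
        file << encode_utf8(cp);
    }
    return static_cast<bool>(file);
}
```

Calling write_utf8_file("file.txt", U"\u0627") should leave a two-byte file holding just the alif, which POSIX tools can read as ordinary UTF-8.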
--
Mohammed Yousif
Egypt