
[OT] Re: Hamza, Unicode, Java



On Sat, 23 Nov 2002, Robert Kosara wrote:

>   But my problem is this: I have a function that translates from a
> transliteration (the one used in ArabTeX) to Unicode. It works quite well
> for most things, but one thing that just doesn't work properly is the
> hamza in combination with a letter. I tried both \u0621 and \u0654, but
> the former produces an isolated hamza, and the second only shows me an
> empty rectangle. And it's not even that Java creates a combined letter
> (that I don't have the font for), because when I write bi'r, for
> example, the ya is printed, and after it the rectangle ...

U+0654 (ARABIC HAMZA ABOVE) is the right character, but it may not work
because your font may lack a glyph for it, or the Arabic rendering engine
may not support it, since it was only added in Unicode 3.0. I would
recommend a post-processing step that replaces each common letter followed
by Hamza Above with the corresponding precomposed character. For example,
an Alef followed by a Hamza Above should be replaced with U+0623, ARABIC
LETTER ALEF WITH HAMZA ABOVE.
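
A minimal sketch of such a post-processing step, in case it helps. The
class name HamzaComposer and the method compose are made up for
illustration, and the table covers only Alef, Waw, and Yeh; extend it as
your transliteration requires.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (name and scope are assumptions, not an existing API):
// replaces a base letter followed by HAMZA ABOVE with its precomposed form.
final class HamzaComposer {
    private static final char HAMZA_ABOVE = 0x0654;

    // Base letter -> precomposed letter with hamza above.
    private static final Map<Character, Character> WITH_HAMZA_ABOVE = new HashMap<>();
    static {
        WITH_HAMZA_ABOVE.put((char) 0x0627, (char) 0x0623); // ALEF -> ALEF WITH HAMZA ABOVE
        WITH_HAMZA_ABOVE.put((char) 0x0648, (char) 0x0624); // WAW  -> WAW WITH HAMZA ABOVE
        WITH_HAMZA_ABOVE.put((char) 0x064A, (char) 0x0626); // YEH  -> YEH WITH HAMZA ABOVE
    }

    static String compose(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character combined = WITH_HAMZA_ABOVE.get(c);
            boolean hamzaFollows = i + 1 < s.length() && s.charAt(i + 1) == HAMZA_ABOVE;
            if (combined != null && hamzaFollows) {
                out.append(combined.charValue());
                i++; // skip the combining hamza that was just consumed
            } else {
                out.append(c); // anything else passes through unchanged
            }
        }
        return out.toString();
    }
}
```

For bi'r, the transliterator would emit Beh, Yeh, Hamza Above, Reh, and
compose() would turn the middle pair into U+0626, which the rendering
engine can then join to its neighbours as usual.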

>   And when I do the conversion myself (inserting a single \ufe8c), the ba
> and ra are not connected to it ...

Don't ever use U+FExx characters unless you know what you are doing (i.e.
you are writing some piece of software for a platform that doesn't
support Arabic rendering). They are *compatibility* characters
(presentation forms), and are more or less deprecated.
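
If presentation forms have already crept into your strings, one way to
fold them back to ordinary letters is compatibility normalization. The
sketch below assumes a Java runtime that provides java.text.Normalizer
(Java 6 or later); the class name PresentationForms and the method
toBaseLetters are made up for illustration.

```java
import java.text.Normalizer;

// Hypothetical helper (the name is an assumption): maps Arabic presentation
// forms such as the U+FExx block back to the regular Arabic letters, so that
// the rendering engine can choose the contextual shapes itself.
final class PresentationForms {
    static String toBaseLetters(String s) {
        // NFKC applies compatibility decomposition, then canonical composition.
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }
}
```

With this, the medial form U+FE8C should come back as U+0626, YEH WITH
HAMZA ABOVE, which connects to the surrounding letters normally. Note
that NFKC also folds other compatibility characters (ligatures,
full-width forms), so apply it only where that is acceptable.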

>   Is there is some documentation somewhere on how the signs are combined
> and how I can influence that? Because I would also like to be able to have
> auxilliary vowels on demand, and for that I guess I need to know a little
> more about how the things work together.

Yes, it's called "The Unicode Standard, Version 3.0", ISBN 0-201-61633-5.  
I would also recommend "Java Internationalization", ISBN 0-596-00019-7.

BTW, vowels other than Hamza Above, Hamza Below, and Madda Above should
work out of the box on your system.

roozbeh