
Re: BIDI tags Unicode values



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 30 Aug 2002 12:03:49 -0700 (PDT)
Nadim Shaikli <shaikli at yahoo dot com> wrote:

> I certainly don't mean to take anything away from Mr. Sameer's (and
> your) exceptional work and efforts, but "keyboard emulation" is
> there and has been there in most Arabic supporting editors
> (emacs/vim/yudit/etc).

Yes, I was only proposing that we think of more ways of using this
feature. It was also not the obvious direction for katoob to go, since
there was no pressing need for it: you could just configure Xkb.

> I'm not sure what this means, could you elaborate a bit ?  What's
> wrong with writing a UTF-8 html file (along with noting the encoding
> format in the document's header) and have the browser do the correct
> display.
Nothing wrong at all. However, numeric character references exist and
are used.

> Are you talking about old browser support or coming up with
> solutions for browsers that don't support UTF-8 ?  I guess I'm
> missing some scenario where the above doesn't work; if so could you
> please present it.
No, it only works with browsers that support Unicode. The difference is
that it displays correctly when the whole document's encoding is not
set to UTF-8, so you can embed some Arabic text in a non-UTF-8 encoded
document, or use it in documents that don't set an encoding at all.
Neither is good HTML practice, I know, but it's not up to katoob to
educate you :-)
This also means that the text displays correctly even if the user
overrides the current encoding by choosing one manually.
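
A small sketch of why this holds, in Python (chosen only for
illustration; it says nothing about how katoob does it). The helper
name to_char_refs is made up for this example. The reference form is
pure ASCII, so it reads the same under any ASCII-compatible charset,
while raw UTF-8 bytes get misread when the charset is guessed wrong:

```python
def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference like &#1575;."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


arabic = "\u0627\u0647\u0644\u0627"  # the four Arabic letters alef, heh, lam, alef
refs = to_char_refs(arabic)

# The reference form is plain ASCII, so its bytes decode to the same
# string whichever of these charsets a browser ends up using.
raw = refs.encode("ascii")
decoded = {raw.decode(cs) for cs in ("utf-8", "cp1256", "iso-8859-1")}

# Raw UTF-8 bytes read as Windows-1256 do not give the original text back.
misread = arabic.encode("utf-8").decode("cp1256", errors="replace")
```

Note that this helper leaves ASCII characters such as & and < alone,
so it is not a substitute for proper HTML escaping.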

Here is what the W3C says about them:
	"A given character encoding may not be able to express all
	 characters of the document character set. For such encodings,
	 or when hardware or software configurations do not allow
	 users to input some document characters directly, authors may
	 use SGML character references. Character references are a
	 character encoding-independent mechanism for entering any
	 character from the document character set."

http://www.w3.org/TR/REC-html40/charset.html#h-5.3

I still believe that it is important to have this feature.
A couple of days ago I was experimenting with writing Arabic in Galeon
and Mozilla, submitting to forums and sending myself emails from
webmail services.
When I didn't explicitly choose an encoding in the browser and let
Galeon/Mozilla use the original web page's encoding, it sent the text
as numeric character references. In the forum this displayed correctly
and was fine; the webmail message I sent myself, on the other hand,
looked like this:
&#1575;&#1607;&#1604;&#1575; 
&#1610;&#1575;&#1583;&#1606;&#1610;&#1575;

To read this I had to add HTML tags to it and open it in a web
browser.
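
For illustration, here is roughly what "reading" such a mail body
amounts to, as a Python sketch using the standard library's
html.unescape. The function name from_char_refs and the variable names
are made up for this example, and this is not a claim about how any
editor implements it:

```python
import html


def from_char_refs(text: str) -> str:
    """Turn references such as &#1575; back into the characters they name."""
    # html.unescape also decodes named entities like &amp;, which an
    # editor may or may not want; that choice is left open here.
    return html.unescape(text)


# The two lines from the webmail message above, joined with a space.
mail_body = "&#1575;&#1607;&#1604;&#1575; &#1610;&#1575;&#1583;&#1606;&#1610;&#1575;"
readable = from_char_refs(mail_body)
print(readable)
```

The result is two Arabic words, four and six letters long, that can be
edited as ordinary text.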

Now this means that, at least for now, if one GNU/Linux user uses
Galeon/Mozilla to send Arabic mail to another, either she has to
remember to set the encoding manually (since most web pages are not
using UTF-8 yet), or the one who receives the mail has to add some
HTML before being able to read it, and has to copy and paste it into
some editor to be able to edit and respond to it.
If the Arabic text editor could read these references, the problem
would be solved.
At least partly: it is still a problem if the mail is sent to someone
who isn't using Linux, but choosing the encoding does not solve all
problems either (how many mail clients support UTF-8?), etc.

The interesting thing is that, at least on the Linux-Egypt.org forum,
character references are the best way to post Arabic text, since the
forum does not set an encoding and members send posts in whatever
encoding they feel like (it's mostly Windows-1256, but you get the
occasional UTF-8 post). So you always have to choose encodings
manually to read Arabic posts; sending numeric references avoids the
problem altogether.

cheers,
Alaa
- -- 
Perilous to all of us are the devices of an art deeper than we
ourselves possess.
                -- Gandalf the Grey [J.R.R. Tolkien, "Lord of the Rings"]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)

iD8DBQE9b+IzqIWBQ7ecPHcRAvLvAJ0fulAEVAOriraMhMOQAU1b5tyLEwCfeg7b
kHetDOL+zr0HfSvnBKo/iZ8=
=Y01s
-----END PGP SIGNATURE-----