From: Gregg Reynolds <gar at arabink dot com>

Now, IMO a difficult design
question is whether some true morphemes should in fact be encoded.
Obvious examples: definite article, other particles like laa,
sawfa, sa-, direct object suffixes -hu, -ha, etc. Unicode will
never countenance something like that, but that doesn't mean we
shouldn't. Such design decisions should be made strictly on a
costs/benefits basis, IMO.

I'd like to restate my opinion here: such morphemic encoding is
better done at the markup level. That is, encode the characters
graphemically using Unicode, and then encode the morphemes at the
markup level using an appropriate XML schema.
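
To make the layering concrete, here is a minimal sketch in Python of
what I have in mind. The element names (w, m) and the attributes
(type, gloss) are illustrative assumptions on my part, not taken from
any agreed schema, and the sample word (sa-yaktubu-hu, "he will write
it") is just an example. The character data stays plain Unicode; only
the markup carries the morpheme boundaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical morpheme inventory for one word: (type, gloss, text).
# The tag and attribute names used below are assumptions for illustration.
MORPHEMES = [
    ("particle", "future", "س"),     # sa-
    ("stem", "imperfect", "يكتب"),   # yaktubu
    ("suffix", "object-3ms", "ه"),   # -hu
]


def build_word(morphemes):
    """Wrap each morpheme's Unicode text in its own <m> element under <w>."""
    word = ET.Element("w")
    for kind, gloss, text in morphemes:
        m = ET.SubElement(word, "m", {"type": kind, "gloss": gloss})
        m.text = text
    return word


def surface_form(word):
    """Strip the markup and return the plain Unicode character string."""
    return "".join(word.itertext())


word = build_word(MORPHEMES)
print(ET.tostring(word, encoding="unicode"))  # morpheme-level view
print(surface_form(word))                     # character-level view
```

Dropping the markup gives back the ordinary graphemic string, so
software that knows nothing about morphemes still sees normal Unicode
text, while a morphology-aware tool can read the boundaries from the
markup.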