
Re: Small Alef (was Re: Standalone Superscript Alef (Item 8))



On Saturday 26 June 2004 10:19, Mete Kural wrote:
> Salaam Mohammed,
>
> > I would please ask you to at least read the point
> > about Normalization of
> > the Qur'an text in this post carefully and comment
> > on it.
>
> I can think of alternative ways of doing the
> normalization you referred to. A more intelligent
> algorithm could detect alef_maksura+superscript_alif
> and other similar sequences and normalize them
> accordingly.

This won't work. A superscript alef cannot be detected from its context,
because it's a vowel sign and can sit on top of ANY letter, not just
alef_maksura. There are no pre-defined letters or sequences that a superscript
alef can only be attached to, and the superscript alef is not specific to the
Qur'an at all. So you can't just hard-code sequences: the same sequence can
contain a superscript alef in one text and a small alef in another.

Hence, the normalization algorithm would have to know the exact locations of
the superscript alefs in the Qur'an, and it wouldn't be usable for anything
else.

For example, if a document quotes a verse from the Qur'an and that document
needs to be normalized for spellchecking and it contains the misspelled
non-Quranic word:
برءٰؤا
(It's considered misspelled because a small alef cannot be used here, and
thus it's a superscript alef and hence there is a missing alef here)

Yet the document also has some verses of the Qur'an, and one of them has the
correctly spelled word:
برءٰؤا

(It's considered correctly spelled because in Qur'anic texts, small alef may
be used)

After normalizing the two words they become the same word:
برءاؤا
And the spellchecker reports that there are no spelling mistakes although
there is one in the first word.
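
The collision can be sketched in a few lines of Python. This is only an
illustration: naive_normalize is a hypothetical helper that spells out every
U+0670 as a full alef (U+0627), not any real Unicode normalization form.

```python
SUPERSCRIPT_ALEF = "\u0670"  # the single codepoint shared by both uses
ALEF = "\u0627"

def naive_normalize(text: str) -> str:
    """Hypothetical normalizer: replace every U+0670 with a full alef."""
    return text.replace(SUPERSCRIPT_ALEF, ALEF)

# Both words are stored as exactly the same codepoint sequence.
misspelled_word = "\u0628\u0631\u0621\u0670\u0624\u0627"  # non-Quranic, alef missing
quranic_word = "\u0628\u0631\u0621\u0670\u0624\u0627"     # Quranic, small alef

print(misspelled_word == quranic_word)                                    # True
print(naive_normalize(misspelled_word) == naive_normalize(quranic_word))  # True
```

Since the inputs are already identical, no function of the text alone can
tell them apart; only outside knowledge of where the Qur'anic quotation
starts and ends could.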

Let alone the shaping and rendering problems. For example, how would you
describe that single character in ArabicShaping.txt?
It's treated there as a transparent character that has no effect on the
shaping process, and this is correct for the superscript alef.
But the small alef does affect the shaping process and cannot be considered
transparent.

And will you put a dotted circle below it in the code chart or not?
Adding such a circle is okay for the superscript alef, but what
about the small alef, which is a base character and not an NSM?


> Such intelligent algorithms are necessary
> in Arabic Quran searching anyways; for instance in
> order to detect instances of the word "Allah" in
> various grammatical contexts without including words
> such as "allahumma" (refer to Abdulbaki's Quran
> index).
>

No, they are not really necessary.
Any general algorithm should be able to search in the Qur'an (for
the example you gave, it's really, really simple because the first
four letters of allahumma are the word Allah).
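
As a rough sketch of such a general search, assuming the vowel signs are
ordinary nonspacing marks (General_Category Mn) that may be ignored when
matching. strip_marks is a hypothetical helper, and the vocalized spelling
used below is only illustrative.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Drop nonspacing marks (vowel signs, shadda, superscript alef)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

ALLAH = "\u0627\u0644\u0644\u0647"  # alef, lam, lam, heh

# "allahumma" written with shadda, superscript alef, damma and fatha
allahumma = "\u0627\u0644\u0644\u0651\u0670\u0647\u064F\u0645\u0651\u064E"

bare = strip_marks(allahumma)
print(bare.startswith(ALLAH))  # True: the first four letters are "Allah"
print(bare == ALLAH)           # False: a whole-word comparison excludes it
```

Nothing here is specific to the Qur'an; it relies only on the marks being
identifiable as marks from their Unicode properties.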

I agree that advanced searching options require a dedicated search engine,
but for something as simple as differentiating between a letter and a vowel
sign, it would be completely wrong to rely on special algorithms that are
specific to the Qur'an and that are not going to be implemented in a simple
text editor (that is, a simple text editor must be able to differentiate
between letters and vowel signs without requiring special algorithms for
every book out there).

A text editor must be able to recognize a letter from its properties in
Unicode and must recognize a vowel sign from its properties in Unicode
not by hard-coding sequences and such (which will still not work anyway).
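
A minimal sketch of what "recognizing from properties" means, using Python's
standard unicodedata module. The helper names is_letter and is_vowel_sign are
hypothetical; the category and combining-class lookups are the real ones.

```python
import unicodedata

def is_letter(ch: str) -> bool:
    """True for base letters (General_Category L*)."""
    return unicodedata.category(ch).startswith("L")

def is_vowel_sign(ch: str) -> bool:
    """True for nonspacing marks (General_Category Mn)."""
    return unicodedata.category(ch) == "Mn"

# ALEF, ALEF MAKSURA and the disputed U+0670
for ch in ("\u0627", "\u0649", "\u0670"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          unicodedata.category(ch), unicodedata.combining(ch))
```

U+0670 comes back as a nonspacing mark, so an editor that trusts the
properties will always treat it as a vowel sign. A single codepoint cannot
report itself as a base letter in Qur'anic text and as a mark elsewhere.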

> We have sent each other tens of emails already
> regarding this superscript alef issue and we do not
> seem to agree on it. If I had the time right now I
> could discuss this dagger alef issue with you further,
> but unfortunately I do not. That is why I am not able
> to respond to the points you make in your last email.
> Insha'Allah in time our differences may resolve.

You should be able to recognize that this is not a problem between the two
of us. I'm not giving my opinions here, I'm stating facts.

But since I don't like to be the one who is holding back, I keep discussing
facts with you and I keep noting the various reasons why your suggestions
won't work.

However, I don't think that Unicode would encode an Arabic letter and a vowel
sign using the same codepoint even if your solutions worked.
This is not logical at all: vowel signs have nothing to do with Arabic
letters.

BTW: No need for that "dagger" attitude.

> But 
> at this time I will suggest you that if you wish to
> submit a proposal for a new dagger alef codepoint
> please do it as a separate proposal. This way we
> jointly propose on items that we agree on, and submit
> separate proposals for items that we do not agree on.
> It is better than separately submitting two different
> proposals.
>
> If you wish to get familiar with the process of
> submitting a proposal to Unicode, we can submit the
> joint proposal first so that you gain that experience.
> And afterwards if you wish you can submit your dagger
> alef proposal since you will already know the details
> of submitting a proposal.
>
> As I am telling you, unfortunately I won't be able to
> allocate more time in the near future for discussing
> the dagger alef issue with you. We may return to it at
> a later date when I have more time.
>
> Please respond to us regarding your decision.
>
> Kind regards,
> Mete

Do you think that I'm not busy?
We are all busy, and time can always be allocated, but for something as
important as the Qur'an, I'm willing to sacrifice anything else (including
a job).

Anyway, this is not applicable here since we have plenty of time and no need
to rush.

We will wait for you until you have some time to allocate for this extremely
important issue.

I think also that this will leave some time for all of us.

-- 
Mohammed Yousif
Egypt