
Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts



Mete Kural wrote:


It still seems to me that the tanween needs to be kept intact whether there is idgham or not, since the idgham is determined by the next word. Two instances of the exact same indefinite noun may or may not employ idgham depending on what word follows. If we don't keep the tanween intact, then, for instance, it won't be possible to search for the indefinite form of a noun and get consistent search results unless both the with-idgham and without-idgham forms of the word are searched. But if the tanween is kept intact, then it should be possible to simply substring-search for the regular no-idgham form of the word and get both with-idgham and without-idgham instances.

I see your logic, but I think the counter-argument is stronger:

a. It's true that idgham is determined by the initial consonant of the following word, but that's a higher-level issue. In written text the idgham mark is explicitly indicated by a particular letterform, distinct from "ordinary" tanween. Therefore we want to encode it; then it becomes possible to search for idgham without any special logic in the software.

b. The interpretation of tanween as a marker of indefiniteness is also a higher-level issue. If you bring it up with the Unicode folks the defense shields will go up instantly. For an encoding design, "tanween" should only mean exactly what it says: appending a nuun phoneme in pronunciation. "tanween idgham" indicates a phonological variant of "tanween".

c. Searching for an indefiniteness marker necessarily involves looking for several different marks in printed text; machine search for <tanween> and <tanween idgham> would be no different. Personally I don't view a regex involving e.g. [aui][ρΡ%] as very troublesome. If you indicate idgham with a modifier following plain tanween, you may simplify search specifications for some purposes, but at the cost of added complexity in the software to handle idgham rendering.
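
To make point (c) concrete, here is a rough Python sketch of the kind of search I have in mind. It is only an illustration under stated assumptions: the three plain tanween marks are the existing Unicode characters U+064B to U+064D, but the three "tanween idgham" marks are stand-ins taken from the Private Use Area, since no such code points have been assigned. The helper names any_tanween and idgham_only are made up for this example.

```python
import re

# Existing Unicode tanween marks.
FATHATAN, DAMMATAN, KASRATAN = "\u064B", "\u064C", "\u064D"

# ASSUMPTION: hypothetical stand-ins for separately encoded
# "tanween idgham" marks. These are Private Use Area code points
# chosen only for illustration, not assigned characters.
FATHATAN_IDGHAM, DAMMATAN_IDGHAM, KASRATAN_IDGHAM = "\uE000", "\uE001", "\uE002"

PLAIN = FATHATAN + DAMMATAN + KASRATAN
IDGHAM = FATHATAN_IDGHAM + DAMMATAN_IDGHAM + KASRATAN_IDGHAM


def any_tanween(stem):
    """Match the stem followed by either kind of tanween mark."""
    return re.compile(re.escape(stem) + "[" + PLAIN + IDGHAM + "]")


def idgham_only(stem):
    """Match the stem followed by a tanween idgham mark only."""
    return re.compile(re.escape(stem) + "[" + IDGHAM + "]")


# Sample text: the same stem with plain tanween, with tanween idgham,
# and with no tanween at all.
stem = "\u0643\u062A\u0627\u0628"  # kaf, teh, alef, beh
text = " ".join([stem + DAMMATAN, stem + DAMMATAN_IDGHAM, stem])

all_hits = any_tanween(stem).findall(text)
idgham_hits = idgham_only(stem).findall(text)
```

With separate code points, finding every tanween form costs one slightly larger character class, and finding idgham alone needs no lookahead at the following word.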

-gregg