Seems like we're getting close to the answer now.
Generally speaking, putting aside the Qur'an, searching inside, say,
a book written by someone can be divided into two extreme cases:
Google-like letter-by-letter matching, and keyword/thesaurus-based
index searching.
If one is to use literal matching, one should more or less
understand the author's phraseology or, in a language which allows
multiple spellings, also the author's spelling habits. There's more:
differences of meaning by locality, time, etc.
As for a keyword index, first, let's imagine the index of a book. Since
we're in the cyber age, it can be integrated into a literal search
engine, so that users can search without being aware of the existence
of the index. You have to pick out the keywords first. Then, the
relations between terms are also shown in a good index. So under the
heading "computer", there are sub-entries like "computer science" and
"computer virus", and it reads "see also server, PC, calculator". Maybe
behind it is a tree-like diagram running from general words indicating
genre down to more specific ones.
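
To make the idea a bit more concrete, here is a rough sketch of such an index in Python. Everything in it is my own assumption for illustration: the name INDEX, the function lookup, the dictionary layout, and the page numbers are all made up. It only shows how a search could quietly follow sub-entries and "see also" links without the user knowing the index exists.

```python
# Hypothetical index: each heading maps to made-up page numbers,
# its sub-entries, and its "see also" cross-references.
INDEX = {
    "computer": {
        "pages": [3, 17],
        "sub": ["computer science", "computer virus"],
        "see_also": ["server", "PC", "calculator"],
    },
    "computer science": {"pages": [21], "sub": [], "see_also": []},
    "computer virus": {"pages": [48], "sub": [], "see_also": []},
    "server": {"pages": [52], "sub": [], "see_also": []},
    "PC": {"pages": [5], "sub": [], "see_also": []},
    "calculator": {"pages": [9], "sub": [], "see_also": []},
}


def lookup(term, index=INDEX):
    """Return pages for the term plus its sub-entries and see-also headings."""
    entry = index.get(term)
    if entry is None:
        return {}
    result = {term: entry["pages"]}
    # Follow one level of sub-entries and cross-references.
    for related in entry["sub"] + entry["see_also"]:
        related_entry = index.get(related)
        if related_entry is not None:
            result[related] = related_entry["pages"]
    return result


hits = lookup("computer")
```

Here a search for "computer" also brings back the pages for "computer virus" and "server", which plain letter-by-letter matching would miss for "server". The hard part, choosing the headings and the links, is still human work.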
(I don't know the name of this field of study. It is related to bibliography.)
I have never experienced it, but indexing appears to me a really
demanding task. Even if aided by PC power, it consists of human work
and work and work. But those who wish to explore the
book will benefit greatly. And by indexing, you come to understand the
book deeply.
Now forget all that I said above, and come back to the Qur'an. Since
it is al-Qur'an, it has nothing to do with a particular novelist's
writing peculiarities. Yes, it IS the one. But it is not so simple; it
is surely difficult for many people to search through.
One thing that is clear is that there are Arabic-speaking users and
non-Arabic-speaking users, probably Muslims. Needs vary among them.
Perhaps clarifying the next job to be done would reduce the complexity.
What Nicholas (Heer) and Mete referred to appears a high peak to climb
for a first attempt, although marvelous once accomplished. And if it is
the one to tackle, it may well be worth an independent new
project, related to but outside of the "Qur'an" project, because it is
quite applicable to other classical materials. I think your concern
about yeh and hamza can establish a "better-than-Google".
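
As a rough sketch of what handling yeh and hamza could mean in a literal search, the Python below folds spelling variants to one form before matching. The folding table is purely my assumption, not a standard and not anything the project has decided; the names FOLD, normalize and find_all are made up for this example.

```python
# Assumed folding table (not a standard): letters carrying hamza or madda
# go to their bare form, and yeh look-alikes go to Arabic yeh.
FOLD = str.maketrans({
    "\u0622": "\u0627",  # alef with madda above -> alef
    "\u0623": "\u0627",  # alef with hamza above -> alef
    "\u0625": "\u0627",  # alef with hamza below -> alef
    "\u0624": "\u0648",  # waw with hamza above -> waw
    "\u0626": "\u064A",  # yeh with hamza above -> yeh
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u06CC": "\u064A",  # Farsi yeh -> Arabic yeh
})


def normalize(text):
    """Fold the hamza and yeh variants listed in FOLD to one spelling."""
    return text.translate(FOLD)


def find_all(query, text):
    """Return start offsets where the folded query occurs in the folded text."""
    q, t = normalize(query), normalize(text)
    if not q:
        return []
    hits = []
    start = t.find(q)
    while start != -1:
        hits.append(start)
        start = t.find(q, start + 1)
    return hits
```

Because every mapping is one character to one character, the offsets returned by find_all also point into the original text. Whether each of these foldings is appropriate for the Qur'an text is exactly the kind of decision that would need discussion.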
Good afternoon.
"Oibane"