
Re: Volunteers for verifying the quran data

Salaam Meor,

Insha'Allah, we fully intend to verify your Quran text data by counting each base letter in each individual verse and comparing these verse-by-verse counts with our text, which was encoded completely separately from yours, so the likelihood of the same error being in both your text and our text is very small. Actually, we have already started this verification, but we got busy due to travels (I am in Turkey right now). God willing, I want to continue this kind of verification once I get back. Once the base letters are verified, a visual verification of the correctness of the actual marks can be made. Since the base letters are the most important, we are planning to do that first.
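
A minimal sketch of such a verse-by-verse base-letter comparison, in Python. It is only an illustration, not the actual script used for this verification. It assumes that both texts are Unicode strings keyed by (sura, aya), that "base letter" can be approximated as any character of Unicode category "Lo" after NFD decomposition (so marks are dropped and, for example, alef with hamza above counts as a plain alef), and the names base_letter_counts and compare_texts are hypothetical.

```python
import unicodedata
from collections import Counter


def base_letter_counts(verse: str) -> Counter:
    """Count base letters in one verse, ignoring marks.

    Assumption: a base letter is any character whose Unicode general
    category is "Lo" once the verse is NFD-decomposed. Combining marks
    (category "Mn"), tatweel, spaces and annotation signs are skipped.
    """
    decomposed = unicodedata.normalize("NFD", verse)
    return Counter(ch for ch in decomposed if unicodedata.category(ch) == "Lo")


def compare_texts(text_a: dict, text_b: dict) -> list:
    """Compare two texts given as {(sura, aya): verse} dictionaries.

    Returns a list of ((sura, aya), diff) entries, where diff maps each
    letter whose count differs to the pair (count_in_a, count_in_b).
    A verse missing from one text is treated as an empty verse.
    """
    mismatches = []
    for key in sorted(set(text_a) | set(text_b)):
        counts_a = base_letter_counts(text_a.get(key, ""))
        counts_b = base_letter_counts(text_b.get(key, ""))
        if counts_a != counts_b:
            letters = set(counts_a) | set(counts_b)
            diff = {
                ch: (counts_a[ch], counts_b[ch])
                for ch in letters
                if counts_a[ch] != counts_b[ch]
            }
            mismatches.append((key, diff))
    return mismatches
```

Because only base letters are counted, two texts that differ in how the marks are encoded still compare equal, and any verse that shows up in the result is a candidate for manual inspection.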

God bless you for your hard work.

Kind regards,

---------- Original Message ----------------------------------
From: Meor Ridzuan Meor Yahaya <meor dot ridzuan at gmail dot com>
Reply-To: General Arabization Discussion <general at arabeyes dot org>
Date:  Mon, 11 Jul 2005 09:15:38 +0800

>Let me explain some of the questions/queries.
>First, Nadim's questions. As of now, no one has given me any feedback
>whatsoever, except for Mete, who said that he had done some
>character counting for some suras. This could mean one of two things:
>1, someone is working on it and has not found any problems, and thus
>has nothing to report, or 2, no one is doing any proofreading. It seems
>to me it is number 2. As for
>the PDF, the problem is the communication between me and the site
>provider, Mr Thariq. He's in Pakistan, and I'm in Malaysia. As we
>know, the internet link to Pakistan still has problems, thus I can't
>update the site. (I don't have admin access yet.) I will upload the PDF
>asap. If you want me to mail it to you, let me know, then you can
>upload it to arabeyes CVS.
>Second, Gregg's concern about the encoding approach that I use. I agree,
>you do have valid points, but you probably missed some of my earlier
>mail mentioning the reason I chose that option. FYI, with the current
>font and data, I can create a web application that will be able to
>play the recitation of each aya, and display the translation of the
>aya. I think it is in a very useful state already. The real reason I
>chose that option is as follows:
>1. Font availability. As far as I know, there is not a single font
>that can really display the Quran as in the Madinah mushaf from Unicode
>text. The best that we can get, I think, is MS' Arabic Typesetting, and
>2 fonts from SIL (Mr Thomas mentioned to me he is working on one font,
>but has not released it yet). I just discovered the font from SIL
>recently, maybe they just released it. I've seen the font from MS, it
>is complete and well behaved according to Unicode, but still lacks
>some glyphs needed to render the Mushaf in a complete manner. One thing
>I can tell you, I'm not impressed with the font, other than it being a
>complete Unicode font. Furthermore, it is very difficult to get. It
>is only distributed with the MS Proofing Tools, which are sold
>separately from Office 2003. The SIL font looks acceptable, but it is a
>very basic font. It does not have any alternative glyphs, for example
>for the character yehnoonfinal. Also, the font is not GPLed or otherwise
>open source licensed, so I'm not sure whether we can distribute it. On the other hand,
>I think the font that I'm working on has better looking glyphs, which
>are taken directly from the Madinah mushaf, as you can see from the
>screenshot at the website. The font distributed at the site is a very
>old font, which is based on the arabeyes-ar font, which is based on
>the KACST-QURN font. I'm trying to update the site with the new font,
>but have some problems with the link, as I mentioned above. The latest
>font that I'm working on will have the best features. You can color
>each glyph individually, even for a complex ligature such as lamalef.
>I've sent a screenshot showing the font features to Mete, maybe he can
>give some opinion about it. So, in short, even if I make the text
>compatible with other fonts, there will still be problems displaying it
>properly. My font probably won't be GPLed, but I will allow
>non-commercial distribution. (This is totally a separate topic/issue.)
>2. The encoding approach. I chose it because it is the easiest way
>for me to implement a font which will work on most platforms,
>that is MS Windows and Linux. For example, the fathatan + small low
>meem to indicate the sequential fathatan. Why small low meem? Well,
>I've tried many combinations, and that turns out to work the best.
>Under Windows specifically, if I choose another code point, it will do
>the following: change the direction of the text, which I can't fix
>even by inserting an RTL or LTR mark, and change the shape of the
>characters before and after the code point. These 2 problems are not
>easy to fix or hack around. With my approach, it changes neither the
>text direction nor the shape of the characters around it, and is thus
>an easier hack to implement in the font domain. Plus, it does not have
>any other meaning or use in the Quran.
>Having said that, you suggest that I use personal code points and
>create a font that will display each character and mark individually.
>That is fine, and not very hard to implement actually, but the
>question is what benefit I will get by doing this other
>than the extra work. To let everyone know, my main source of the text
>is NOT the XML file. The XML file is translated from other sources.
>Thus, I can just change the map for each character, and get it
>translated again in a few minutes. So, to implement it the way you
>mention, it probably won't take more than an hour to create the new
>file, but again, what is the benefit? We can't render it properly.
>Will we get more people involved in proofreading it? I don't see that
>happening either, because the process will be more difficult. Will
>you proofread it if that is the case? If so, let me know, I'll send
>you the file personally so you can start.
>Conclusion: proofreading has not made much progress. Actually, I
>don't mind if other people take the lead, as long as people let me
>know what's going on, where the mistakes are etc, so I can make the
>necessary changes.
>FYI, I'm now working on merging my work with openburhan. Openburhan is
>available on the net, and basically gives you the root word of each of
>the Quran's words. It is a very good effort. However, I found some
>mismatches between the word count (not character count, which will
>never match) of my text and that of the openburhan text. I've not
>investigated further why, but I'll let everyone know the results.
>On 7/10/05, Gregg Reynolds <gar at arabink dot com> wrote:
>> Mete Kural wrote:
>> >
>> > Another nice program to use that shows Unicode codepoints
>> > automatically as you edit is UniPad: http://www.unipad.org/main/
>> >
>> And I also realized that there is another, graphical method to
>> edit/verify the underlying text data.  That is for Meor to create a
>> "proofreading" font by copying his Quranic font and changing the mapping
>> tables to make the mapping transparent.  I.e., so that each visible mark
>> is produced by exactly one codepoint in the underlying textual data.
>> Then, for example, if you see a low small meem in the rendered display,
>> you know that it corresponds to exactly one <low small meem> in the text
>> stream.  Conversely, every <low small meem> in the text stream produces
>> exactly one low small meem glyph and nothing else.
>> A proofreading version of the font might make for a somewhat ugly
>> display, but for proofreading purposes that is ok.  Once the text is
>> certified, one would use the fancy font to get proper rendering.
>> -gregg
>> _______________________________________________
>> General mailing list
>> General at arabeyes dot org
>> http://lists.arabeyes.org/mailman/listinfo/general

Mete Kural
Touchtone Corporation