Re: RFC libquran: Data packaging
- To: "Development Discussions" <developer at arabeyes dot org>
- Subject: Re: RFC libquran: Data packaging
- From: "Mohsen Saboorian" <mohsens at gmail dot com>
- Date: Fri, 27 Jul 2007 09:59:01 +0330
We have an automatically generated simple Quran text (it is UTF-8, but
the only character it uses beyond Cp1256 is the small superscript alef),
used for Zekr 0.6.0beta1+. It is generated from Meor's detailed Quran
text and collated automatically against another Cp1256 Quran text.
The differences highlighted by this script were verified twice by a
group of three people. The resulting text is here:
Here are also some hamza shaping rules that were applied in order to
simplify the Uthman Taha text:
Please note that although the differences highlighted by this script
have been verified, there might still be some typos in the final Quran
text.
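For readers who want to try a similar collation, here is a minimal sketch of a word-level comparison between two texts. It is not the script used for Zekr; the function name `word_differences` is hypothetical, and it assumes both inputs are already plain Unicode strings with words separated by whitespace.

```python
import difflib


def word_differences(text_a, text_b):
    """Return (words_from_a, words_from_b) pairs for each stretch that differs.

    Hypothetical helper, for illustration only: both texts are split on
    whitespace and aligned with difflib.SequenceMatcher.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete" or "insert"
            differences.append((words_a[a_start:a_end], words_b[b_start:b_end]))
    return differences
```

Each returned pair could then be shown to a human reviewer, which is roughly the role the manual verification plays above.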
On 7/27/07, Mohsen Saboorian <mohsens at gmail dot com> wrote:
> > Thanks and JAK, I didn't know that.
> > I'm guessing the differences might be due to CP1256 not containing some
> > Unicode chars, e.g. small alef.
> > And yes please, I could use a sample of the differences.
> Actually, not all differences relate to the lack of characters in Cp1256.
> For example, ALEF_MAKSURA is usually written in Cp1256 as just ي or ى,
> without SMALL_SUPERSCRIPT_ALEF at the end of the word; however, you can
> see that it is written as ALEF in 5:31 in KFC. Here are some examples:
> Al-Ma'idah: 31 - King Fahd: يَا وَيلَتَا, Uthman Taha: يَـٰوَيْلَتَىٰ
> Al-Kahf: 77 - King Fahd: لاتَّخَذْتَ, Uthman Taha: لَتَّخَذْتَ
> Maryam: 74 - King Fahd: وَرِئْيًا, Uthman Taha: وَرِءْيًا
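The point in the quoted message about characters missing from Cp1256 can be checked mechanically. The sketch below is only an illustration and is not part of Zekr or libquran; the helper names `not_in_cp1256` and `describe` are hypothetical, and the check relies on Python's built-in `cp1256` codec.

```python
import unicodedata


def not_in_cp1256(text):
    """List the distinct characters of `text` that Cp1256 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("cp1256")
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


def describe(chars):
    """Format each character as 'U+XXXX NAME' for easier review."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, 'UNKNOWN')}" for c in chars]


# ALEF MAKSURA followed by SUPERSCRIPT ALEF, as at the end of the
# Uthman Taha spelling in 5:31.
print(describe(not_in_cp1256("\u0649\u0670")))
```

With that sample, only the superscript alef should be reported, since ALEF MAKSURA itself is available in Cp1256. Differences such as ALEF versus ALEF MAKSURA are spelling choices, and a check like this will not flag them.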