Re: Arabic Shaping Patch



As you requested, I have made the patch against CVS HEAD. I made it against
the lyx-devel module in CVS, since it seemed to be the most active one.
Please tell me if there are any problems with it. I would be really happy to
see Arabic shaping working properly in LyX.

There is still one shaping issue in LyX that this patch does not address:
complex letters. Some Arabic glyphs are a combination of two letters (one
glyph for two letters). They improve readability so much that they are
really necessary. I was not sure what to do about them, especially since
supporting them may involve heavy changes in other LyX files that I do not
want to combine with a simple patch.
If you know of an existing way in LyX to show two letters on screen as one
glyph, please tell me.
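
As a rough illustration of the problem (not part of the patch): the usual
case of two letters drawn as one glyph is lam followed by alef or one of its
variants. The sketch below only finds such pairs in an ISO-8859-6 string.
find_lam_alef_pairs and is_alef_variant are hypothetical names, not existing
LyX functions, and how LyX would then draw each pair as one glyph is left
open.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// ISO-8859-6 code of lam.
const unsigned char LAM = 0xe4;

// Hypothetical helper: alef (0xc7) and its madda/hamza variants
// (0xc2, 0xc3, 0xc5), which join a preceding lam into one glyph.
bool is_alef_variant(unsigned char c)
{
	return c == 0xc2 || c == 0xc3 || c == 0xc5 || c == 0xc7;
}

// Hypothetical helper: return the index of every lam that is directly
// followed by an alef variant in the ISO-8859-6 string s.
std::vector<std::size_t> find_lam_alef_pairs(const std::string & s)
{
	std::vector<std::size_t> pairs;
	for (std::size_t i = 0; i + 1 < s.size(); ++i) {
		unsigned char const c = static_cast<unsigned char>(s[i]);
		unsigned char const next = static_cast<unsigned char>(s[i + 1]);
		if (c == LAM && is_alef_variant(next))
			pairs.push_back(i);
	}
	return pairs;
}
```

Each returned index marks a position where the screen should show one glyph
instead of two, which is the part that would need changes elsewhere in LyX.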

Later
Isam Bayazidi


On Monday 18 November 2002 00:18, Dekel Tsur wrote:
> On Sat, Nov 16, 2002 at 01:11:27PM +0200, Isam Bayazidi wrote:
> > I really hope that the developers check this patch; no code that is
> > not Arabic-related has been touched. We would be glad to have this
> > patch included in the main LyX tree.
> >
> > It was made against 1.2.1. I can make it against CVS if it is
> > needed. Please CC " developer at arabeyes dot com ", it is the mailing
> > list of developers in the Arabeyes project.
>
> Please provide a patch against latest CVS.
Index: lib/languages
===================================================================
RCS file: /cvs/lyx/lyx-devel/lib/languages,v
retrieving revision 1.15
diff -p -u -3 -r1.15 languages
--- lib/languages	2002/07/03 14:18:31	1.15
+++ lib/languages	2002/11/17 23:21:21
@@ -1,7 +1,7 @@
 # name      babel name	GUI name	RTL?   encoding	  code	latex options
 afrikaans   afrikaans	"Afrikaans"	false  iso8859-1  af_ZA	 ""
 american    american	"American"	false  iso8859-1  en_US	 ""
-arabic      arabic	"Arabic"	true   iso8859-6  ar_SA	 ""
+arabic      arabic	"Arabic"	true   iso8859-6  ar	 ""
 austrian    austrian	"Austrian"	false  iso8859-1  de_AU	 ""
 bahasa      bahasa	"Bahasa"	false  iso8859-1  in_ID	 ""
 belarusian  belarusian	"Belarusian"	false  cp1251     be	 ""
Index: lib/kbd/arabic.kmap
===================================================================
RCS file: /cvs/lyx/lyx-devel/lib/kbd/arabic.kmap,v
retrieving revision 1.2
diff -p -u -3 -r1.2 arabic.kmap
--- lib/kbd/arabic.kmap	2000/07/17 13:41:20	1.2
+++ lib/kbd/arabic.kmap	2002/11/17 23:21:21
@@ -3,6 +3,7 @@
 #
 # Generated automatically from kikbd map by Adil Alsaid <alsaid at bigfoot dot com>
 #
+# reviewed and fixed by Isam Bayazidi <bayazidi at arabeyes dot org>, Mohamed Kebdani <kebdani1 at iam dot net dot ma>
 
 \kmap q Ö
 \kmap w Õ
@@ -27,7 +28,7 @@
 \kmap x Á
 \kmap c Ä
 \kmap v Ñ
-\kmap b Ð
+\kmap b äÇ
 \kmap n é
 \kmap m É
 \kmap ; ã
@@ -35,44 +36,41 @@
 \kmap "," è
 \kmap . Ò
 \kmap / Ø
-\kmap ` ;
+\kmap ` Ð
 \kmap [ Ì
 \kmap ] Ï
 
-\kmap Q î
-\kmap W ë
-\kmap E ï
-\kmap R ì
-#\kmap T ¤
-\kmap T ~
+# shifted keyboard
+
+\kmap Q ?
+\kmap W ?
+\kmap E ?
+\kmap R ?
+\kmap T äÅ
 \kmap Y Å
-#\kmap U ~
-\kmap U Ù
+\kmap U `
 \kmap I ç
-\kmap O Î
+\kmap O ?
 \kmap P »
-\kmap A ð
-\kmap S í
+\kmap A ?
+\kmap S ?
 \kmap D [
 \kmap F ]
-#\kmap G £
-\kmap G ~
+\kmap G äÃ
 \kmap H Ã
 \kmap J à
-#\kmap K º
-\kmap K ~
+\kmap K ¬
 \kmap L /
-\kmap Z ñ
-\kmap X ò
+\kmap Z ~
+\kmap X ?
 \kmap C {
 \kmap V }
-#\kmap B ¢
-\kmap B ~
+\kmap B äÂ
 \kmap N Â
-#\kmap M º
-\kmap M ~
+\kmap M '
 \kmap < ","
 \kmap > .
 \kmap ? ¿
 \kmap { <
 \kmap } >
+\kmap ~ ?
Index: src/encoding.C
===================================================================
RCS file: /cvs/lyx/lyx-devel/src/encoding.C,v
retrieving revision 1.15
diff -p -u -3 -r1.15 encoding.C
--- src/encoding.C	2002/07/21 21:20:55	1.15
+++ src/encoding.C	2002/11/17 23:21:23
@@ -103,24 +103,24 @@ Uchar tab_symbol[256] = {
 
 unsigned char arabic_table2[63][4] = {
 	{0x41, 0x41, 0x41, 0x41}, // 0xc1 = hamza
-	{0x42, 0xa1, 0x42, 0x42}, // 0xc2 = ligature madda on alef
-	{0x43, 0xa2, 0x43, 0x43}, // 0xc3 = ligature hamza on alef
-	{0x44, 0xa3, 0x44, 0x44}, // 0xc4 = ligature hamza on waw
-	{0x45, 0xa4, 0x45, 0x45}, // 0xc5 = ligature hamza under alef
-	{0xf9, 0xf9, 0xf8, 0xa0}, // 0xc6 = ligature hamza on ya
-	{0x47, 0xa5, 0xa5, 0xa5}, // 0xc7 = alef
+	{0x42, 0xa1, 0x42, 0xa1}, // 0xc2 = ligature madda on alef
+	{0x43, 0xa2, 0x43, 0xa2}, // 0xc3 = ligature hamza on alef
+	{0x44, 0xa3, 0x44, 0xa3}, // 0xc4 = ligature hamza on waw
+	{0x45, 0xa4, 0x45, 0xa4}, // 0xc5 = ligature hamza under alef
+	{0x46, 0xf9, 0xf8, 0xa0}, // 0xc6 = ligature hamza on ya
+	{0x47, 0xa5, 0x47, 0xa5}, // 0xc7 = alef
 	{0x48, 0xae, 0xac, 0xad}, // 0xc8 = baa
-	{0x49, 0xb1, 0xaf, 0xb0}, // 0xc9 = taa marbuta
+	{0x49, 0xb1, 0x49, 0xb1}, // 0xc9 = taa marbuta
 	{0x4a, 0xb4, 0xb2, 0xb3}, // 0xca = taa
 	{0x4b, 0xb7, 0xb5, 0xb6}, // 0xcb = thaa
 	{0x4c, 0xba, 0xb8, 0xb9}, // 0xcc = jeem
 	{0x4d, 0xbd, 0xbb, 0xbc}, // 0xcd = haa
 	{0x4e, 0xc0, 0xbe, 0xbf}, // 0xce = khaa
-	{0x4f, 0xa6, 0xa6, 0xa6}, // 0xcf = dal
+	{0x4f, 0xa6, 0x4f, 0xa6}, // 0xcf = dal
 
-	{0x50, 0xa7, 0xa7, 0xa7}, // 0xd0 = thal
-	{0x51, 0xa8, 0xa8, 0xa8}, // 0xd1 = ra
-	{0x52, 0xa9, 0xa9, 0xa9}, // 0xd2 = zain
+	{0x50, 0xa7, 0x50, 0xa7}, // 0xd0 = thal
+	{0x51, 0xa8, 0x51, 0xa8}, // 0xd1 = ra
+	{0x52, 0xa9, 0x52, 0xa9}, // 0xd2 = zain
 	{0x53, 0xc3, 0xc1, 0xc2}, // 0xd3 = seen
 	{0x54, 0xc6, 0xc4, 0xc5}, // 0xd4 = sheen
 	{0x55, 0xc9, 0xc7, 0xc8}, // 0xd5 = sad
@@ -143,8 +143,8 @@ unsigned char arabic_table2[63][4] = {
 	{0x65, 0xe7, 0xe5, 0xe6}, // 0xe5 = meem
 	{0x66, 0xea, 0xe8, 0xe9}, // 0xe6 = noon
 	{0x67, 0xed, 0xeb, 0xec}, // 0xe7 = ha
-	{0x68, 0xaa, 0xaa, 0xaa}, // 0xe8 = waw
-	{0x69, 0xab, 0xab, 0xab}, // 0xe9 = alef maksura
+	{0x68, 0xaa, 0x68, 0xaa}, // 0xe8 = waw
+	{0x69, 0xab, 0x69, 0xab}, // 0xe9 = alef maksura
 	{0x6a, 0xf0, 0xee, 0xef}, // 0xea = ya
 	{0x6b, 0x6b, 0x6b, 0x6b}, // 0xeb = fathatan
 	{0x6c, 0x6c, 0x6c, 0x6c}, // 0xec = dammatan
@@ -252,6 +252,19 @@ bool Encodings::IsComposeChar_hebrew(uns
 		c != 0xce && c != 0xd0;
 }
 
+
+// Special Arabic letters are the ones that do not connect to the
+// following letter: hamza, alef_madda, alef_hamza, waw_hamza,
+// alef_hamza_under, alef, taa_marbuta, dal, thal, ra, zain, waw, alef_maksura
+
+bool Encodings::is_arabic_special(unsigned char c)
+{
+	return 	(c >= 0xc1 && c <= 0xc5) ||
+		 c == 0xc7 || c  == 0xc9  ||
+		 c == 0xcf || c  == 0xe8  ||
+		(c >= 0xd0 && c <= 0xd2) ||
+		 c == 0xe9;
+}
 
 bool Encodings::IsComposeChar_arabic(unsigned char c)
 {
Index: src/encoding.h
===================================================================
RCS file: /cvs/lyx/lyx-devel/src/encoding.h,v
retrieving revision 1.8
diff -p -u -3 -r1.8 encoding.h
--- src/encoding.h	2002/03/21 17:25:09	1.8
+++ src/encoding.h	2002/11/17 23:21:23
@@ -90,6 +90,9 @@ public:
 	bool IsComposeChar_arabic(unsigned char c);
 	///
 	static
+	bool is_arabic_special(unsigned char c);
+	///
+	static
 	bool is_arabic(unsigned char c);
 	///
 	static
Index: src/text.C
===================================================================
RCS file: /cvs/lyx/lyx-devel/src/text.C,v
retrieving revision 1.280
diff -p -u -3 -r1.280 text.C
--- src/text.C	2002/11/07 00:37:09	1.280
+++ src/text.C	2002/11/17 23:21:38
@@ -150,12 +150,14 @@ unsigned char LyXText::transformChar(uns
 		}
 
 	if (Encodings::is_arabic(next_char)) {
-		if (Encodings::is_arabic(prev_char))
+		if (Encodings::is_arabic(prev_char) &&
+			!Encodings::is_arabic_special(prev_char))
 			return Encodings::TransformChar(c, Encodings::FORM_MEDIAL);
 		else
 			return Encodings::TransformChar(c, Encodings::FORM_INITIAL);
 	} else {
-		if (Encodings::is_arabic(prev_char))
+		if (Encodings::is_arabic(prev_char) &&
+			!Encodings::is_arabic_special(prev_char))
 			return Encodings::TransformChar(c, Encodings::FORM_FINAL);
 		else
 			return Encodings::TransformChar(c, Encodings::FORM_ISOLATED);
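
For anyone who wants to try the rule outside LyX, here is a minimal standalone
sketch of the decision made in the text.C hunk above. It is an illustration
under assumptions, not the LyX code: choose_form is a hypothetical name, and
is_arabic_letter is a stand-in for Encodings::is_arabic (whose body is not
shown in this patch) that simply accepts the ISO-8859-6 letter ranges.
is_arabic_special is copied from the patch.

```cpp
enum Form { FORM_ISOLATED, FORM_FINAL, FORM_INITIAL, FORM_MEDIAL };

// Assumption: treat the ISO-8859-6 letters 0xc1..0xda and 0xe1..0xea as
// Arabic. This stands in for Encodings::is_arabic, which may differ.
bool is_arabic_letter(unsigned char c)
{
	return (c >= 0xc1 && c <= 0xda) || (c >= 0xe1 && c <= 0xea);
}

// Same test as in the patch: letters that do not connect to the
// letter that follows them.
bool is_arabic_special(unsigned char c)
{
	return (c >= 0xc1 && c <= 0xc5) ||
		c == 0xc7 || c == 0xc9 ||
		c == 0xcf || c == 0xe8 ||
		(c >= 0xd0 && c <= 0xd2) ||
		c == 0xe9;
}

// Hypothetical helper: pick the form of a letter from its neighbours.
// The previous letter joins only if it is Arabic and not special.
Form choose_form(unsigned char prev_char, unsigned char next_char)
{
	bool const joins_prev = is_arabic_letter(prev_char) &&
		!is_arabic_special(prev_char);
	if (is_arabic_letter(next_char))
		return joins_prev ? FORM_MEDIAL : FORM_INITIAL;
	return joins_prev ? FORM_FINAL : FORM_ISOLATED;
}
```

With this sketch, a baa (0xc8) between an alef (0xc7) and a taa (0xca) gets
FORM_INITIAL, because alef does not connect to the letter after it. Between
two baa-like letters it gets FORM_MEDIAL.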