
Re: Arabization, techniques and problems



Yes, I agree that regexes need better Arabic support. Current Perl regex
syntax does have a character class for whitespace, so I think the first
logical thing to do is to add a class for Arabic marks (or "stackers", as
you call them). Then the search should have an option to either ignore
the stackers entirely or not.
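
To make the idea concrete, here is a rough sketch in Python (not Perl or
grep, just for illustration). Everything in it is an assumption on my
side: the names MARKS, NON_MARK, strip_marks and search are made up, and
the range U+064B..U+065F plus U+0670 is only one plausible definition of
"stackers". It shows a class for the marks, a "match any non-stacking
character" class, and a search option that ignores the marks.

```python
import re

# Assumed definition of "stackers": Arabic combining marks
# U+064B..U+065F (fathatan .. sukun and friends) plus U+0670
# (superscript alef). A real implementation may want a different set.
MARKS = "\u064B-\u065F\u0670"
MARK_RE = re.compile(f"[{MARKS}]")

# Hypothetical stand-in for the proposed ":" metacharacter:
# matches any single character that is NOT a stacker.
NON_MARK = f"[^{MARKS}]"


def strip_marks(text):
    """Return text with all stackers removed."""
    return MARK_RE.sub("", text)


def search(pattern, text, ignore_marks=False):
    """Search text for pattern, optionally ignoring stackers in text."""
    if ignore_marks:
        text = strip_marks(text)
    return re.search(pattern, text) is not None


KAF, TEH, BEH, FATHA = "\u0643", "\u062A", "\u0628", "\u064E"
ktb = KAF + TEH + BEH                      # unvocalized k-t-b
kab = KAF + FATHA + BEH                    # k + fatha + b
kataba = KAF + FATHA + TEH + FATHA + BEH + FATHA

# "." matches a letter and a stacker alike.
print(re.fullmatch(KAF + "." + BEH, ktb) is not None)       # True
print(re.fullmatch(KAF + "." + BEH, kab) is not None)       # True

# The non-stacker class matches the letter but not the fatha.
print(re.fullmatch(KAF + NON_MARK + BEH, ktb) is not None)  # True
print(re.fullmatch(KAF + NON_MARK + BEH, kab) is not None)  # False

# Searching vocalized text for the bare word, with and without the option.
print(search(ktb, kataba))                     # False
print(search(ktb, kataba, ignore_marks=True))  # True
```

Stripping the marks from the text before matching is the simplest way to
get the "ignore stackers" option; a real engine would probably do this
inside the matcher so that match offsets still refer to the original text.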

As for kdiff3, I don't understand why the developer(s) need to
re-implement Arabic shaping behaviour. It is a KDE app, so I would
expect KDE to support Arabic natively. The same goes for GNOME and
Windows. Windows uses Uniscribe and GNOME uses Pango, but I'm not sure
what KDE uses. For example, under Windows, using C# .NET, I just
assign a string to a text label widget, and the rest is taken care of.
Whether the text is Arabic or not, proper shaping and direction are
applied. I know the default KDE text editor can display Arabic text
without any problem. Maybe the developer does not have an appropriate
font to test with. (Anyway, I have not done any KDE programming.)

Regards. 

On 7/10/05, Gregg Reynolds <gar at arabink dot com> wrote:
> maysara a wrote:
> > Salam all,
> >
> > I'm trying to start an open source community at my
> > university, and I need to introduce students to open
> > source and to contributing to existing software,
> > documentation, and so on.
> >
> 
> Here's another project you might find interesting.  Enhance regular
> expression syntax to support Arabic-specific search, and then implement
> your ideas in GNU "grep".
> 
> Here's a brief example of what I mean.  Standard regexes use the
> metacharacter "." to mean "match any single character".  So a search
> pattern like "k.b" will match ktb, krb, etc. but also kab, kub,
> k<sukuun>b, etc.
> 
> Which is fine; but in Arabic we may want to ignore "stackers" (fatha,
> shadda, etc.).  So we need another metacharacter that means "match any
> non-stacking character".  Suppose we use ":" with this meaning.  Then
> the search pattern "k:b" will match ktb, krb, etc., but *not* kab, kub,
> k<sukuun>b, etc.
> 
> If you start by asking "what kinds of searches might an Arabic speaker
> want to do" and then think about how regexes could make such searches
> natural and easy, you can come up with a lot of ideas.
> 
> -gregg
> 
> 