Re: Kdiff3:arabic_shape

On Wednesday 12 January 2005 01:01, Nadim Shaikli wrote:
> In passing, Gregg has also brought the subject of ignoring some
> characters while diff'ing and to that end I suggested we go straight
> to 'diff' utility developers (instead of adding this feature to the
> overlaying GUI).  I then posted this email,
>   http://lists.gnu.org/archive/html/help-gnu-utils/2004-12/msg00035.html
That's one reason why you want to use KDiff3. Look at 
and also read about how KDiff3 treats whitespace. BTW, KDiff3 highlights the 
differences within a line (character by character).

> > Gregg already gave me much information on how the shaping works, but
> > perhaps you can also help. The main difference between vim and KDiff3 is
> > that within KDiff3 the internal data is in UTF-16, whereas in vim it
> > is in UTF-8.
> I don't think VIM is the best example to follow - but it's out there :-)
> I'm not sure why you are using UTF-16 and what advantage you are getting
> out of it (there are all sorts of endian issues involved among others).

The internal representation is actually QString. So Qt handles the 

Because KDiff3 highlights the differences on a character-by-character basis, I 
thought that the handling could be similar to what vim does.
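
To make "character by character" concrete, here is a minimal sketch in
Python. It uses the standard difflib module, which is an assumption made
for illustration and not the algorithm KDiff3 itself uses; the function
name char_diff is hypothetical.

```python
from difflib import SequenceMatcher


def char_diff(old, new):
    """Return (tag, old_fragment, new_fragment) for each changed span.

    tag is one of "replace", "delete" or "insert", as reported by difflib.
    """
    # autojunk=False keeps difflib from treating frequent characters as noise.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    for change in char_diff("color", "colour"):
        print(change)
```

A viewer would highlight only the returned fragments inside the line
instead of marking the whole line as changed.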

> Here are some links that talk about this topic in more detail that might
> help,
>   http://www.unicode.org/faq/utf_bom.html
>   http://czyborra.com/utf/
>   http://www.cl.cam.ac.uk/~mgk25/unicode.html
> > So I can't directly use the arabic_shape function from vim, which is
> > difficult for me to use anyway. What I need is a function that takes
> > a UTF-16 string as input and returns one as output. And if the output
> > string has more or fewer characters than the input, then I would also
> > need an array that tells which character indices in either string
> > correspond to the other string.
> >
> > Could you please provide that or help me on how to get there?
> We can try (I've never used or looked into UTF-16 in any great detail) -
> I'm sure there are those that have tackled similar problems that will
> chime-in.
> PS: you might want to subscribe to the developer list.
>   http://www.arabeyes.org/mailinglists.php
> Regards,
>  - Nadim
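
To illustrate the kind of interface asked for above (a string in, a string
out, plus arrays relating the indices), here is a rough Python sketch. It
is not vim's arabic_shape and not a complete shaper: it only merges lam
followed by an alef into the isolated lam-alef ligature, which is enough
to show an output that is shorter than the input. The name shape_with_map
and the two returned arrays are hypothetical. Python indexes by code
point, which matches UTF-16 code units for these characters because they
all lie in the Basic Multilingual Plane.

```python
LAM = "\u0644"

# Lam followed by one of these alef forms becomes a single ligature.
# Only the isolated presentation forms are used; a real shaper would
# also select final forms depending on the preceding letter.
LAM_ALEF = {
    "\u0622": "\uFEF5",  # alef with madda above
    "\u0623": "\uFEF7",  # alef with hamza above
    "\u0625": "\uFEF9",  # alef with hamza below
    "\u0627": "\uFEFB",  # plain alef
}


def shape_with_map(text):
    """Return (shaped, out_to_in, in_to_out).

    out_to_in[k] is the index of the first input character that produced
    output character k; in_to_out[i] is the output index that input
    character i ended up in.
    """
    out = []
    out_to_in = []
    in_to_out = []
    i = 0
    while i < len(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if text[i] == LAM and nxt in LAM_ALEF:
            # Two input characters collapse into one output character.
            out_to_in.append(i)
            in_to_out.extend([len(out), len(out)])
            out.append(LAM_ALEF[nxt])
            i += 2
        else:
            out_to_in.append(i)
            in_to_out.append(len(out))
            out.append(text[i])
            i += 1
    return "".join(out), out_to_in, in_to_out


if __name__ == "__main__":
    # seen, lam, alef, meem
    shaped, out_to_in, in_to_out = shape_with_map("\u0633\u0644\u0627\u0645")
    print([hex(ord(c)) for c in shaped], out_to_in, in_to_out)
```

With both arrays available, a difference found in the shaped string can be
traced back to the original characters, and vice versa.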