Re: [Translate-devel] encoding problems with po2moz



Hi,

On Mon, 15 Nov 2004 09:09:25 +0200, Dwayne Bailey
<dwayne at translate dot org dot za> wrote:
> 
> It shouldn't break things though as you should be allowed to add your
> own comments.  But at least we know what caused the problem.
> 

Looks like that's one reason for the problem, but not the only one. I
diffed some other .po files against their original .pot files: they
are identical except for the header and msgstr lines, yet the
converted files still have the wrong encoding.

> This is what I originally thought the problem was and am glad you
> confirmed it.  My guess is that the template DTD is in ISO-8859-15 and
> thus when you merge the Arabic in it all gets messed up.
> 

I did the following:

$ uniconv -in addBookmark.dtd -out tmp -decode iso-8859-1 -encode utf-8
$ mv tmp addBookmark.dtd

Still no go. Could it be that the scripts are still using iso-8859-x
to encode the resulting files?
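
For anyone who wants to repeat the conversion without uniconv, here is
a minimal Python sketch of the same step. The function names are
hypothetical and not part of the translate toolkit, and it assumes the
template really is ISO-8859-1. If the file is already valid UTF-8,
decoding it as ISO-8859-1 again would double-encode it, so the sketch
checks for that first.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode(data: bytes, src: str = "iso-8859-1", dst: str = "utf-8") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them."""
    return data.decode(src).encode(dst)


def to_utf8(data: bytes, assumed_src: str = "iso-8859-1") -> bytes:
    """Convert to UTF-8 unless the data already is valid UTF-8.

    Note: pure-ASCII input is valid UTF-8 and is returned unchanged.
    """
    if is_valid_utf8(data):
        return data
    return reencode(data, assumed_src, "utf-8")


if __name__ == "__main__":
    import sys

    # Usage: python to_utf8.py addBookmark.dtd
    path = sys.argv[1]
    with open(path, "rb") as f:
        original = f.read()
    with open(path, "wb") as f:
        f.write(to_utf8(original))
```

This only fixes the template. If the conversion scripts write their
output in another encoding, the result will still be wrong.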

> I'm not sure we can force it without knowing the input encoding but
> since it's most likely to be 8859-1 we could ensure that we convert it to
> UTF-8 before merging.
> 
> What is BOM?
> 

Byte Order Mark. Mozilla doesn't like it in its .dtd files, but some
of the converted files have it, so I have to edit the resulting files
with a Unicode editor to remove it. I wonder if it's possible to
prevent the scripts from adding it in the first place.
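
As a stopgap, the BOM can be stripped after conversion instead of by
hand. This is a rough Python sketch, assuming the converted files are
UTF-8 (for which the BOM is the three bytes EF BB BF). The function
name is hypothetical and this is not how the scripts themselves work.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    import sys

    # Usage: python strip_bom.py file1.dtd file2.dtd ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            original = f.read()
        cleaned = strip_utf8_bom(original)
        # Only rewrite files that actually changed.
        if cleaned != original:
            with open(path, "wb") as f:
                f.write(cleaned)
```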

I'm using the en-US files as a template, so I guess the encoding would
be iso-8859-1.

Ayman Hourieh

PS: I'm CC'ing this to the Arabeyes doc list to keep everyone who's
involved informed. I should have done so from the beginning.