Re: [Translate-devel] encoding problems with po2moz
- To: Translate Development <translate-devel at lists dot sourceforge dot net>
- Subject: Re: [Translate-devel] encoding problems with po2moz
- From: Ayman Hourieh <aymanh at gmail dot com>
- Date: Mon, 15 Nov 2004 11:48:14 +0200
- Cc: Documentation and Translation <doc at arabeyes dot org>
Hi,
On Mon, 15 Nov 2004 09:09:25 +0200, Dwayne Bailey
<dwayne at translate dot org dot za> wrote:
>
> It shouldn't break things though as you should be allowed to add your
> own comments. But at least we know what caused the problem.
>
Looks like that's one cause of the problem, but not the only one. I
diffed some other .po files against their original .pot files: they
are identical except for the header and msgstr lines, yet the
converted files still have the wrong encoding.
> This is what I originally thought the problem was and am glad you
> confirmed it. My guess is that the template DTD is in ISO-8859-15 and
> thus when you merge the Arabic in it all gets messed up.
>
I did the following:
$ uniconv -in addBookmark.dtd -out tmp -decode iso-8859-1 -encode utf-8
$ mv tmp addBookmark.dtd
Still no go. Could it be that the scripts are still using iso-8859-x
when encoding the resulting files?
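
For reference, a minimal Python sketch of the same re-encoding step,
assuming the template really is ISO-8859-1. The function names are
hypothetical and are not part of the translate scripts; this only
shows the conversion itself, not how po2moz handles encodings.

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Every byte value is a valid ISO-8859-1 character, so the decode
    # step never fails; it only gives wrong text if the input was not
    # actually ISO-8859-1.
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place as UTF-8 (hypothetical helper)."""
    p = Path(path)
    p.write_bytes(latin1_to_utf8(p.read_bytes()))
```

Calling convert_file("addBookmark.dtd") would stand in for the two
commands above. Running it twice on the same file would double-encode
any non-ASCII characters, so it should be applied once per template.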
> I'm not sure we can force it without knowing the input encoding but
> since it's most likely to be 8859-1 we could ensure that we convert it to
> UTF-8 before merging.
>
> What is BOM?
>
Byte Order Mark. Mozilla doesn't like it in its .dtd files, and some
of the converted files have it. I have to edit the resulting files
with a Unicode editor to remove it. I wonder if it's possible to
prevent the scripts from adding it in the first place.
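
As a stopgap, the BOM could be stripped after conversion instead of by
hand. This is a minimal Python sketch, assuming the converted files
are UTF-8 (where the BOM is the three bytes EF BB BF); the function
names are hypothetical and not part of the translate scripts.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Remove a leading UTF-8 BOM from the file; return True if changed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        p.write_bytes(cleaned)
        return True
    return False
```

This only handles the UTF-8 BOM; a file saved as UTF-16 would need a
different check.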
I'm using the en-US files as a template, so I guess the encoding would
be iso-8859-1.
Ayman Hourieh
PS: I'm CC'ing this to the Arabeyes doc list to keep everyone who's
involved informed. I should have done it from the beginning.