
Arabic posts



I think our mailing-list archives are broken when it comes to Arabic posts.
I came across this by sheer accident (while looking into Youcef's question
about Jan 2003 - I fixed that ezmlm problem, so Jan 2003 should be OK).

After looking further into the Arabic posts, I've come to these conclusions:

 1. The tar files given to me (even the old (pre-.uk) ones) are NOT raw
    data and have been saved in the wrong encoding (emacs ?).  For instance,
    look at the 'ae-lists-20030430.tar.bz2' file (you can preview it on
    arabeyes ~nadim/maillists/doc.mbox and search for 'Nov 2002' - you'll
    see all sorts of '=D8=A7=D9=84=D8=B3=D9=84=D8=A7=D9=85' which are the
    encoded bytes, but written out as discrete ASCII characters).  I'm sure
    that can be fixed via a script or something, I'm just not sure how to
    go about it at the moment.  Mohammed, could you please look into this
    and fix it (these manipulations might still be fresh in your mind from
    your recent duali work) - it should have been in raw format throughout,
    so I highly suggest you do some sample checks on your .emacs setup as
    well (I can help you sort out your emacs setup if that is indeed the
    problem).

 2. I have improper mhonarc settings.  The NEW Arabic posts sitting in our
    live 'LIST.mbox' files are all fine; it is the way they are being
    processed that is incorrect, and I will look into fixing that (this is
    not related to #1 above).  This is a much more minor problem than #1
    since the original raw file is not touched and it's just a matter of
    figuring out the proper configuration setup.
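
For #1, here is a rough sketch of what such a script could start from.  It
assumes (I have not verified this against the tar files) that the '=D8=A7...'
sequences are quoted-printable encoded UTF-8 bytes; the helper name
decode_qp is just made up for illustration.  A real fix would probably have
to go message by message and honor each post's own charset and
transfer-encoding headers rather than decode blindly like this.

```python
import quopri


def decode_qp(text: str, charset: str = "utf-8") -> str:
    """Decode a quoted-printable ASCII string into readable text.

    Assumption: the bytes behind the '=XX' escapes are in `charset`
    (UTF-8 by default); bytes that do not fit are replaced, not raised.
    """
    raw = quopri.decodestring(text.encode("ascii"))
    return raw.decode(charset, errors="replace")


# The sample sequence seen in doc.mbox.
sample = "=D8=A7=D9=84=D8=B3=D9=84=D8=A7=D9=85"
decoded = decode_qp(sample)
print(decoded)
```

Plain ASCII text passes through unchanged, so running this over text that is
already readable should do no harm.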

Once #1 & #2 are fixed, I will regenerate the mailing-list archives.

Salam.

 - Nadim

