
Re: Can't pdftotext convert the Arabic text?



--- Munzir Taha <munzirtaha at newhorizons dot com dot sa> wrote:
> Has anyone had better luck than me? Does anyone know what's missing?
> Is it a known issue or do I need to file a bug?

I'm not familiar with 'pdftotext' (you should have provided a link
to its homepage and/or authors - Google didn't turn up anything
tangible), but I have a couple of generic suggestions for ya:

 1. Find an application that does lots of PDF to ... conversions
    (to text, html, DOC, XML, etc) - the broader the better since
    it will fill multiple needs in the future (if possible).
 2. Contact the authors to see if Unicode/UTF-8 support is included
    and/or forthcoming - "ask for it".

Once #2 is added, adding proper Arabic support should be much simpler
and we should be able to do that ourselves given interest.
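
A rough way to check whether a converter already gives you usable
UTF-8 is to run it and look for Arabic code points in the output.
The sketch below assumes the tool is a 'pdftotext' binary on your
PATH that accepts '-enc UTF-8' and writes to stdout when the output
file is '-'; the function names and the code point ranges are mine,
so adjust them to whatever tool you end up with.

```python
import subprocess

# Arabic, Arabic Supplement, and the two Presentation Forms blocks
# (PDFs often store the shaped presentation forms, not the base letters).
ARABIC_RANGES = (
    (0x0600, 0x06FF),
    (0x0750, 0x077F),
    (0xFB50, 0xFDFF),
    (0xFE70, 0xFEFC),
)


def contains_arabic(text):
    """Return True if any character falls in one of the Arabic ranges."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in ARABIC_RANGES)


def pdf_to_utf8(pdf_path):
    """Run the converter and return its output decoded as UTF-8.

    Assumption: `pdftotext -enc UTF-8 <file> -` prints the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

If contains_arabic(pdf_to_utf8("some.pdf")) comes back False for a
file you know holds Arabic text, the problem is in the conversion
step (or the PDF's fonts), not in how you display the result.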

In passing, look into the various UTF-8 supporting search engines
(mnogosearch comes to mind) to see how they convert PDF contents
in order to index 'em.

Salam.

 - Nadim


		