
Re: How to encode image produced by a recognition system?



Salaam,

Nia Azniah wrote:

   Salaam. I'm new to this subject. Anyway, I need to solve the post-processing phase of a recognition system for Jawi ('Old Malay', derived from Arabic script) manuscripts.
   I hope to get help from this list.
Cool, looks like serious stuff;)

   1. This particular recognition system produces binary images of Jawi/Arabic script.
More details, please. What do you provide as input, for example?
   How do we transform a given image to match the codes defined
   by Unicode? (In other words, how do we make the output of a recognition system usable as Unicode/UTF-8 encoded text?)
I think I might need more details here too. But assuming you have an image file of a single Arabic/Jawi letter and you want it turned into a Unicode-encoded character, your best bet is to write:
1) a filter that converts your binary image into a pixel array
2) a neural network that you train by feeding it arrays representing different letters
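
A minimal sketch of steps 1) and 2), under loud assumptions: the "image" is a toy 3x3 grid written as rows of '0'/'1' characters (a real filter would read your actual image format), and a tiny single-layer perceptron stands in for a proper neural network. The function names and the two toy glyphs are made up for illustration; only the code points U+0627 (ALEF) and U+0628 (BEH) are real Unicode values. Once the classifier returns a Python string holding the code point, getting UTF-8 is just a call to `str.encode`.

```python
ALEF = "\u0627"  # ARABIC LETTER ALEF
BEH = "\u0628"   # ARABIC LETTER BEH


def to_pixel_array(rows):
    """Step 1: flatten rows of '0'/'1' characters into a list of ints (1 = ink)."""
    return [int(ch) for row in rows for ch in row]


def classify(weights, pixels):
    """Return the label whose weight vector scores highest on the pixels."""
    x = pixels + [1]  # trailing 1 acts as the bias input
    scores = {label: sum(w * v for w, v in zip(ws, x))
              for label, ws in weights.items()}
    return max(scores, key=scores.get)


def train_perceptron(samples, epochs=10):
    """Step 2: multi-class perceptron, one weight vector per letter."""
    size = len(samples[0][0]) + 1
    weights = {label: [0] * size for _, label in samples}
    for _ in range(epochs):
        for pixels, target in samples:
            guess = classify(weights, pixels)
            if guess != target:
                x = pixels + [1]
                # Pull the right letter towards the sample, push the wrong one away.
                weights[target] = [w + v for w, v in zip(weights[target], x)]
                weights[guess] = [w - v for w, v in zip(weights[guess], x)]
    return weights


# Toy training set: a vertical stroke for ALEF, a horizontal stroke for BEH.
samples = [
    (to_pixel_array(["010", "010", "010"]), ALEF),
    (to_pixel_array(["000", "111", "000"]), BEH),
]
weights = train_perceptron(samples)

letter = classify(weights, to_pixel_array(["010", "010", "010"]))
encoded = letter.encode("utf-8")  # Unicode code point -> UTF-8 bytes
print(f"U+{ord(letter):04X}", encoded)
```

Real glyphs would need far larger arrays, many training samples per letter, and most likely a multi-layer network, but the pipeline stays the same: image, pixel array, classifier, code point, UTF-8 bytes.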
If you are not dealing with images of separate letters, it gets a lot more complex. I am not aware of any Arabic-related OCR technology, so here is a two-cents idea: train neural networks in a similar way to separate the letters, with a "parallel" dichotomic (trial-and-error) search that tries to estimate letter widths. Say you take an arbitrary width to start with and pass that slice through your NN; if your program thinks the result is not OK, it changes the width and tries again, and so on.
I suppose there are better approaches for separating letters, though.
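
The width-guessing idea could be sketched roughly like this. Everything here is an assumption for illustration: the line of text is given as a list of pixel columns (each a string of '0'/'1', top to bottom), a crude template match returning a confidence stands in for the trained network, and the scan simply runs from the first column to the last, ignoring reading direction. The names `TEMPLATES`, `recognise` and `segment` are hypothetical.

```python
# Hypothetical column templates: one entry per letter, one string per column.
TEMPLATES = {
    "\u0627": ["111"],                # ALEF as a single full-height column
    "\u0628": ["001", "011", "001"],  # BEH as a three-column toy shape
}


def recognise(window):
    """Stand-in for the NN: return (label, confidence) for a slice of columns."""
    best_label, best_conf = None, 0.0
    for label, template in TEMPLATES.items():
        if len(template) != len(window):
            continue
        total = sum(len(col) for col in template)
        same = sum(a == b
                   for tcol, wcol in zip(template, window)
                   for a, b in zip(tcol, wcol))
        conf = same / total
        if conf > best_conf:
            best_label, best_conf = label, conf
    return best_label, best_conf


def segment(columns, min_width=1, max_width=3, threshold=0.9):
    """Try several widths at each position; keep the most confident guess."""
    letters = []
    start = 0
    while start < len(columns):
        best = None  # (label, confidence, width)
        for width in range(min_width, max_width + 1):
            window = columns[start:start + width]
            if len(window) < width:
                break  # ran off the end of the line
            label, conf = recognise(window)
            if conf >= threshold and (best is None or conf > best[1]):
                best = (label, conf, width)
        if best is None:
            start += 1  # nothing convincing here, skip one column
        else:
            letters.append(best[0])
            start += best[2]
    return letters


line = ["000", "111", "000", "001", "011", "001", "000"]
found = segment(line)
print([f"U+{ord(ch):04X}" for ch in found])
```

Connected scripts such as Arabic and Jawi make this much harder than the toy suggests, since letters join and change shape with position, so treat the loop only as an illustration of "guess a width, check the result, adjust".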

   a) What is the best format to save the image?
Save an image of what?

   b) Do i need to build databases for the images first then the codes?
In the case of character recognition, you need to build trained neural networks.

   I don't have many ideas about it... please help
More details, please, so we can give suitable help :)
Salaam,
Chahine