Salaam,
On Thu, 21 Mar 2002, Chahine M. Hamila wrote:
> I think I might need more details here too.
> But assuming you have an image file of an Arabic/Jawi letter and you
> want it transformed into a Unicode-encoded character, your best bet is
> to write:
> 1) a filter that would convert your binary image into a pixel array
> 2) a neural network that you would train by feeding it arrays
> representing different letters
> If you are not dealing with images of separate letters, it's a lot more
> complex even. I am not aware of any Arabic-related OCR technology, so a
> two-cents-worth idea would be similar training of neural networks for
> separating letters, with a "parallel" dichotomic search trying to
> estimate letter widths, for example. Say, you take an arbitrary width
> for a start and pass it through your NN; if your program thinks the
> result is not OK, it changes the width and redoes it, etc.
> I suppose there should be better approaches for separating letters,
> though.
>
> More details, please, for suitable help :)
> Salaam,
> Chahine

You got the point, Chahine. Friends of mine have already done all the
OCR phases and segmented the manuscripts into separate letters (as
arrays). My job is to transform those arrays into encoded characters so
that the manuscripts can be stored as encoded text, which saves storage.
Your idea of training a network on every letter is good and acceptable,
but it seems rather complex. Is there any other possible, less complex
solution? (Why have previous Arabic OCR researchers always left this
subject out of their reports?)
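For the array-to-Unicode step itself, here is a minimal sketch, assuming
the segmentation already yields fixed-size binary pixel arrays. It uses
plain template matching (nearest reference bitmap) only as a stand-in for
whatever classifier is finally chosen, e.g. the neural network Chahine
describes; the template table, function names, and output file name below
are placeholders, not existing code.

# Hypothetical sketch: map each segmented Jawi letter array to a Unicode
# character by nearest template matching. Assumes the segmentation step
# already yields binary pixel arrays of one fixed size (here 32x32).
import numpy as np

TEMPLATES = {
    "\u0627": np.zeros((32, 32), dtype=np.uint8),  # alef (placeholder bitmap)
    "\u0628": np.zeros((32, 32), dtype=np.uint8),  # beh  (placeholder bitmap)
    # ... one reference bitmap per letter form, filled from labelled scans
}

def classify_letter(letter):
    """Return the Unicode character whose reference bitmap differs from
    `letter` in the fewest pixels (Hamming distance)."""
    best_char, best_dist = None, None
    for char, template in TEMPLATES.items():
        dist = int(np.count_nonzero(letter != template))
        if best_dist is None or dist < best_dist:
            best_char, best_dist = char, dist
    return best_char

def arrays_to_text(letter_arrays):
    """Turn a sequence of segmented letter arrays into a Unicode string."""
    return "".join(classify_letter(a) for a in letter_arrays)

# Saving the result as UTF-8 text is what actually reduces the storage:
# with open("manuscript.txt", "w", encoding="utf-8") as f:
#     f.write(arrays_to_text(page_of_letter_arrays))

Template matching like this only works for clean, consistently segmented
glyphs; a trained network in place of classify_letter would be more robust
to variation between manuscripts.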
Anyway, thanks, Nicholas, for trying to help. But if I have to type the
manuscripts into a text file manually, then this project is meaningless.
Another thing, for your information: the Malays still use the 'old
Malay'/Jawi script today. Their children are even taught how to read and
write it in school, but they just don't use it widely in daily business.
TQ,
Nia Azniah