Re: start of SIRAGI project



On Thu, 7 Apr 2005, Tarik FDIL wrote:

> Files dealing with tiff are in SIRAGI simply to read image files ;-) In the
> same way, in the GOCR source directory you find files pnm.c, pcx.c and tga.c
> to read pnm, pcx and tga image files.
>
> Why include an excerpt of the libtiff files and not simply tell developers
> to use libtiff? First, to simplify development, since you have all files in
> the same directory. Second, I have made some modifications to the Makefile to
> adapt libtiff to unix and windows. Now, I'm not against removing these files
> from CVS if other contributors agree with that.

That's exactly the point.  Just use libtiff, libpng, and libjpeg
and you have supported all formats that matter.  Having libtiff
files in your source code just makes it look bigger than it
really is.

> > In fact, I believe
> > an Arabic OCR application is out of place by definition too, the
> > same for an Arabic editor, an Arabic spell-checker, etc.
>
> - First:
> This is an old discussion: why code KDE and GNOME? Why write GNU/Linux
> while BSD exists? etc. IMHO free software offers many solutions for the same
> problem, since there are many ideas and many flavours of the same
> functionality. I'm not saying that we should reinvent the wheel, no. I think
> that if a new project gives some new ideas, a new design or a new approach,
> it should be done.

No, you are not getting my point here.  We don't need an English
editor, an Arabic editor, etc.  We need an editor that can do
them all.  If you think GOCR is crap and you want to invent a
better wheel, fine.  But don't call it "Arabic OCR"; aim for
supporting every script and language.


> - Second:
> I already tried GOCR, read its documentation and looked at its source
> code. Its design is too specific to Latin characters; I can't see what I can
> do to adapt it to Arabic without rewriting everything. Here are some examples:

I think at the very minimum you can reuse code for handling
images, rotation, etc.  I didn't talk about algorithms at all.


> * If you consider line detection, the algorithm assumes that characters are
> written from left to right. If you want to address this issue you have to
> rewrite the entire horizontal segmentation. That's what I have done in
> SIRAGI.
>
> * Look at what GOCR's author calls "cluster detection". This is the tool to
> detect characters in a line. If you apply this algorithm to Arabic OCR you
> will get words, not characters. An OCR should recognize characters, not words,
> since there are only 28 characters but an infinity of words :-)

Sure.  You need another algorithm for cursive scripts.  Just add
it to the same engine.  So people can OCR mixed Latin/Arabic
text.
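
To illustrate the line-detection point, here is a minimal sketch of
text-line detection by horizontal projection, which does not depend on
reading direction.  It is not GOCR's or SIRAGI's actual algorithm; the
function name find_text_lines and the 0/1 list-of-rows image format are
assumptions made for the example.

```python
def find_text_lines(image):
    """Return (top, bottom) row ranges that contain ink.

    `image` is assumed to be a list of rows, each a list of 0 (background)
    or 1 (ink). Only per-row ink presence is used, so mirroring the page
    horizontally (left-to-right vs. right-to-left text) gives the same result.
    """
    lines = []
    start = None  # first row of the text line currently being scanned
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:                  # line running to the last row
        lines.append((start, len(image) - 1))
    return lines
```

Splitting each detected line into characters is where Latin and cursive
scripts would diverge, so that step could be chosen per script.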

> * Concerning the heart of GOCR, the OCR engines: they are not general, but
> specifically designed for Latin characters. There are no neural networks, nor
> classification using general pixel comparisons, nor vectorization. So no
> line of code can be adapted from these engines :-)

Add them to that.  Redesign it such that the current code is one
module of the whole thing and Arabic can be done as another
module.  I'm sure GOCR is doing a decent job for Latin.
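
A rough sketch of that modular layout, assuming the page has already
been split into regions tagged with their script.  Every name here
(RECOGNIZERS, register, recognize_page) is hypothetical and the
recognizers are placeholders, not GOCR or SIRAGI code.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry mapping a script name to its recognizer module.
RECOGNIZERS: Dict[str, Callable[[str], str]] = {}


def register(script: str):
    """Decorator that plugs a recognizer into the shared engine."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        RECOGNIZERS[script] = func
        return func
    return decorator


@register("latin")
def recognize_latin(region: str) -> str:
    # Placeholder standing in for an existing per-character engine.
    return f"latin:{region}"


@register("arabic")
def recognize_arabic(region: str) -> str:
    # Placeholder standing in for a cursive-aware engine.
    return f"arabic:{region}"


def recognize_page(regions: List[Tuple[str, str]]) -> List[str]:
    """Dispatch each (script, region) pair to the matching module."""
    results = []
    for script, region in regions:
        if script not in RECOGNIZERS:
            raise ValueError(f"no recognizer registered for {script!r}")
        results.append(RECOGNIZERS[script](region))
    return results
```

With this shape, image loading and rotation stay shared, and mixed
Latin/Arabic pages only need the regions tagged before dispatch.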


> Conclusion: I think SIRAGI-OCR is really a necessity. We have no other
> alternative than writing new software from scratch, with a more general
> design, to address Arabic texts. Then, later, we can easily adapt it to
> recognize Latin characters!

So, please design yours with multiscript support in mind.


> Anyway, thanks Behdad for warning us against reinventing the wheel, but I
> sincerely think we are not falling into this trap.
>
> Best regards
>
> Tarik

Cheers,
--behdad
http://behdad.org/