[Decapod] Testing OCRopus

Jonathan Hung jhung.utoronto at gmail.com
Tue May 5 19:26:52 UTC 2009


Hi everyone.

I have successfully installed OCRopus onto a test machine and have been
testing the software using various images. The purpose of this testing is to
see the software at work and improve my understanding of the OCRopus project
and OCR in general.

I have recorded my results here:
http://wiki.fluidproject.org/display/fluid/OCRopus+0.3+Testing

For those who do not know, OCRopus is OCR software that runs from the command
line. You pass it an image as an argument, and it writes the recognized text
to the screen as HTML, which you can also redirect to a file.
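
To make the "pass an image, redirect the output" workflow concrete, here is a
minimal Python sketch. It is not part of OCRopus: the helper name
run_ocr_to_file is made up for illustration, and the command you pass in is
whatever your install provides.

```python
import subprocess
from pathlib import Path


def run_ocr_to_file(command, image_path, output_path):
    """Run a command-line OCR tool on one image and save its stdout to a file.

    `command` is the program name plus any leading arguments, as a list.
    The image path is appended as the final argument. This helper is a
    hypothetical wrapper, not an OCRopus API.
    """
    result = subprocess.run(
        list(command) + [str(image_path)],
        capture_output=True,  # collect what would be printed to the screen
        text=True,            # decode stdout/stderr as text
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    # Equivalent to shell redirection: command image > output_path
    Path(output_path).write_text(result.stdout, encoding="utf-8")
    return result.stdout
```

A call might look like run_ocr_to_file(["ocroscript", "recognize"],
"page.png", "page.html"). The "ocroscript recognize" invocation is an
assumption about the 0.3 release, so check it against your own install.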

The test images I used ranged from photographed text of varying contrast and
treatment (admittedly bad photos, but I was curious to see how OCRopus
would handle them) to scanned text with various layouts and font sizes. The
results are very interesting: some output was empty even though the image was
legible to the human eye, which underscores the importance of proper
exposure, white balance, and contrast in the input image.
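
As an illustration of what "contrast of the input image" means in practice,
here is a small sketch of linear contrast stretching on grayscale pixel
values. This is a generic preprocessing technique, not something OCRopus is
known to do, and I have not tested whether it changes the results above. The
function name and the flat list-of-integers image representation are
assumptions made for the example.

```python
def stretch_contrast(pixels, low=0, high=255):
    """Linearly rescale grayscale values to span the range low..high.

    `pixels` is a flat list of integer intensities (a hypothetical stand-in
    for real image data). The darkest input value maps to `low` and the
    brightest to `high`; integer floor division keeps the results integers.
    """
    darkest, brightest = min(pixels), max(pixels)
    if darkest == brightest:
        # A completely flat image has no contrast to stretch.
        return list(pixels)
    span_in = brightest - darkest
    span_out = high - low
    return [low + (p - darkest) * span_out // span_in for p in pixels]
```

A washed-out photo whose values all sit between, say, 100 and 200 would be
spread across the full 0 to 255 range, which is the kind of adjustment one
might try before handing the image to the OCR step.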

Having gone through this exercise, I wonder whether there are any other
adjustments or tweaks I can make from the command line to improve the
output. Or does a "good" text conversion depend on a clean
input image?

- Jonathan.

PS. I will be cross-posting this email to the OCRopus mailing list shortly,
with some minor adjustments.

---
Jonathan Hung / jhung.utoronto at gmail.com
Fluid Project - ATRC at University of Toronto
Tel: (416) 946-3002