Channel: Walter Lee – Ephesoft WIKI

How to OCR a JPEG?


Issue:

When using the Mail Import feature in Ephesoft, a client may occasionally send a document in JPG/JPEG format. If that extension is in the list of accepted mail import file extensions, the file is converted into a PDF using OpenOffice.

The PDF generated from the JPEG is not OCR'd correctly, so users are forced to enter the data manually during Validation.

Solution:

Here are two ways to successfully OCR a JPEG that has been converted to PDF:

Color Switch OFF:

At present, the Color Switch in the Recostar Page Process plugin of the shared batch class is ON. When we switched the Color Switch OFF, we were able to OCR the image successfully and get the desired content.

This tells us that OCR works on the tif image but not on the png image (when the Color Switch is ON, the png files are the ones that get OCR'd).

We then investigated the Create OCR Input plugin to find out why OCR could not be performed on the png file generated when the Color Switch was ON.

That led us to the second way of OCRing the input image.

Color Switch ON, IgnorePalette = false, OutputParameters empty:

We were able to OCR the PDF successfully with the Color Switch ON by:

· Setting the IgnorePalette property in the Fpr.rsp file to false (instead of the value ‘true’ that it had in the Fpr.rsp file of the shared batch class).

· Removing the value of the “imagemagick.display_image_output_parameters” property in the imagemagick.properties file. The existing property value, “-colorspace gray -alpha off”, did not work.
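
The second change can also be scripted. The following is only a minimal Python sketch, not an Ephesoft tool: the helper name set_property and the sample file content are hypothetical, and it assumes imagemagick.properties uses plain key=value lines. It blanks the value of the property while leaving every other line untouched. Whether Fpr.rsp follows the same layout is not confirmed here, so that file is not covered.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Return text with every 'key=...' line rewritten to 'key=value'.

    Hypothetical helper for illustration; assumes simple key=value lines.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=)[^\r\n]*",
        re.MULTILINE,
    )
    # A function replacement avoids backslash-escape surprises in the value.
    return pattern.sub(lambda m: m.group(1) + value, text)


# Hypothetical excerpt standing in for imagemagick.properties.
sample = (
    "# hypothetical excerpt\n"
    "imagemagick.display_image_output_parameters=-colorspace gray -alpha off\n"
    "some.other.property=unchanged\n"
)

# Blank the value, as described in the second bullet above.
updated = set_property(sample, "imagemagick.display_image_output_parameters", "")
print(updated)
```

In practice you would read the real file, pass its content through the helper, and write the result back after taking a backup copy of the original.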

We have tested this on our end and were able to properly select the right values for the fields.


 

