OCR Support

Optical Character Recognition (OCR) is supported as of AvantFAX 3.0.2 and is provided by Tesseract, which must be installed separately. With OCR enabled, your users can search the Archive for words found in the faxes themselves.

Here is a short guide to installing Tesseract and configuring it for use with AvantFAX.

First, download the Tesseract package and any desired language support files. AvantFAX has been tested with Tesseract 2.01.

To unpack and build Tesseract, run:

tar xvfz tesseract-2.01.tar.gz
for i in tesseract-2.00.*.tar.gz; do tar -xvzf "$i" -C tesseract-2.01/; done
cd tesseract-2.01
./configure && make

As root, run:

make install
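
Before moving on, it can help to confirm that the freshly installed binary is reachable on your PATH. The following is a minimal sketch, not part of AvantFAX; the function name check_tesseract is hypothetical, and it assumes the installed binary is called tesseract.

```shell
# Hypothetical helper (not shipped with AvantFAX or Tesseract).
# Reports whether a binary named "tesseract" is reachable on the current PATH.
check_tesseract() {
    if command -v tesseract >/dev/null 2>&1; then
        echo "tesseract found at: $(command -v tesseract)"
    else
        echo "tesseract not found on PATH"
        return 1
    fi
}
```

Calling check_tesseract after make install should print the location of the binary. If it reports that tesseract was not found, check that the install prefix's bin directory is on the PATH of the user that runs AvantFAX.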

Now, you must enable OCR support in AvantFAX. In includes/local_config.php, set ENABLE_OCR_SUPPORT to true.

define ('ENABLE_OCR_SUPPORT', true);
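
To double-check the setting from the command line, something like the following sketch may be useful. The function name ocr_enabled is hypothetical, and the check is only a text match on the define line: it skips lines starting with a comment marker but does not understand PHP block comments.

```shell
# Hypothetical helper (not shipped with AvantFAX).
# Succeeds if the given config file contains a line that defines
# ENABLE_OCR_SUPPORT as true. Lines that begin with // or # do not match
# because the pattern is anchored to optional leading whitespace.
ocr_enabled() {
    grep -Eq "^[[:space:]]*define[[:space:]]*\([[:space:]]*['\"]ENABLE_OCR_SUPPORT['\"][[:space:]]*,[[:space:]]*(true|TRUE)[[:space:]]*\)" "$1"
}
```

For example, from the AvantFAX directory: ocr_enabled includes/local_config.php && echo "OCR support is enabled"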

To process the faxes already in your AvantFAX archive, run the ocr_import.php script from the tools directory:

./ocr_import.php

Finally, to verify that OCR support is working correctly, try to search for keywords, names, or product codes in your faxes using the AvantFAX Archive.


