Excel (xlsx)

The Excel (xlsx) output format converts the current document into a Microsoft Excel spreadsheet, specifically the XLSX format.

Parameter            Value
Output Extension     XLSX
OCR                  Always (OmniPage & Abbyy)
Multipage support    Yes

Engine
Select the OCR engine used to run the current task and create the output document. The engines available depend on the current license:

  • Nuance OmniPage
  • Abbyy FineReader

Select the language to use during OCR recognition. To select multiple languages, hold the CTRL key while clicking each language.

Please refer to the OCR Appendix chapter for the supported OCR languages.

Timeout
Specify the maximum amount of time the OCR process may run. When this limit is reached, the process is terminated with an error. Use this setting to prevent the OCR process from running too long, hanging, or looping on particularly complex or malformed documents.

The timeout value is expressed in seconds.
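
The timeout behavior described above can be pictured with a short Python sketch. It is only an illustration of the general pattern (run a job, stop it when the limit is exceeded, report an error) and not the product's implementation; the function name `run_with_timeout` and the stand-in "slow job" command are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command and return (completed, detail).

    If the command exceeds timeout_seconds, subprocess.run stops it
    and raises TimeoutExpired, which is reported here as a failure.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_seconds} s"
    return True, f"exit code {result.returncode}"


if __name__ == "__main__":
    # Hypothetical stand-in for an OCR job that takes too long.
    slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_job, timeout_seconds=1))
```

In this sketch a generous limit lets normal jobs finish, while a job that hangs is stopped and reported instead of blocking the queue.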

Abbyy

When the Abbyy FineReader engine is selected, additional format options are displayed.

Convert strings to numbers
If enabled, numerical values in the recognized text are exported to Excel as numbers rather than as strings.
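
To illustrate the difference between a string cell and a numeric cell, here is a minimal Python sketch of such a conversion. The function `to_cell_value` and its matching rule are assumptions made for illustration; the engine's actual rules for deciding what counts as a number are not documented here.

```python
import re

# Assumed rule for this sketch: optional sign, digits, optional decimal part.
_NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")


def to_cell_value(text):
    """Return an int or float when text looks numeric, else the text itself."""
    stripped = text.strip()
    if _NUMBER.fullmatch(stripped) is None:
        return text
    return float(stripped) if "." in stripped else int(stripped)


if __name__ == "__main__":
    for raw in ["42", "-3.5", "N/A", "00123"]:
        print(repr(raw), "->", repr(to_cell_value(raw)))
```

Note that in this sketch a value such as "00123" becomes the number 123, so leading zeros are lost. If a column holds codes rather than quantities, keeping the values as strings may be the safer choice.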

Keep text color
If enabled, the original text colors are retained when the recognized text is exported to Excel.

Enhance local contrast
If enabled, the engine increases the local contrast of the image during preprocessing. This option may improve recognition quality.

Info

The option is meaningful for color and gray images only.

The images for which this preprocessing method is effective include:

  • Photos or scans of documents with texture or pictures in the background. With the normal binarization procedure, the characters that coincide with darker areas of background may be lost or recognized unreliably. If you apply this method before recognition, such areas are detected, and contrast is increased, with the result that after binarization the characters stand out more distinctly.
  • Photos or scans of documents with a highly colorful background or text highlighting.
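
The effect described in the first case can be reproduced on a toy example. The Python sketch below uses local min-max contrast stretching, a simple stand-in technique chosen for illustration and not the engine's actual algorithm; the pixel values, the window radius, and the threshold of 128 are all assumptions.

```python
def stretch_local_contrast(row, radius=2):
    """Rescale each pixel to 0..255 using the min and max of its neighborhood."""
    stretched = []
    for i, pixel in enumerate(row):
        window = row[max(0, i - radius): i + radius + 1]
        lo, hi = min(window), max(window)
        if hi == lo:
            # Flat neighborhood: nothing to stretch, keep the pixel as is.
            stretched.append(pixel)
        else:
            stretched.append((pixel - lo) * 255 // (hi - lo))
    return stretched


def binarize(row, threshold=128):
    """Mark pixels darker than the threshold as ink (1), the rest as paper (0)."""
    return [1 if pixel < threshold else 0 for pixel in row]


if __name__ == "__main__":
    # One dark character pixel (40) on a dark gray background (100).
    dark_strip = [100, 100, 40, 100, 100]
    print(binarize(dark_strip))                          # [1, 1, 1, 1, 1]
    print(binarize(stretch_local_contrast(dark_strip)))  # [0, 0, 1, 0, 0]
```

With plain binarization the whole strip turns to ink and the character disappears into the background. After the local stretch, only the character pixel remains dark.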

Remove noise
If enabled, the engine reduces image noise. The available modes are:

  • White noise: this mode may be useful, for example, for uncompressed images with ISO less than 800 or for reduced images.
  • Correlated noise: this mode may be useful, for example, for JPEG photos with high compression settings.

Info

The option can be used only for color and 8-bit gray images.

Layout retention
This option sets how the layout of the original document's tables is retained when exporting to Excel. The available options are:

  • Default: no specific settings are applied.
  • Exact document: optimized for recreating the appearance of the original document, but does not necessarily retain the exact cells and lines of the original tables.
  • Exact lines: optimized for recreating the exact division into cells / lines of the original document tables.

Title, Author, Subject, Keywords
Enter text to set as the Title, Author, Subject, or Keywords property of the Microsoft Office document, or click the Variable button to select a variable that will supply the value for the target property.
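
To check which properties ended up in a generated file, the values can be read back from the XLSX package. The Python sketch below uses only the standard library and assumes the properties are stored in the conventional `docProps/core.xml` part, where Title, Author, and Subject appear as Dublin Core elements (`dc:title`, `dc:creator`, `dc:subject`) and Keywords as `cp:keywords`. The function name `read_core_properties` and the sample values are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(xlsx_file):
    """Return Title, Author, Subject and Keywords from an XLSX file or file object."""
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text_of(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "Title": text_of("dc:title"),
        "Author": text_of("dc:creator"),
        "Subject": text_of("dc:subject"),
        "Keywords": text_of("cp:keywords"),
    }


if __name__ == "__main__":
    # Build a tiny in-memory package holding only the properties part,
    # so the sketch runs without a real exported file.
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties xmlns:cp="' + NS["cp"] + '" xmlns:dc="' + NS["dc"] + '">'
        "<dc:title>Sample title</dc:title><dc:creator>Sample author</dc:creator>"
        "</cp:coreProperties>"
    )
    package = io.BytesIO()
    with zipfile.ZipFile(package, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    package.seek(0)
    print(read_core_properties(package))
```

For a real export, pass the path of the generated .xlsx file instead of the in-memory package. Properties that were left empty are returned as None in this sketch.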
