Word (rtf)

The Word (rtf) output format converts the current document into a Microsoft Rich Text Format (RTF) document.

Parameter            Value
Output extension     RTF
OCR                  Always (OmniPage & Abbyy)
Multipage support    Yes

Engine
Select the OCR engine used to run the current task and create the output document. Depending on the current license, the available engines are:

  • Nuance OmniPage
  • Abbyy FineReader

Language
Select the language to use during OCR recognition. Multiple languages can be selected by holding the CTRL key while selecting the languages.

Please refer to the OCR Appendix chapter for the supported OCR languages.

Timeout
Specify the maximum time the OCR process may run. When the limit is reached, the process is terminated with an error. Use this option to prevent the OCR process from taking too long, hanging, or looping on particularly complex or malformed documents.

The timeout value is expressed in seconds.
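
The timeout behaviour described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the helper name `run_with_timeout` and the sample commands are assumptions made for illustration, and any long-running command stands in for the OCR process.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command and return (succeeded, message).

    Hypothetical helper: it only mimics the "terminate with an error
    after N seconds" behaviour described for the Timeout option.
    """
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # expressed in seconds, like the option
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False, f"timed out after {timeout_seconds} s"
    if completed.returncode != 0:
        return False, f"failed with exit code {completed.returncode}"
    return True, "completed"


if __name__ == "__main__":
    # A stand-in for a job that hangs on a malformed document.
    slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow_job, 1))
```

A value that is too low makes legitimate long jobs fail, so the limit is best chosen from the processing time of the largest documents expected.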

OmniPage

When the Nuance OmniPage engine is selected, additional format options are displayed.

Use frames
If enabled, Microsoft Word frames are used to group paragraphs and compose the document layout. Frames position content more accurately and so give a more precise layout, but text inside frames is harder to edit.

Abbyy

When the Abbyy FineReader engine is selected, additional format options are displayed.

Page synthesis
Specifies how the recognized text is laid out (synthesized) in the output Word document. Available options are:

  • Plain text: the text in the output Word document is formatted in a single column. Frames are not used. Paragraphs are retained, but font types and sizes are not.
  • Format paragraphs: paragraphs and font types and sizes are retained. Text formatting inside paragraphs is not retained.
  • Editable copy: produces a document that preserves the original format and text flow but allows easy editing. Page breaks are not guaranteed to be preserved in this mode; if that is important, consider Exact copy.
  • Exact copy: produces a document that maintains the formatting of the original. This option is recommended for documents with complex layouts, such as promotional booklets. However, it may limit the ability to change the text and formatting of the output document.

Keep pictures
If enabled, the original pictures are retained when the recognized text is exported to Word.

Keep text color
If enabled, the original text colors are retained when the recognized text is exported to Word.

Enhance local contrast
If enabled, the engine increases the local contrast of the image during preprocessing. This may improve recognition quality.

Info

The option is meaningful for color and gray images only.

The images for which this preprocessing method is effective include:

  • Photos or scans of documents with texture or pictures in the background. With the normal binarization procedure, characters that coincide with darker areas of the background may be lost or recognized unreliably. If this method is applied before recognition, such areas are detected and their contrast is increased, so that after binarization the characters stand out more distinctly.
  • Photos or scans of documents with a highly colorful background or text highlighting.
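
To see why local contrast matters before binarization, consider the following minimal sketch. It is not the engine's algorithm: it assumes a single row of grayscale pixels, estimates the background as the brightest pixel in a small sliding window, and rescales each pixel against that estimate. The function names, the window radius, and the threshold of 128 are all illustrative assumptions.

```python
def binarize(row, threshold=128):
    """Global binarization: 1 = white (background), 0 = black (text)."""
    return [1 if pixel >= threshold else 0 for pixel in row]


def enhance_local_contrast(row, radius=2):
    """Rescale each pixel relative to the brightest pixel near it.

    Illustrative stand-in for local contrast enhancement: the local
    maximum is used as an estimate of the background brightness.
    """
    enhanced = []
    for i, pixel in enumerate(row):
        window = row[max(0, i - radius): i + radius + 1]
        background = max(max(window), 1)  # guard against division by zero
        enhanced.append(min(255, round(255 * pixel / background)))
    return enhanced


if __name__ == "__main__":
    # Background fades from light (230) to dark (80); text pixels sit at
    # index 3 (value 40) and index 12 (value 20).
    row = [230, 220, 210, 40, 190, 180, 170, 160,
           150, 140, 130, 120, 20, 100, 90, 80]

    # Without enhancement the dark end of the row turns fully black,
    # so the second text pixel merges with its background.
    print(binarize(row))

    # With enhancement only the two text pixels remain black.
    print(binarize(enhance_local_contrast(row)))
```

In this sketch the character on the dark part of the background is lost by plain global thresholding and recovered once each pixel is judged against its own neighbourhood.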

Remove noise
If enabled, the engine reduces noise in the image. Available modes are:

  • White noise: this mode may be useful, for example, for uncompressed images with ISO less than 800 or for reduced images.
  • Correlated noise: this mode may be useful, for example, for JPEG photos with high compression settings.

Info

The option can be used only for color and 8-bit gray images.

Title, Author, Subject, Keywords
Enter text to set as the Title, Author, Subject, or Keywords property of the Microsoft Office document, or click the Variable button to select a variable that supplies the value for the property.
