Smart Document

Info

This module is licensed on the number of pages to be processed.

Smart Document allows intelligent, advanced text operations and manipulation without using either fixed zones or preconfigured rules.

Recognition is performed by the Microsoft AI recognition engine, and all output results are processed and returned as variables and/or as a modified document.

Smart Document offers the following AI text analysis features:

  • Classification
  • Personal Information
  • Summarization
  • Translation

One or more features can be enabled and executed together in a single Smart Document workflow module instance.

Warning

If multiple features are used, whether in the same Smart Document module or in separate modules, pages are counted for each feature.
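
As a rough illustration of this counting rule, the Python sketch below multiplies the number of pages by the number of enabled features. The function name and parameters are hypothetical and not part of the product; the actual license accounting is done by the product itself and may differ in detail.

```python
def licensed_pages(page_count: int, enabled_features: int) -> int:
    """Estimate the pages counted against the license.

    Assumption: every enabled feature counts every processed page once.
    """
    if page_count < 0 or enabled_features < 0:
        raise ValueError("counts must be non-negative")
    return page_count * enabled_features


# A 10-page document with two features enabled (e.g. Classification and Summarization)
print(licensed_pages(10, 2))  # 20
```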

Warning

The Smart Document module requires an internet connection to contact the Microsoft AI recognition engine.

Read about the data, privacy, and security of the Microsoft Azure AI engines.

When processing multipage documents, features that export variables extract the output variables on every page. Each variable name is suffixed with _PY, which stands for Page Y, where Y is the page number. For example:

SMARTDOCUMENT_FEATURE1_P1
SMARTDOCUMENT_FEATURE2_P2

The base variable name, without the _PY suffix, contains the result from the last page of the document.
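
The naming scheme can be sketched in Python as follows. This is only an illustration of the convention described above; the helper names and the sample result values are hypothetical, and the module creates these variables itself.

```python
def page_variable(base_name: str, page: int) -> str:
    """Return the per-page variable name, e.g. NAME_P3 for page 3."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return f"{base_name}_P{page}"


def collect_results(base_name: str, page_results: list[str]) -> dict[str, str]:
    """Map per-page results to variable names.

    The base name (without _PY) holds the result of the last page.
    """
    variables = {
        page_variable(base_name, page): result
        for page, result in enumerate(page_results, start=1)
    }
    if page_results:
        variables[base_name] = page_results[-1]
    return variables


# Hypothetical results for a two-page document
variables = collect_results("SMARTDOCUMENT_FEATURE1", ["first", "second"])
print(variables)
```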

The module creates custom variables during processing. Check the Variables list for more details.

The left-hand side menu shows the available settings sections. Settings are displayed according to the selected section.

Classification

Classification analyzes the documents and classifies them by extracting main concepts in the full document text. The concepts are returned in the form of keywords or key phrases with several main keywords.

For example, in the text “The food was delicious and the staff were wonderful.”, Classification might return the main topics “food” and “wonderful staff”.

Enable
If enabled it will run the Classification task in the current Smart Document module.

Language
Select the language of the source document. By default it is automatically detected. Check the list below for the full language support.

Variable returned
Insert the name of the variable which will contain a comma separated list of the computed keywords and key phrases.

Example:

%SMARTDOCUMENT_CLASSIFICATION_RESULT% = food,wonderful staff
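
If the returned value is consumed by an external script, it can be split back into individual key phrases. The following Python sketch assumes that the key phrases themselves contain no commas; the function name is hypothetical.

```python
def parse_keywords(value: str) -> list[str]:
    """Split a comma separated result into trimmed, non-empty key phrases."""
    return [item.strip() for item in value.split(",") if item.strip()]


print(parse_keywords("food,wonderful staff"))  # ['food', 'wonderful staff']
```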

Personal Information

Personal Information analyzes the documents and identifies and classifies sensitive information inside the document text. The identified personal information is returned, together with its category, in a list of variables.

For example: phone numbers, email addresses, and forms of identification.

Enable
If enabled it will run the Personal Information task in the current Smart Document module.

Language
Select the language of the source document. By default it is automatically detected. Check the list below for the full language support.

Variable returned
Insert the name of the variable which will contain the identified information. The name is used as the prefix for a list of variables, one for each item of information identified, with a unique incrementing identifier. For each item, additional variables with the category and the confidence are created as well.

Example:

%SMARTDOCUMENT_PII_RESULT_X% = John Smith
%SMARTDOCUMENT_PII_RESULT_X_CATEGORY% = First name
%SMARTDOCUMENT_PII_RESULT_X_CONFIDENCE% = 99

Where X is the incrementing number of the information found.
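
The three variables that belong to the same item can be grouped by their number X. The Python sketch below assumes the variables are available as a plain dictionary of names (without the % signs) and values, and that the confidence is a numeric string; the `PiiEntry` type and `group_pii` function are hypothetical helpers, not part of the product.

```python
import re
from dataclasses import dataclass


@dataclass
class PiiEntry:
    text: str
    category: str
    confidence: float


def group_pii(variables: dict[str, str],
              prefix: str = "SMARTDOCUMENT_PII_RESULT") -> dict[int, PiiEntry]:
    """Group PREFIX_X, PREFIX_X_CATEGORY and PREFIX_X_CONFIDENCE by X."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)$")
    entries: dict[int, PiiEntry] = {}
    for name, value in variables.items():
        match = pattern.match(name)
        if not match:
            continue  # skips the _CATEGORY and _CONFIDENCE companions
        entries[int(match.group(1))] = PiiEntry(
            text=value,
            category=variables.get(f"{name}_CATEGORY", ""),
            confidence=float(variables.get(f"{name}_CONFIDENCE", "0")),
        )
    return entries


sample = {
    "SMARTDOCUMENT_PII_RESULT_1": "John Smith",
    "SMARTDOCUMENT_PII_RESULT_1_CATEGORY": "First name",
    "SMARTDOCUMENT_PII_RESULT_1_CONFIDENCE": "99",
}
print(group_pii(sample))
```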

Summarization

Summarization analyzes the documents and generates a summary for them. The output summary is exported either as text inside a destination variable or as a document in the selected destination format.

Enable
If enabled it will run the Summarization task in the current Smart Document module.

Language
Select the language of the source document. By default it is automatically detected. Check the list below for the full language support.

Type
Select the type of summarization to execute. Available types are:

  • Text (create a full open piece of text)
  • Bullets (create a certain number of single bullet sentences)

Amount
If a summarization of type Bullets is generated, this value indicates the number of bullet sentences to generate.

Variable returned
Insert the name of the variable which will contain the output summary as text, either as full text or as bullet sentences.

Path(s)
Enter an output path manually, or select Browse to browse to the right folder. You can also insert variables by selecting the Variables button on the right. For example, use

C:\Output\%SUBFOLDER%\

to store the summary text file in a subfolder based on a variable. To do so, you must first set up a question which lets the user choose the root folder.

Info

For network paths, specify the full UNC path starting with \\. Mapped drives are not allowed because Scanshare runs as a service, and mapped drives do not exist in the service context.

Info

To allow network authentication on virtual UNC shares, put \# before your UNC path, e.g. \#YOUR_UNC_PATH.

Info

Multiple paths can be inserted, separated by ; (semicolon). In that case, the output summary text file is created in every path specified, with the same filename.
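
How a Path(s) value with variables and several entries resolves can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the server and share names are made up, and the product's own variable expansion may handle edge cases differently (here, unknown variables are simply left in place).

```python
import re


def expand_variables(template: str, values: dict[str, str]) -> str:
    """Replace %NAME% placeholders with their values; unknown names stay untouched."""
    return re.sub(
        r"%([A-Za-z0-9_]+)%",
        lambda m: values.get(m.group(1), m.group(0)),
        template,
    )


def split_paths(field: str, values: dict[str, str]) -> list[str]:
    """Split a semicolon separated Path(s) field and expand variables in each entry."""
    return [
        expand_variables(part.strip(), values)
        for part in field.split(";")
        if part.strip()
    ]


# One local path with a variable and one UNC path (hypothetical server and share)
paths = split_paths(
    r"C:\Output\%SUBFOLDER%\;\\server\share\summaries",
    {"SUBFOLDER": "Invoices"},
)
for path in paths:
    print(path)
```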

Filename
Enter the filename for the output summary text file, or click on the Variables button on the right, to select a variable which will contain the filename.

Info

The filename field here allows the use of the UNIQUECOUNTER variable. Please refer to the Variables appendix for further information on the variable use.

Info

If Path(s) and/or Filename are left empty, the output summary text is only returned inside the variable and no output document is created.

The output summary text file is a rasterized PDF document.

Username and Password
Enter a username and password (if needed) or select a variable which will contain the username / password, to access the restricted network output folder.

Warning

Authentication is not required for local folders (e.g. paths starting with a local drive letter). If you specify credentials in that case, an error is generated during storing, because Scanshare attempts to obtain the UNC root authentication point.
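
A configuration can be checked for this conflict before it is deployed. The Python sketch below is a hypothetical helper, not a product feature; it only looks at whether the path starts with a drive letter and cannot tell a local drive from a mapped one.

```python
import re


def is_local_drive_path(path: str) -> bool:
    """True for paths that start with a drive letter, such as C:\\Output."""
    return re.match(r"^[A-Za-z]:", path) is not None


def credentials_conflict(path: str, username: str) -> bool:
    """True when credentials are set for a drive-letter path, the case the warning describes."""
    return bool(username) and is_local_drive_path(path)


print(credentials_conflict(r"C:\Output", "scanuser"))       # True
print(credentials_conflict(r"\\server\share", "scanuser"))  # False
```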

Translation

Translation automatically translates the input document into a target language, retaining the format and layout of the original source document. The translation is applied directly to the document currently being processed, and the translated text is also exported to a module variable.

Warning

When Translation is used, the document currently being processed is replaced by the translated one. Every module after the Smart Document module works on the translated copy of the document.

If you also need to keep the original document inside the workflow, split the process route before the Smart Document module.

Enable
If enabled it will run the Translation task in the current Smart Document module.

Language
Select the destination language into which the source document needs to be translated. Check the list below for the full language support.

Engine

Info

Use the Engine section only when instructed by your reseller, who will provide the correct data to insert. A wrong configuration in this section may prevent the Smart Document module from working.

Service Key
Enter the AI engine Service Key, or click on the Variables button on the right to select a variable which will contain the key.

FormName
Enter the AI engine form name to use for recognition, or click on the Variables button on the right to select a variable which will contain the name.

Languages

Each Smart Document AI feature supports a different set of languages. The table below shows which languages each feature supports.

Language  Classification / Personal Information / Summarization
Afrikaans  xx
Albanian  xx
Amharic  xx
Arabic  xxx
Armenian  xx
Assamese  xx
Azerbaijani  xx
Basque  xx
Belarusian  x
Bengali  xx
Bosnian  xx
Breton  x
Bulgarian  xx
Burmese  xx
Catalan  xx
Chinese-Simplified  xxx
Chinese-Traditional  xx
Croatian  xx
Czech  xx
Danish  xx
Dutch  xx
English  xxx
Esperanto  x
Estonian  xx
Filipino  x
Finnish  xx
French  xxx
Galician  xx
Georgian  x
German  xxx
Greek  xx
Gujarati  xx
Hausa  x
Hebrew  xxx
Hindi  xx
Hungarian  xx
Indonesian  xx
Irish  xx
Italian  xxx
Japanese  xxx
Javanese  xx
Kannada  xx
Kazakh  xx
Khmer  xx
Korean  xxx
Kurdish (Kurmanji)  xx
Kyrgyz  xx
Lao  xx
Latin  x
Latvian  xx
Lithuanian  xx
Macedonian  xx
Malagasy  xx
Malay  xx
Malayalam  xx
Marathi  xx
Mongolian  xx
Nepali  xx
Norwegian (Bokmål)  xx
Odia  xx
Oromo  x
Pashto  xx
Persian  xx
Polish  xxx
Portuguese (Brazil)  xxx
Portuguese (Portugal)  xx
Punjabi  xx
Romanian  xx
Russian  xx
Sanskrit  x
Scottish Gaelic  x
Serbian  xx
Sindhi  x
Sinhala  x
Slovak  xx
Slovenian  xx
Somali  xx
Spanish  xxx
Sudanese  x
Swahili  xx
Swazi  x
Swedish  xx
Tamil  xx
Telugu  xx
Thai  xx
Turkish  xx
Ukrainian  xx
Urdu  xx
Uyghur  xx
Uzbek  xx
Vietnamese  xx
Welsh  xx
Western Frisian  x
Xhosa  x
Yiddish  x

For the Translation feature, language support may differ between the source document language and the selected destination language.

Language  Source language / Destination language
Afrikaans  xx
Albanian  xx
Amharic  x
Arabic  xx
Armenian  x
Assamese  x
Azerbaijani (Latin)  xx
Bengali  x
Bashkir  xx
Basque  xx
Bosnian (Latin)  xx
Bulgarian  xx
Cantonese (Traditional)  xx
Catalan  xx
Chinese (Literary)  xx
Chinese Simplified  xx
Chinese Traditional  xx
Croatian  xx
Czech  xx
Danish  xx
Dari  x
Divehi  x
Dutch  xx
English  xx
Estonian  xx
Faroese  xx
Fijian  xx
Filipino  xx
Finnish  xx
French  xx
French (Canada)  xx
Galician  xx
Georgian  x
German  xx
Greek  x
Gujarati  x
Haitian Creole  xx
Hebrew  x
Hindi  xx
Hmong Daw (Latin)  xx
Hungarian  xx
Icelandic  xx
Indonesian  xx
Interlingua  xx
Inuinnaqtun  xx
Inuktitut  x
Inuktitut (Latin)  xx
Irish  xx
Italian  xx
Japanese  xx
Kannada  xx
Kazakh (Cyrillic)  xx
Kazakh (Latin)  xx
Khmer  x
Klingon  x
Klingon (plqaD)  x
Korean  xx
Kurdish (Arabic) (Central)  x
Kurdish (Latin) (Northern)  xx
Kyrgyz (Cyrillic)  xx
Lao  x
Latvian  xx
Lithuanian  xx
Macedonian  xx
Malagasy  xx
Malay (Latin)  xx
Malayalam  xx
Maltese  xx
Maori  xx
Marathi  xx
Mongolian (Cyrillic)  xx
Mongolian (Traditional)  x
Myanmar (Burmese)  x
Nepali  xx
Norwegian  xx
Odia  x
Pashto  x
Persian  x
Polish  xx
Portuguese (Brazil)  xx
Portuguese (Portugal)  xx
Punjabi  xx
Queretaro Otomi  xx
Romanian  xx
Russian  xx
Samoan (Latin)  xx
Serbian (Cyrillic)  xx
Serbian (Latin)  xx
Slovak  xx
Slovenian  xx
Somali  xx
Spanish  xx
Swahili (Latin)  xx
Swedish  xx
Tahitian  xx
Tamil  xx
Tatar (Latin)  xx
Telugu  xx
Thai  x
Tibetan  x
Tigrinya  x
Tongan  xx
Turkish  xx
Turkmen (Latin)  xx
Ukrainian  xx
Upper Sorbian  xx
Urdu  x
Uyghur (Arabic)  x
Uzbek (Latin)  xx
Vietnamese  xx
Welsh  xx
Yucatec Maya  xx
Zulu  xx