Languages

Tesseract OCR

The Tesseract OCR engine supports the following languages:

Afrikaans               Greek (Ancient -1453)    Polish
Albanian                Greek (Modern 1453-)     Portuguese
Amharic                 Gujarati                 Pushto, Pashto
Arabic                  Haitian, Haitian Creole  Romanian, Moldavian, Moldovan
Assamese                Hebrew                   Russian
Azerbaijani             Hindi                    Sanskrit
Azerbaijani (Cyrillic)  Hungarian                Serbian
Basque                  Icelandic                Serbian (Latin)
Belarusian              Indonesian               Sinhala, Sinhalese
Bengali                 Inuktitut                Slovak
Bosnian                 Irish                    Slovenian
Bulgarian               Italian                  Spanish, Castilian
Burmese                 Italian (Old)            Spanish, Castilian (Old)
Catalan, Valencian      Japanese                 Swahili
Cebuano                 Javanese                 Swedish
Central Khmer           Kannada                  Syriac
Cherokee                Kazakh                   Tagalog
Chinese (Simplified)    Kirghiz, Kyrgyz          Tajik
Chinese (Traditional)   Korean                   Tamil
Croatian                Kurdish                  Telugu
Czech                   Lao                      Thai
Danish                  Latin                    Tibetan
Dutch, Flemish          Latvian                  Tigrinya
Dzongkha                Lithuanian               Turkish
English                 Macedonian               Uighur, Uyghur
Esperanto               Malay                    Ukrainian
Estonian                Malayalam                Urdu
Finnish                 Maltese                  Uzbek
Frankish                Marathi                  Uzbek (Cyrillic)
French                  Nepali                   Vietnamese
Galician                Norwegian                Welsh
Georgian                Oriya                    Yiddish
Georgian (Old)          Panjabi, Punjabi
German                  Persian
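
When Tesseract is driven directly from its command-line tool, languages are selected by traineddata code rather than by the names listed above. The sketch below is illustrative only and assumes the `tesseract` executable is installed and on the PATH. The `LANGUAGE_CODES` mapping is a small hand-picked subset, and the helper names (`lang_argument`, `parse_list_langs`, `installed_languages`) are hypothetical. An application that embeds Tesseract may expose language selection differently.

```python
import shutil
import subprocess

# Hand-picked subset of names from the table above, mapped to the
# corresponding Tesseract traineddata codes. Extend as needed and verify
# each code against your own installation.
LANGUAGE_CODES = {
    "English": "eng",
    "French": "fra",
    "German": "deu",
    "Chinese (Simplified)": "chi_sim",
    "Japanese": "jpn",
}


def lang_argument(names):
    """Build the value for Tesseract's -l option, e.g. 'eng+deu'."""
    return "+".join(LANGUAGE_CODES[name] for name in names)


def parse_list_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`.

    The header line ("List of available languages ...") is skipped;
    every other non-empty line is treated as one code.
    """
    codes = set()
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith("List of"):
            codes.add(line)
    return codes


def installed_languages():
    """Return the codes reported by the local tesseract, or an empty set."""
    if shutil.which("tesseract") is None:
        return set()
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some versions print the list to stderr instead of stdout.
    return parse_list_langs(result.stdout or result.stderr)
```

For example, `lang_argument(["English", "German"])` yields a string that can be passed as `tesseract input.png output -l eng+deu`. Comparing the requested codes with `installed_languages()` before running OCR shows whether a traineddata file is missing.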