Tesseract Open Source OCR Engine
Tesseract is a free optical character recognition (OCR) engine originally developed at Hewlett-Packard and currently developed by Google. It is a raw OCR engine: it has no document layout analysis, no output formatting, and no graphical user interface. It processes only a TIFF or BMP image of a single column of text and produces text from it. It can distinguish fixed-pitch from proportional text. The engine ranked among the top three for character accuracy in 1995. The source code reads a binary, grey, or color image and outputs text.
Tesseract can process English, French, Italian, German, Spanish, Brazilian Portuguese, and Dutch, and can be trained to work with other languages as well.
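As a rough illustration of the image-in, text-out workflow described above, the following sketch builds a `tesseract` command line for one single-column image. The file names `page.tif` and `page` are hypothetical, and the helper `ocr_cmd` is an illustrative name, not part of the package. It assumes the `tesseract` binary from this package is on the `PATH` and that the traineddata for the chosen language code (for example `eng` or `fra`) is installed; `--psm 4` asks the engine to treat the page as a single column of text.

```shell
# Print the tesseract invocation for one image (illustrative helper, not part of the package).
# $1 = input image, $2 = output base (tesseract writes "$2.txt"), $3 = language code (default: eng)
ocr_cmd() {
    printf 'tesseract %s %s -l %s --psm 4\n' "$1" "$2" "${3:-eng}"
}

# Show the command for a hypothetical scan, then run it only if both the
# tesseract binary and the input file are actually present.
ocr_cmd page.tif page eng
if command -v tesseract >/dev/null 2>&1 && [ -f page.tif ]; then
    tesseract page.tif page -l eng --psm 4
fi
```

Printing the command first makes it easy to check the arguments before any OCR run; the guard keeps the sketch harmless on a machine where the package or the sample image is missing.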
- Developed at Publishing
- Sources inherited from project openSUSE:Factory
- 1 derived package
- Checkout Package:
  osc -A https://api.opensuse.org checkout home:seife:Factory/tesseract-ocr && cd $_
Source Files
Filename | Size (bytes) | Size |
---|---|---|
tesseract-5.3.0.tar.gz | 1913678 | 1.83 MB |
tesseract-ocr.changes | 13057 | 12.8 KB |
tesseract-ocr.spec | 3986 | 3.89 KB |
Revision 11 (latest revision is 17)
Dominique Leuenberger (dimstar_suse) accepted request 1057918 from Martin Pluskal (pluskalm) (revision 11)
- Move unversioned libraries to main package
- Update to version 5.3.0:
  * Fix memory issues in ScrollView::MessageReceiver
  * autotools: Add rule for svpaint executable
  * Replace call of exit function by return statement in main function
  * Fix the build on CodeQL/Analyze by @arseniy-sonar in #3888
  * CI: Remove Ubuntu 18.04
  * configure.ac: fix build on aarch64_be
  * SW CI: Add paths filter
  * Create .mailmap
  * Fix tesseract.pc from cmake to match autotools
  * Update README.md
  * Fixed 2 errors
  * fix issue #3940 - remove colormap before thresholding
  * Update upload-artifact action
  * Update checkout action to version 3
  * Fix Markdownlint
  * Fix broken links in CONTRIBUTING.md
  * pdfrenderer.cpp: Ignore non-text blocks
  * lstm.train: allow .box from .raw.png too
  * Fix a number of performance issues (reported by Coverity Scan)
  * Fix training tools for legacy engine (issue #3925)
  * Fix function tesseract::WriteFeature (issue #3925)
  * Modernize function ObjectCache::DeleteUnusedObjects (fix issue with s…
  * More fixes for issue #3925
- Fixed packaging to include missing shared libs:
  * libcommon_training.so
  * libunicharset_training.so