Loading...
Loading...
Loading...
This page is dedicated to simple benchmarking of various tesseract version and options.
View source# Benchmarks
This page is dedicated to simple benchmarking of various tesseract version and options.
As input image for testing is used image from [issue 236](https://github.com/tesseract-ocr/tesseract/issues/263).
## Results
| Build | [tessdata_best ](https://github.com/tesseract-ocr/tessdata_best)| [tessdata_fast](https://github.com/tesseract-ocr/tessdata_fast) | [tessdata ](https://github.com/tesseract-ocr/tessdata)|
| --- | ---: | ---: | ---: |
| 305 | - | - | 2.4713 |
| 413noavx | 37.6052 | 5.1589 | 10.1519 |
| 413avx | 12.7300 | 2.9538 | 4.0860 |
| 501 | 6.1981 | 2.1241 | 2.9107 |
| 501ap | 6.1369 | 2.1254 | 2.9221 |
| 501openmp | 3.4590 | 1.9612 | 2.3554 |
## Information about testing environment
* Windows 10 64bit
* compiler: VS 2019
* CPU: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz 6 cores
* memory: 16GB RAM
Tested with python code
```py
import timeit
import time
import os
import pytesseract
start_time = time.time()
tess_exe = r"msvc.v5.openmp\tesseract.exe"
test_image = r"i263_speed.jpg"
os.environ['TESSDATA_PREFIX'] = r"tessdata_best\tessdata"
code_to_test = """
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"{}"
pytesseract.pytesseract.image_to_string(r"{}", lang = 'eng')
"""
runs = 15
elapsed_time = timeit.timeit(code_to_test.format(tess_exe, test_image), number=runs)/runs
print("\nDuration:", elapsed_time)
```
## Tesseract build info
information provided by `tesseract -v`
### 3.05
#### 305
It uses the legacy engine.
```sh
tesseract 3.05.02
leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
```
### 4.1
#### 413noavx
Build without AVX2/AVX/SSE4 support
```sh
tesseract 4.1.3
leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
```
#### 413avx
Build with AVX2/AVX/SSE4 support
```sh
tesseract 4.1.3-1-ge9986
leptonica-1.83.0 (Jan 26 2022, 19:15:03) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
```
### 5.0
#### 501
AVX2 supported
```sh
tesseract 5.0.1
leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV
```
#### 501ap
build with: `cmake -E env CXXFLAGS="/Qpar /fp:fast" cmake ..`
```sh
tesseract 5.0.1
leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV
```
#### 501openmp
The OpenMP build is known to waste a lot of CPU time.
As several users report problems, it is disabled by default in version 5.0.1 and later releases.
For other versions (>= 4.x) it is suggested to use the environment variable `OMP_THREAD_LIMIT=1`.
Input from OpenMP experts would be appreciated.
```sh
tesseract 5.0.1
leptonica-1.83.0 (Dec 17 2021, 17:33:37) [MSC v.1929 LIB Release x64]
libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 2019
Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
Found libcurl/7.75.0 zlib/1.2.11 libssh2/1.10.1_DEV
```
[← Back to docs](README.md)
title: 'LabelFusion: Learning to Fuse LLMs and Transformer Classifiers for Robust Text Classification'
+ [Learning and practice of high performance computing](https://github.com/cjmcv/hpc)
title: Ruby 2.7 changes