Commands like this
sudo apt-get install python3
must be executed in a terminal.
Currently we only support Ubuntu 18.04 and 18.10.
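As a rough pre-flight check, the supported releases named above can be tested for in the shell. This is only a sketch: the function names is_supported_release and check_this_machine are hypothetical, and it assumes that /etc/os-release provides the ID and VERSION_ID fields, as it does on Ubuntu.

```shell
# Succeed only for the releases this guide names as supported
# (Ubuntu 18.04 and 18.10). Function names are illustrative.
is_supported_release() {
    local id="$1" version="$2"
    [ "$id" = "ubuntu" ] || return 1
    case "$version" in
        18.04|18.10) return 0 ;;
        *) return 1 ;;
    esac
}

# Read ID and VERSION_ID from /etc/os-release inside a subshell so the
# sourced variables do not leak into the calling shell.
check_this_machine() {
    (
        . /etc/os-release
        if is_supported_release "${ID:-}" "${VERSION_ID:-}"; then
            echo "supported: ${ID:-unknown} ${VERSION_ID:-unknown}"
        else
            echo "unsupported: ${ID:-unknown} ${VERSION_ID:-unknown}"
        fi
    )
}
```

Calling check_this_machine in a terminal prints which case applies to the current system.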
You need to have Python 3.6 installed:
sudo apt-get install python3 python3-virtualenv
Set up a dedicated virtual environment to install the Python packages into:
python3 -m virtualenv ~/env-ocrd
source ~/env-ocrd/bin/activate
pip install ocrd
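If pip appears to install packages somewhere unexpected, it may help to confirm that the environment is really active. The following is a minimal sketch, assuming the activate script exports VIRTUAL_ENV (virtualenv's standard behaviour); the helper names path_in_env and check_env are hypothetical.

```shell
# Succeed if the given executable path lies inside <env_dir>/bin/.
path_in_env() {
    local exe="$1" env_dir="$2"
    case "$exe" in
        "$env_dir"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the pip found on PATH belongs to the active environment.
check_env() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "no virtual environment is active"
        return 1
    fi
    if path_in_env "$(command -v pip)" "$VIRTUAL_ENV"; then
        echo "pip comes from $VIRTUAL_ENV"
    else
        echo "pip is not inside $VIRTUAL_ENV"
        return 1
    fi
}
```

Running check_env after sourcing ~/env-ocrd/bin/activate should report that pip comes from the environment directory.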
Verify the installation by calling the ocrd command-line tool:
ocrd --help
The output should be similar to:
Usage: ocrd [OPTIONS] COMMAND [ARGS]...
CLI to OCR-D
Options:
--version Show the version and exit.
-l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
Log level
--help Show this message and exit.
Commands:
bashlib Work with bash library
ocrd-tool Work with ocrd-tool.json JSON_FILE
process Process a series of tasks
workspace Working with workspace
zip Bag/Spill/Validate OCRD-ZIP bags
Functionality is encapsulated in module projects which can be individually installed and combined.
We recommend the following projects to get started:
- ocrd_ocropy
- ocrd_kraken
- ocrd_tesserocr
pip install ocrd_ocropy
Verify:
ocrd-ocropy-segment --help
pip install ocrd_kraken
Verify:
ocrd-kraken-binarize --help
ocrd_tesserocr requires tesseract to be installed in addition to the module project:
sudo apt-get install libtesseract-dev tesseract-ocr
pip install ocrd_tesserocr
Verify:
ocrd-tesserocr-recognize --help
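The individual checks above can also be run in one go. This is a sketch only: missing_commands is a hypothetical helper, and it merely tests whether each command name resolves on the PATH, not whether the tool works correctly.

```shell
# Print every command name from the arguments that cannot be found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

With the virtual environment active, it could be called like this; empty output means all four tools were found:

```shell
missing_commands ocrd ocrd-ocropy-segment ocrd-kraken-binarize ocrd-tesserocr-recognize
```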