All the images are uploaded to Docker Hub in the public docker-namsel-ocr repository.
Put all the scanned images in a directory on your local computer [PATH], then start a container in the background from the latest image version:
docker run -itd --name namsel -v [PATH]:/home/namsel-ocr/data thubtenrigzin/docker-namsel-ocr:latest bash
For example, with PATH = ~/data and the tag "latest":
docker run -itd --name namsel -v ~/data:/home/namsel-ocr/data thubtenrigzin/docker-namsel-ocr:latest bash
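Before starting the container, it can help to confirm that the data directory really exists on the host. The following is a minimal sketch, not part of the image: the function name start_namsel and the DRY_RUN variable are hypothetical, and only the docker run command itself comes from the instructions above. By default it prints the command instead of running it.

```shell
# Hypothetical helper: start_namsel is NOT shipped with the image.
# DRY_RUN is an illustrative variable; it defaults to 1 (print only).
start_namsel() {
    data_dir=$1
    if [ ! -d "$data_dir" ]; then
        echo "error: $data_dir is not a directory" >&2
        return 1
    fi
    # Same command as documented above, with the host path filled in.
    set -- docker run -itd --name namsel \
        -v "$data_dir:/home/namsel-ocr/data" \
        thubtenrigzin/docker-namsel-ocr:latest bash
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"    # show the command without running it
    else
        "$@"         # actually start the container
    fi
}
```

Calling `start_namsel ~/data` prints the command; `DRY_RUN=0 start_namsel ~/data` runs it.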
Scantailor will prepare all the images stored in your local directory ~/data.
Optionally, you can pass a threshold value, and select the double-page layout by adding "l2".
docker exec namsel ./preprocess [threshold value] l2
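The recognition step reads the preprocessed images from ~/data/out, so a quick way to see whether preprocessing produced anything is to count the files there. This is a rough sketch under assumptions: count_preprocessed is a hypothetical helper, and it counts every regular file under "out", which may include files other than images.

```shell
# Hypothetical helper: count regular files in the "out" subdirectory
# of the data directory given as the first argument.
count_preprocessed() {
    out_dir="$1/out"
    if [ ! -d "$out_dir" ]; then
        echo 0       # preprocessing has not produced an "out" directory
        return 0
    fi
    # tr strips the padding that some wc implementations add
    find "$out_dir" -type f | wc -l | tr -d ' '
}
```

For example, `count_preprocessed ~/data` prints 0 when preprocessing has not been run yet.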
Namsel will start the pecha recognition based on a default configuration. It uses the ~/data/out directory (the scantailored images). The result file "ocr_output.txt" will be moved to your working directory ~/data.
docker exec namsel ./pecha
Namsel will start the book recognition based on a default configuration. It uses the ~/data/out directory (the scantailored images). The result file "ocr_output.txt" will be moved to your working directory ~/data.
docker exec namsel ./book
Namsel can also be launched directly with your own parameters. It uses the ~/data/out directory (the scantailored images). The result file "ocr_output.txt" will be moved to your working directory ~/data.
docker exec namsel ./namsel-ocr [parameter1 parameter2 etc...]
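After any of the recognition commands above, the result is expected as "ocr_output.txt" in the working directory. A small sketch for checking this from the host follows; check_result is a hypothetical helper and not part of the image.

```shell
# Hypothetical helper: check_result is NOT part of the image.
# Succeeds only if ocr_output.txt exists in the given data directory
# and is not empty.
check_result() {
    result="$1/ocr_output.txt"
    if [ -s "$result" ]; then
        echo "found: $result"
    else
        echo "missing or empty: $result" >&2
        return 1
    fi
}
```

A typical call would be `check_result ~/data`.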
An all-in-one command for book and pecha recognition, including the preprocessing stage.
The preprocessing threshold value can optionally be passed as a parameter, and the double-page layout selected by adding "l2".
For the book recognition:
docker exec namsel ./1book [threshold value] l2
And for the Pecha recognition:
docker exec namsel ./1pecha [threshold value]
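The two all-in-one forms can be wrapped in a single helper that picks the script from a mode name. This is only a sketch: all_in_one_cmd is a hypothetical function that prints the command rather than running it. It refuses "l2" in pecha mode simply because the documented pecha form above does not show that argument, not because the script is known to reject it.

```shell
# Hypothetical helper: prints the all-in-one command for a given mode.
# usage: all_in_one_cmd book|pecha [threshold] [l2]
all_in_one_cmd() {
    mode=${1:-}
    if [ $# -gt 0 ]; then shift; fi
    case $mode in
        book)  script=./1book ;;
        pecha) script=./1pecha ;;
        *) echo "error: mode must be 'book' or 'pecha'" >&2; return 1 ;;
    esac
    for arg in "$@"; do
        # Assumption: "l2" is only documented for the book form.
        if [ "$arg" = l2 ] && [ "$mode" = pecha ]; then
            echo "error: l2 is only documented for book mode" >&2
            return 1
        fi
    done
    # Append the remaining arguments only when there are any.
    echo "docker exec namsel $script${*:+ $*}"
}
```

For instance, `all_in_one_cmd book 120 l2` prints the book command with a threshold of 120 (an arbitrary placeholder value) and the double-page layout. Piping the output to `sh` would run it.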
Please refer to the namsel-ocr repository on GitHub for the source of the base Docker image.
All the Docker sources are hosted in the docker-namsel-ocr repository on GitHub.
- add the preprocess argument "l2" for the double-page book layout
- check whether a file or directory exists before deleting or moving it
- stability improvements
- The script ./namsel-ocr doesn't delete the directory ./data/out after the OCR completes
- delete the directory "out" after the recognition completes
- test whether the "out" directory exists, and use the non-scantailored scan images if preprocessing has not been run before the recognition
- use the tag "latest" for the basic image
- add an "all in one button" for the book and pecha recognition, including the preprocessing stage
- add the threshold parameter as an option for the preprocess
- fix an issue in the book script file so that book recognition works properly
First release of the project