Python OCR

A proof of concept for a small OCR (text recognition) app built with Python and tesseract-ocr, running inside Docker.

Prerequisites

docker@^24.0.3

How to use it?

Clone the repo by running git clone git@github.com:dmmarmol/python-ocr.git
Navigate into the project directory with cd python-ocr
Run make build (a Makefile shortcut that builds the Docker image)
Run make run to start a container from that image
Run make shell to attach a shell to the running container
Choose between bulk-processing images and processing text

Process images

This command reads every image in the images/source directory, extracts the text from each one, and combines the results into a new file inside images/output.

Steps

Place any .jpg or .jpeg file inside the images/source directory
From a shell attached to the container, go to the app/ directory and run python3 src/process-images.py
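
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of src/process-images.py: the function names, the per-file header format, and the use of the pytesseract and Pillow packages as the bridge to tesseract-ocr are all assumptions. The OCR step is passed in as a callable so the file handling can be tried without Tesseract installed.

```python
from pathlib import Path
from typing import Callable

# Suffixes the README mentions; matching is case-insensitive.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def tesseract_ocr(image_path: Path) -> str:
    """OCR a single image. Assumes pytesseract and Pillow are installed."""
    # Imported lazily so the rest of this sketch runs without these packages.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)


def process_images(
    source_dir: Path,
    output_file: Path,
    ocr: Callable[[Path], str] = tesseract_ocr,
) -> int:
    """Extract text from every JPEG in source_dir into one output file.

    Returns the number of images processed.
    """
    images = sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # One section per image, labelled with the file name (hypothetical format).
    sections = [f"--- {p.name} ---\n{ocr(p).strip()}" for p in images]
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text("\n\n".join(sections) + "\n", encoding="utf-8")
    return len(images)
```

Under these assumptions, calling process_images(Path("images/source"), Path("images/output/result.txt")) from the app/ directory would write one combined file; result.txt is a placeholder name, since the README does not state what the output file is called.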

Process text

This command reads every .txt file in the text/source directory, normalizes the text content of each file, and combines the results into a new file inside text/output.

Steps

Place any .txt file inside the text/source directory
From a shell attached to the container, go to the app/ directory and run python3 src/process-text.py
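
The README does not say what "normalize" means here, so the sketch below assumes one plausible reading: Unicode NFKC normalization plus whitespace cleanup. The function names and the blank-line separator between files are hypothetical and may differ from what src/process-text.py actually does.

```python
import re
import unicodedata
from pathlib import Path


def normalize_text(raw: str) -> str:
    """Clean up text under an assumed definition of 'normalize'."""
    # Fold compatibility characters (e.g. ligatures) into plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line, then squeeze three or more newlines into one blank line.
    text = "\n".join(line.strip() for line in text.splitlines())
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def process_text(source_dir: Path, output_file: Path) -> int:
    """Normalize every .txt file in source_dir and write one combined file.

    Returns the number of files processed.
    """
    files = sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".txt"
    )
    parts = [normalize_text(p.read_text(encoding="utf-8")) for p in files]
    output_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return len(files)
```

Keeping normalize_text separate from the file handling makes it easy to swap in different rules once the intended normalization is known.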

Commands

Build

Build the Docker image from the Dockerfile

make build

Run

Run a Docker container instance of the Docker image

make run

Restart docker container

make stop
make remove

Lastly, repeat the build and run commands (make build, then make run).
