Document Digitization
This pipeline extracts text from a digital or scanned document. Lines and layout regions (header, footer, paragraph, table, cell, image) are detected by a custom-trained Prima layout model, and OCR is performed by the Anuvaad OCR model.
GitHub repo: Anuvaad Document Processor
API contract: API Contract
How to Use
Upload a PDF or image file using the upload API:
Upload URL: https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file
Take the upload ID returned by the upload API and use it as the input path in the DD2.0 input.
Initiate the Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
DD2.0 Input:
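The two steps above (upload, then initiate the workflow) can be sketched in Python roughly as follows. This is a minimal sketch, not the project's client: only the two URLs come from this page. The multipart field name `file`, the `auth-token` header name, and the helper names are assumptions for illustration, and the workflow payload is deliberately left to the caller because its exact shape is defined in the API contract rather than reproduced here.

```python
import json
import uuid
import urllib.request

UPLOAD_URL = "https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file"
WF_URL = "https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate"


def build_upload_request(filename, content, token,
                         field="file", auth_header="auth-token"):
    """Build (but do not send) a multipart POST carrying one PDF or image.

    `field` and `auth_header` are assumed names; check the API contract.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        auth_header: token,
    }
    return urllib.request.Request(
        UPLOAD_URL, data=head + content + tail, headers=headers, method="POST"
    )


def build_workflow_request(payload, token, auth_header="auth-token"):
    """Build (but do not send) the JSON POST that initiates a workflow.

    `payload` is the input JSON from the API contract, with the upload ID
    placed in its input path by the caller.
    """
    headers = {"Content-Type": "application/json", auth_header: token}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(WF_URL, data=body, headers=headers, method="POST")


def send(request, timeout=60):
    """Send a prepared request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

In use, one would call `send(build_upload_request(...))`, read the upload ID out of the response, place it in the workflow input, and then call `send(build_workflow_request(...))`. Keeping request construction separate from sending makes it possible to inspect the headers and body before anything goes over the network.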
Microservices
Word Detector
Input: PDF or image
Output: List of pages with detected lines and page information.
GitHub repo: Word Detector Craft
API contract: Word Detector API Contract
How to use: Word Detector
Upload a PDF or image file using the upload API:
Upload URL: https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file
Initiate the Word Detector Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
Word Detector Input:
Layout Detector
Input: Output of word detector
Output: List of pages with detected layouts and lines.
GitHub repo: Layout Detector Prima
API contract: Layout Detector API Contract
How to use: Layout Detector
Use the output JSON file of the word detector as the input path.
Initiate the Layout Detector Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
Layout Detector Input:
Block Segmenter
Input: Output of layout detector
Output: Collation of line and word at layout level.
GitHub repo: Block Segmenter
API contract: Block Segmenter API Contract
How to use: Block Segmenter
Use the output JSON file of the layout detector as the input path.
Initiate the Block Segmenter Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
Block Segmenter Input:
Google Vision OCR
Input: Output of block segmenter
Output: Text collation at word, line, and paragraph level using Google Vision as the OCR engine.
Tesseract OCR
Input: Output of block segmenter
Output: Text collation at word, line, and paragraph level using the Anuvaad OCR model.
GitHub repo: OCR Tesseract Server
API contract: Google Vision API Contract
How to use: Tesseract OCR
Use the output JSON file of the block segmenter as the input path.
Initiate the Tesseract OCR Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
Tesseract OCR Input:
Google OCR (Tesseract Alternative)
How to use: Google Vision OCR
Use the output JSON file of the block segmenter as the input path.
Initiate the Google OCR Workflow:
WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate
Google OCR Input:
GitHub repo: OCR Google Vision Server
API contract: Google Vision API Contract
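The input/output relationships listed for the microservices above form a simple chain: each stage reads the output JSON of the previous one, and the two OCR engines are alternatives that both read the block segmenter output. A minimal sketch of that dependency map follows. The stage labels are hypothetical names chosen for illustration, not workflow codes or identifiers from the Anuvaad API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    # Stage whose output JSON this stage takes as its input path.
    # None means the stage reads the uploaded PDF or image directly.
    consumes: Optional[str]


# Hypothetical labels; the relationships mirror the Input lines above.
STAGES: Dict[str, Stage] = {
    "word_detector": Stage("word_detector", None),
    "layout_detector": Stage("layout_detector", "word_detector"),
    "block_segmenter": Stage("block_segmenter", "layout_detector"),
    "tesseract_ocr": Stage("tesseract_ocr", "block_segmenter"),
    "google_vision_ocr": Stage("google_vision_ocr", "block_segmenter"),
}


def stage_chain(final_stage: str) -> List[str]:
    """Return the stages to run, in order, to produce `final_stage` output."""
    chain: List[str] = []
    current: Optional[str] = final_stage
    while current is not None:
        stage = STAGES[current]  # raises KeyError for an unknown label
        chain.append(stage.name)
        current = stage.consumes
    chain.reverse()
    return chain


if __name__ == "__main__":
    print(" -> ".join(stage_chain("google_vision_ocr")))
```

Walking the map backwards from the chosen OCR engine gives the order in which the individual workflows would need to be initiated when the stages are run one at a time instead of through the combined DD2.0 workflow.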