Document Digitization
This pipeline extracts text from a digital or scanned document. A custom-trained Prima layout model detects lines and layouts (header, footer, paragraph, table, cell, image), and the Anuvaad OCR model performs the OCR.
GitHub repo:
API contract:
Upload a PDF or image file using the upload API:
Take the upload ID from the upload response and use it as the input path in the DD2.0 request.
Initiate the Workflow:
DD2.0 Input:
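As a rough illustration of the upload-then-initiate flow above, the following minimal Python sketch builds a workflow request body that references an already uploaded file. It is not the documented DD2.0 contract: the field names (`workflowCode`, `files`, `path`, `type`, `locale`) and the `DD2_WORKFLOW_CODE` value are hypothetical placeholders, and the real names should be taken from the API contract.

```python
import json

# Hypothetical placeholder; the real workflow code comes from the API contract.
WORKFLOW_CODE = "DD2_WORKFLOW_CODE"


def build_workflow_request(upload_id: str, file_type: str, locale: str) -> dict:
    """Build a workflow-initiation body that points at an uploaded file.

    All field names below are assumptions made for illustration only.
    """
    if not upload_id:
        raise ValueError("upload_id is required; upload the file first")
    return {
        "workflowCode": WORKFLOW_CODE,
        "files": [
            {
                "path": upload_id,  # upload ID returned by the upload API
                "type": file_type,  # e.g. "pdf" or "png"
                "locale": locale,   # document language, e.g. "en"
            }
        ],
    }


if __name__ == "__main__":
    body = build_workflow_request("example-upload-id.pdf", "pdf", "en")
    print(json.dumps(body, indent=2))
```

The resulting body would then be sent to the workflow URL with whatever HTTP client and authentication the deployment requires.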
Word detector
Input: PDF or image
Output: List of pages with detected lines and page information.
Layout detector
Input: Output of the word detector
Output: List of pages with detected layouts and lines.
Block segmenter
Input: Output of the layout detector
Output: Lines and words collated at the layout level.
OCR (Google Vision)
Input: Output of the block segmenter
Output: Text collated at the word, line, and paragraph level, using Google Vision as the OCR engine.
OCR (Anuvaad OCR model)
Input: Output of the block segmenter
Output: Text collated at the word, line, and paragraph level, using the Anuvaad OCR model.
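The input/output chain above can be sketched as a sequence of stages in which each stage consumes the previous stage's pages. This is only a schematic under stated assumptions: the stage functions are empty stand-ins rather than the real detectors, and the page keys (`lines`, `layouts`, `blocks`, `text`, `ocr_engine`) are hypothetical. The two OCR options are modelled as alternatives because both take the block segmenter output as input.

```python
from typing import Callable, Dict, List

Page = Dict[str, object]
Stage = Callable[[List[Page]], List[Page]]


def word_detector(pages: List[Page]) -> List[Page]:
    # Stand-in: a real detector would fill in the detected lines per page.
    return [{**page, "lines": []} for page in pages]


def layout_detector(pages: List[Page]) -> List[Page]:
    # Stand-in: adds detected layouts (header, paragraph, table, ...).
    return [{**page, "layouts": []} for page in pages]


def block_segmenter(pages: List[Page]) -> List[Page]:
    # Stand-in: collates lines and words under each layout.
    return [{**page, "blocks": []} for page in pages]


def make_ocr_stage(engine: str) -> Stage:
    # Returns an OCR stand-in that tags each page with the chosen engine name.
    def ocr(pages: List[Page]) -> List[Page]:
        return [{**page, "text": "", "ocr_engine": engine} for page in pages]
    return ocr


def run_pipeline(pages: List[Page], ocr_engine: str = "anuvaad") -> List[Page]:
    # Run the stages in order, feeding each stage the previous output.
    stages: List[Stage] = [
        word_detector,
        layout_detector,
        block_segmenter,
        make_ocr_stage(ocr_engine),
    ]
    for stage in stages:
        pages = stage(pages)
    return pages


if __name__ == "__main__":
    print(run_pipeline([{"page_no": 1}], ocr_engine="google_vision"))
```

Keeping each stage as a function from pages to pages makes it straightforward to swap the OCR engine without touching the earlier stages.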