Document Digitization

This pipeline extracts text from digital or scanned documents. A custom-trained Prima layout model detects lines and layouts (header, footer, paragraph, table, cell, image), and the Anuvaad OCR model performs the OCR.

GitHub repo: Anuvaad Document Processor

API contract: API Contract

How to Use

  1. Upload a PDF or image file using the upload API:

    Upload URL: https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file

  2. Copy the upload ID returned by the upload API into the DD2.0 input path.
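
As a rough illustration of the upload step, the sketch below builds (but does not send) a multipart POST to the upload URL using only the Python standard library. The form field name `file`, the `auth-token` header, and the helper name `build_upload_request` are assumptions made for illustration; the API contract is the authority on the real field names, headers, and response shape.

```python
import uuid
import urllib.request

UPLOAD_URL = "https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file"


def build_upload_request(filename: str, content: bytes, token: str) -> urllib.request.Request:
    """Build a multipart/form-data POST for the upload API without sending it."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        # Assumption: the server expects the file in a form field named "file".
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ])
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        # Assumption: authentication is passed in an "auth-token" header.
        "auth-token": token,
    }
    return urllib.request.Request(UPLOAD_URL, data=body, headers=headers, method="POST")


# Example: prepare a request for a small in-memory file.
request = build_upload_request("sample.pdf", b"%PDF-1.4 example bytes", token="<your-token>")
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return a response containing the upload ID; where exactly that ID sits in the response is not stated here, so check the API contract before parsing it.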

Microservices

Word Detector

  • Input: PDF or image

  • Output: List of pages with detected lines and page information.

GitHub repo: Word Detector Craft

API contract: Word Detector API Contract

How to use: Word Detector

  1. Upload a PDF or image file using the upload API:

    Upload URL: https://auth.anuvaad.org/anuvaad-api/file-uploader/v0/upload-file

  2. Initiate the Word Detector Workflow:

    WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate

    Word Detector Input:
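
The actual input payload is defined in the Word Detector API contract. Purely as a sketch of how a workflow might be initiated, the snippet below prepares a JSON POST to the WF URL. Every payload key (`workflowCode`, `files`, `path`, `type`, `locale`), the `auth-token` header, and the placeholder workflow code are hypothetical and must be replaced with the values from the contract. The same pattern would apply to the other stages, since they share this WF URL.

```python
import json
import urllib.request

WF_URL = "https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate"


def build_workflow_request(workflow_code: str, input_path: str, token: str,
                           file_type: str = "pdf", locale: str = "en") -> urllib.request.Request:
    """Build a JSON POST that initiates a workflow, without sending it."""
    # Hypothetical payload shape; the real keys come from the API contract.
    payload = {
        "workflowCode": workflow_code,
        "files": [{"path": input_path, "type": file_type, "locale": locale}],
    }
    headers = {
        "Content-Type": "application/json",
        # Assumption: authentication is passed in an "auth-token" header.
        "auth-token": token,
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(WF_URL, data=data, headers=headers, method="POST")


# Example: the input path is the upload ID obtained from the upload API.
request = build_workflow_request(
    workflow_code="<word-detector-workflow-code>",  # placeholder, not a real code
    input_path="<upload-id>",
    token="<your-token>",
)
print(json.loads(request.data)["files"][0]["path"])
```

Because the endpoint path contains `async`, the call presumably returns before processing finishes; how to poll for the result is not covered here.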

Layout Detector

  • Input: Output of word detector

  • Output: List of pages with detected layouts and lines.

GitHub repo: Layout Detector Prima

API contract: Layout Detector API Contract

How to use: Layout Detector

  1. Provide the word detector's output JSON file as the input path.

  2. Initiate the Layout Detector Workflow:

    WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate

    Layout Detector Input:

Block Segmenter

  • Input: Output of layout detector

  • Output: Lines and words collated at the layout level.

GitHub repo: Block Segmenter

API contract: Block Segmenter API Contract

How to use: Block Segmenter

  1. Provide the layout detector's output JSON file as the input path.

  2. Initiate the Block Segmenter Workflow:

    WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate

    Block Segmenter Input:

Google Vision OCR

  • Input: Output of block segmenter

  • Output: Text collation at word, line, and paragraph level using Google Vision as the OCR engine.

Tesseract OCR

  • Input: Output of block segmenter

  • Output: Text collation at word, line, and paragraph level using the Anuvaad OCR model.

GitHub repo: OCR Tesseract Server

API contract: Google Vision API Contract

How to use: Tesseract OCR

  1. Provide the block segmenter's output JSON file as the input path.

  2. Initiate the Tesseract OCR Workflow:

    WF URL: https://auth.anuvaad.org/anuvaad-etl/wf-manager/v1/workflow/async/initiate

    Tesseract OCR Input:

Google OCR (Tesseract Alternative)

How to use: Google Vision OCR

  1. Provide the block segmenter's output JSON file as the input path.

GitHub repo: OCR Google Vision Server

API contract: Google Vision API Contract
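
The input and output descriptions of the microservices imply a fixed order: each stage consumes the output of the one before it, and the two OCR engines are alternatives that both read the block segmenter's output. The sketch below encodes that dependency chain and lists the stages that must run to reach a chosen OCR engine. The names `UPSTREAM` and `stages_for` are illustrative only and are not part of the Anuvaad codebase.

```python
from typing import Dict, List, Optional

# Each stage maps to the stage whose output it consumes.
# None means the stage reads the uploaded PDF or image directly.
UPSTREAM: Dict[str, Optional[str]] = {
    "Word Detector": None,
    "Layout Detector": "Word Detector",
    "Block Segmenter": "Layout Detector",
    "Tesseract OCR": "Block Segmenter",
    "Google Vision OCR": "Block Segmenter",
}


def stages_for(target: str) -> List[str]:
    """Return the stages to run, in order, to produce the target stage's output."""
    if target not in UPSTREAM:
        raise ValueError(f"Unknown stage: {target}")
    chain: List[str] = []
    current: Optional[str] = target
    # Walk backwards from the target to the first stage, then reverse.
    while current is not None:
        chain.append(current)
        current = UPSTREAM[current]
    return list(reversed(chain))


for stage in stages_for("Google Vision OCR"):
    print(stage)
```

In practice, the output JSON path of each stage would be passed as the input path when initiating the next stage's workflow.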
