Technology Stack
| Component | Details |
| --- | --- |
| Apache Kafka | The translator and OpenNMT are integrated through Kafka messaging. |
| MongoDB | Primary data storage. |
| Redis | Secondary in-memory storage. |
| Cloud Storage | Samba storage is used to store user input files. |
| NGINX | Serves as a redirection server and also handles system-level configuration. NGINX acts as the gateway. |
| Zuul | API gateway that applies filters to client requests and authenticates, authorizes, and throttles them. |
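
The Apache Kafka entry above describes the translator and OpenNMT exchanging messages rather than calling each other directly. A minimal sketch of that request/reply pattern follows, using Python's standard `queue.Queue` as a stand-in for Kafka topics so that it runs without a broker. The topic names, message fields, and the `fake_translate` function are all hypothetical and are not taken from the project.

```python
import json
import queue
import threading

# In-memory stand-ins for two Kafka topics (names are hypothetical).
translate_requests = queue.Queue()
translate_responses = queue.Queue()


def fake_translate(text: str) -> str:
    """Placeholder for the OpenNMT model call; it only upper-cases the text."""
    return text.upper()


def nmt_worker() -> None:
    """Consume requests, 'translate' them, and publish replies."""
    while True:
        raw = translate_requests.get()
        if raw is None:  # sentinel value: stop the worker
            break
        msg = json.loads(raw)
        reply = {"id": msg["id"], "translation": fake_translate(msg["text"])}
        translate_responses.put(json.dumps(reply))


def request_translation(msg_id: str, text: str, timeout: float = 5.0) -> dict:
    """Publish one request and wait for the next reply.

    A real client would match the reply's id against msg_id; this sketch
    sends a single request, so the next reply is the one it wants.
    """
    translate_requests.put(json.dumps({"id": msg_id, "text": text}))
    return json.loads(translate_responses.get(timeout=timeout))


def run_demo() -> dict:
    """Start the worker, send one request, then shut the worker down."""
    worker = threading.Thread(target=nmt_worker, daemon=True)
    worker.start()
    reply = request_translation("req-1", "hello")
    translate_requests.put(None)
    worker.join(timeout=5.0)
    return reply
```

Calling `run_demo()` sends one request through the queues and returns the worker's reply. With a real broker, the two queues would be replaced by a producer and a consumer on separate topics, while the JSON envelope and the id-based correlation would stay the same.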

AI/ML Assets

| Component | Details |
| --- | --- |
| PRIMA | Layout detection model. |
| Google Vision | Used for OCR in Document Digitization v1.0 and v1.5. Replaced by a custom-trained Tesseract model in later versions. |
| CRAFT | Used for line detection. |
| Tesseract | Custom-trained Tesseract model used for OCR. |
| OpenNMT | Custom-trained OpenNMT model used for translation. |