Sunbird Anuvaad Overview

Overview

Project Anuvaad is an open-source project funded by the EkStep Foundation.

It was bootstrapped by the EkStep Foundation in late 2019 to make it easier to translate legal documents between English and Indic languages. The Anuvaad platform lets legal entities digitize and translate orders and judgements through an easy-to-use interface.

Anuvaad is an AI-based, open-source document translation platform for digitizing and translating documents in Indic languages at scale. It provides easy-to-edit capabilities on top of plug-and-play neural machine translation (NMT) models. Separate instances of Anuvaad are deployed for Diksha (NCERT), the Supreme Court of India (SUVAS), and the Supreme Court of Bangladesh (Amar Vasha).

Anuvaad leverages state-of-the-art AI/ML models, including NMT, OCR, and layout detection, to provide a high level of accuracy. Project Anuvaad was envisioned as an end-to-end open-source solution for document translation across multiple domains.

Project Anuvaad is driven by REST APIs, so any third-party system can use features such as sentence translation and layout detection.

NOTE: The documentation is still a work in progress. Feel free to contribute to it or raise an issue if the information you need is missing or not up to date. Explore the KT videos if you would like to dive deep into each module.

Features

Anuvaad offers a range of features that smooth the document translation process for the end user. The notable ones are highlighted below:

Document Digitization

Document digitization is the process of converting physical documents into digital formats, making them easily accessible and editable.

Layout Detection

Anuvaad is coupled with custom-trained layout detection models that identify and interpret a document's structure by recognizing key elements such as headings, paragraphs, tables, and images. This step is essential both for improving OCR accuracy and for preserving the document's layout and structure in the translated version.

Document Translation

Document translation involves converting text from one language to another, facilitating cross-lingual communication and information access. Anuvaad supports using NMT models directly from Bhashini Dhruva, or built-in plug-and-play models for domain-specific use cases.

Document Structure Preservation

This feature ensures that the original formatting, layout, and structure of documents are maintained during the translation process, preserving the document's visual integrity.

Improve Translation from Speech

Speech to text technology converts spoken language into written text, enabling audio content to be transcribed for translation or other purposes.

Translation Memory

Translation memory stores and retrieves previously translated segments to ensure consistency across documents and reduce translation time.
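
To make the idea concrete, here is a minimal sketch of a translation memory with exact and fuzzy lookup. It is an illustration only, not Anuvaad's implementation: the `TranslationMemory` class, its method names, and the 0.85 similarity threshold are all assumptions, and the fuzzy match uses Python's standard `difflib` rather than any Anuvaad component.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Hypothetical in-memory store of previously translated segments."""

    def __init__(self, threshold=0.85):
        # Minimum similarity (0..1) for a fuzzy match to be reused (assumed value).
        self.threshold = threshold
        self._segments = {}

    @staticmethod
    def _normalize(text):
        # Lowercase and collapse whitespace so trivial differences still match.
        return " ".join(text.lower().split())

    def add(self, source, target):
        """Store a source segment together with its approved translation."""
        self._segments[self._normalize(source)] = target

    def lookup(self, source):
        """Return (translation, score); translation is None below the threshold."""
        key = self._normalize(source)
        if key in self._segments:
            return self._segments[key], 1.0
        best_target, best_score = None, 0.0
        for stored, target in self._segments.items():
            score = SequenceMatcher(None, key, stored).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_score >= self.threshold:
            return best_target, best_score
        return None, best_score


tm = TranslationMemory()
tm.add("The appeal is dismissed.", "अपील खारिज की जाती है।")
exact_hit = tm.lookup("the appeal is   dismissed.")   # differs only in case/spacing
fuzzy_hit = tm.lookup("The appeal is dismised.")      # typo in the new segment
no_hit = tm.lookup("Costs are awarded to the petitioner.")
```

A linear scan over all stored segments is fine for a sketch; a store of realistic size would need an index for the fuzzy search.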

Glossary Support

Glossary support provides access to defined terminology and specialised vocabulary, ensuring consistent and precise translations, particularly in domain-specific fields.
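
One way to picture glossary support is as a check that every glossary term found in a source sentence appears with its approved rendering in the translation. The sketch below assumes a plain dictionary glossary and simple substring matching on the target side; the function names and sample entries are hypothetical and do not reflect Anuvaad's actual data model.

```python
import re


def find_glossary_terms(source, glossary):
    """Return the glossary entries whose source term occurs in the sentence."""
    hits = {}
    for term, translation in glossary.items():
        # Whole-word, case-insensitive match on the source side.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, source, flags=re.IGNORECASE):
            hits[term] = translation
    return hits


def missing_terms(source, translated, glossary):
    """List source terms whose approved translation is absent from the output."""
    return [
        term
        for term, expected in find_glossary_terms(source, glossary).items()
        if expected not in translated
    ]


# Hypothetical English-to-Hindi legal glossary.
glossary = {"writ petition": "रिट याचिका", "respondent": "प्रतिवादी"}
source = "The writ petition was filed against the respondent."

ok = missing_terms(source, "रिट याचिका प्रतिवादी के विरुद्ध दायर की गई थी।", glossary)
flagged = missing_terms(source, "रिट याचिका उत्तरदाता के विरुद्ध दायर की गई थी।", glossary)
```

Here `ok` is empty because both terms use the approved rendering, while `flagged` reports `respondent` because the translation used a different word for it.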

Usage Analytics and Metrics

Usage analytics and metrics offer insights into how the platform is utilised, helping users track and optimise translation processes and workflows.

File Format Conversion

File format conversion simplifies the process of converting documents from one file format to another while preserving their content and structure, enhancing compatibility.

Transliteration Support

Transliteration support enables the conversion of text from one script or alphabet to another, aiding users in dealing with different writing systems and ensuring the correct pronunciation of words, especially in multilingual contexts.
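
As a rough illustration of script conversion, the sketch below relies on the fact that the major Indic scripts occupy parallel 128-code-point blocks in Unicode, so many characters can be mapped by shifting their code point from one block to another. This is a deliberately naive approach and an assumption of this example, not how Anuvaad performs transliteration: some positions are unassigned in some scripts, and it does not handle romanization or script-specific spelling rules. The `convert_script` function and the `SCRIPT_BASE` table are hypothetical names.

```python
# Start of each script's Unicode block; every block is 0x80 code points wide.
SCRIPT_BASE = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80
# Danda and double danda (U+0964, U+0965) are shared across scripts; keep them.
SHARED_OFFSETS = {0x64, 0x65}


def convert_script(text, source, target):
    """Shift characters from the source script block into the target block."""
    src, dst = SCRIPT_BASE[source], SCRIPT_BASE[target]
    out = []
    for ch in text:
        offset = ord(ch) - src
        if 0 <= offset < BLOCK_SIZE and offset not in SHARED_OFFSETS:
            out.append(chr(dst + offset))
        else:
            # Characters outside the source block (spaces, Latin, etc.) pass through.
            out.append(ch)
    return "".join(out)


gujarati = convert_script("नमस्ते", "devanagari", "gujarati")
round_trip = convert_script(gujarati, "gujarati", "devanagari")
```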