18071371. LANGUAGE-AGNOSTIC OCR EXTRACTION simplified abstract (Microsoft Technology Licensing, LLC)

From WikiPatents

LANGUAGE-AGNOSTIC OCR EXTRACTION

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Osaid Rehman Nasir of New Delhi (IN)

Bharat Kumar Jain of Hyderabad (IN)

Smitkumar Narotambhai Marvaniya of Bangalore (IN)

LANGUAGE-AGNOSTIC OCR EXTRACTION - A simplified explanation of the abstract

This abstract first appeared for US patent application 18071371, titled 'LANGUAGE-AGNOSTIC OCR EXTRACTION'.

Simplified Explanation

The abstract describes a language-agnostic approach to OCR text extraction. Word regions found in an image are converted to embeddings by a machine learning model, and those embeddings are matched against a multilingual index to recover the corresponding text. The steps are:

  • Identifying word regions in images using OCR
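  • Applying a language-agnostic machine learning model, trained on image-text pairs and multilingual text translation pairs, to each word region to obtain a word region embedding
  • Searching a multilingual index for a text embedding that matches the word region embedding, and retrieving the text associated with it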
  • Outputting text or text embeddings to downstream processes
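
The index-search step can be illustrated with a minimal sketch. It is not the method claimed in the application: the abstract does not say how embeddings are compared or what the index looks like, so cosine similarity, the `min_score` threshold, the `IndexEntry` type, and the three-dimensional toy vectors below are all assumptions made for illustration. The word region embedding is taken as given, since the model that produces it is not described in enough detail to reproduce.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexEntry:
    """One entry of a hypothetical multilingual index: text plus its embedding."""
    text: str
    embedding: List[float]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_index(
    index: List[IndexEntry],
    region_embedding: List[float],
    min_score: float = 0.8,  # assumed threshold, not taken from the abstract
) -> Optional[IndexEntry]:
    """Return the entry whose text embedding best matches the word region
    embedding, or None if the index is empty or no entry scores high enough."""
    best = None
    best_score = min_score
    for entry in index:
        score = cosine_similarity(entry.embedding, region_embedding)
        if score >= best_score:
            best, best_score = entry, score
    return best


# Toy index with made-up vectors; a real index would hold model-produced
# embeddings for words in many languages.
TOY_INDEX = [
    IndexEntry("invoice", [0.9, 0.1, 0.0]),
    IndexEntry("total", [0.1, 0.9, 0.0]),
    IndexEntry("fecha", [0.0, 0.1, 0.9]),
]

match = search_index(TOY_INDEX, [0.85, 0.2, 0.05])
print(match.text if match else "no match")
```

The returned entry carries both the text and the text embedding, so a caller can pass either one (or both) to a downstream process, as the final step in the list describes. At realistic index sizes the linear scan would normally be replaced by an approximate nearest-neighbor search.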

Potential Applications

This technology can be applied in various fields such as document processing, image recognition, and multilingual text analysis.

Problems Solved

This technology addresses the problem of extracting text from images in a language-agnostic manner, making it easier to process multilingual content.

Benefits

The technology is intended to improve the accuracy of OCR extraction, especially for multilingual documents, and to make text extraction more efficient.

Potential Commercial Applications

Potential commercial applications include document management systems, translation services, and image processing software.

Possible Prior Art

Prior art in this field may include existing OCR technologies, machine learning models for text recognition, and multilingual text processing systems.

Unanswered Questions

How does this technology handle handwritten text recognition?

The abstract does not mention handwritten text, so the technology may not be optimized for it. Handwritten text recognition would likely require additional training data and model adjustments.

Can this technology be integrated with existing OCR software?

Integrating this technology with existing OCR software may require compatibility testing and adjustments to ensure seamless operation.


Original Abstract Submitted

Technologies for language agnostic OCR extraction include identifying a word region of an image using optical character recognition, applying a language agnostic machine learning model to the word region, where the language agnostic machine learning model is trained on training data including a set of image-text pairs and a set of multilingual text translation pairs, receiving, from the language agnostic machine learning model, a word region embedding that is associated with the word region, searching a multilingual index for a text embedding that matches the word region embedding, receiving, from the multilingual index, text associated with the text embedding; and outputting at least one of the text or the text embedding to at least one downstream process, application, system, component, or network.