Machine Learning Based Document Visual Element Extraction

Organization Name

GOOGLE LLC

Inventor(s)

Nikolay Glushnev of Woodinville WA (US)

Qingze Wang of San Jose CA (US)

Emmanouil Koukoumidis of Kirkland WA (US)

Henry Wahyudi Setiawan of Bellevue WA (US)

Lauro Ivo Beltrao Colaco Costa of Kirkland WA (US)

Vincent Perot of Brooklyn NY (US)

Machine Learning Based Document Visual Element Extraction - A simplified explanation of the abstract

This abstract first appeared for US patent application 17808293, titled 'Machine Learning Based Document Visual Element Extraction'.

Simplified Explanation

Abstract Explanation

The patent application describes a method for extracting structured data from a document that contains both textual fields and a visual element. For each textual field, the method determines an offset that locates the field relative to the other textual fields. A machine learning vision model detects the visual element, and the method determines a corresponding offset that locates the visual element relative to the textual fields. The visual element is assigned an anchor token, which is inserted among the textual fields in an order determined by these offsets. A text-based extraction model then extracts, from the resulting sequence, structured entities representing both the textual fields and the visual element.

The method analyzes a document with both text and visual elements.
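It determines an offset for each textual field, locating the field relative to the other textual fields in the document.
It detects the visual element with a machine learning vision model and determines its offset relative to the textual fields.
It assigns a visual element anchor token to the visual element.
The visual element anchor token is inserted among the textual fields in an order based on these offsets.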
The method extracts structured entities representing the textual fields and the visual element using a text-based extraction model.
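
The anchor-insertion step can be illustrated with a minimal Python sketch. The abstract does not say how offsets are represented, so this assumes each offset is a single integer position in reading order. The `Element` class, the `insert_anchor` function, the `<VISUAL_0>` token format, and the sample field texts are all hypothetical names and values chosen for illustration, not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Element:
    text: str    # field text, or the anchor token standing in for a visual element
    offset: int  # assumed representation: position in reading order (smaller = earlier)


def insert_anchor(fields, visual_offset, anchor_token="<VISUAL_0>"):
    """Return the field texts with the anchor token placed according to its offset."""
    elements = list(fields) + [Element(anchor_token, visual_offset)]
    # sorted() is stable, so fields that share an offset keep their input order.
    return [e.text for e in sorted(elements, key=lambda e: e.offset)]


# Made-up sample document: three textual fields and one detected visual element.
fields = [
    Element("Invoice #1234", 0),
    Element("Approved by:", 10),
    Element("Date: 2023-01-05", 30),
]
sequence = insert_anchor(fields, visual_offset=20)
print(sequence)
```

In this sketch the token lands between "Approved by:" and the date field, so a downstream text-only model sees the visual element at the position it occupies in the document.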

Potential Applications

Document analysis and organization
Data extraction from documents with mixed text and visual elements
Content management systems
Information retrieval and indexing

Problems Solved

Efficiently analyzing documents with both text and visual elements
Accurately determining the location of textual fields and visual elements within a document
Extracting structured entities from documents with mixed content

Benefits

Improved document analysis and organization
Enhanced data extraction capabilities
Streamlined content management processes
More efficient information retrieval and indexing

Original Abstract Submitted

A method includes obtaining a document with textual fields and a visual element. For each textual field, the method includes determining a textual offset for the textual field that indicates a location of the textual field relative to each other textual field in the document. The method includes detecting, using a machine learning vision model, the visual element and determining a visual element offset indicating a location of the visual element relative to each textual field in the document. The method includes assigning the visual element a visual element anchor token and inserting the visual element anchor token into the textual fields in an order based on the visual element offset and the respective textual offsets. The method also includes, after inserting the visual element anchor token, extracting, using a text-based extraction model, from the textual fields, structured entities representing the series of textual fields and the visual element.
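
To make the final extraction step concrete, the sketch below runs a toy extractor over a token sequence that already contains an anchor token. The abstract names a text-based extraction model without describing it, so a pair of regular expressions is swapped in here purely as a stand-in. The `visual_elements` registry, the `extract_entities` function, the entity names, and the sample values are hypothetical.

```python
import re

# Hypothetical registry mapping each anchor token to what a vision model detected.
visual_elements = {
    "<VISUAL_0>": {"type": "signature", "bbox": (120, 540, 260, 580)},
}


def extract_entities(tokens, visual_elements):
    """Toy regex stand-in for a text-based extraction model."""
    text = " ".join(tokens)
    entities = {}

    match = re.search(r"Invoice #(\d+)", text)
    if match:
        entities["invoice_number"] = match.group(1)

    # An anchor token directly after "Approved by:" is treated as the approval mark,
    # and is resolved back to the visual element it stands for.
    match = re.search(r"Approved by:\s*(<VISUAL_\d+>)", text)
    if match and match.group(1) in visual_elements:
        entities["approval"] = visual_elements[match.group(1)]

    return entities


tokens = ["Invoice #1234", "Approved by:", "<VISUAL_0>", "Date: 2023-01-05"]
entities = extract_entities(tokens, visual_elements)
print(entities)
```

The point of the sketch is the round trip: the extractor only ever reads text, yet the resulting entity refers to the visual element because the anchor token can be looked up after extraction.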

