18335845. PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS (Oracle International Corporation)

From WikiPatents
Revision as of 07:31, 19 December 2024 by Wikipatents

PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS

Organization Name

Oracle International Corporation

Inventor(s)

Vikram Majjiga Reddy of Cupertino CA (US)


This abstract first appeared for US patent application 18335845 titled 'PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS'.



Original Abstract Submitted

Techniques for predicting a missing value in an image-type document are disclosed. A system predicts the identity of a supplier associated with an image-type document in which the supplier's identity may not be extracted by text recognition. When a system determines that the supplier identity cannot be identified using a text recognition application, the system generates a set of machine learning model input features from features extracted from the image-type document to predict the supplier's identity. One input feature is a data file bounds feature indicating whether the image-type document is a scanned document or a non-scanned document. The system predicts a value for the supplier's identity based on the data file bounds value and additional feature values, including color channel characteristics and spatial characteristics of regions-of-interest. The system generates a mapping of values to defined attributes based in part on the predicted value for the supplier's identity.
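
The flow described in the abstract (build input features from the image, then predict the supplier) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the application: the `DocumentImage` container, the `is_scanned` flag standing in for the "data file bounds" feature, the per-channel means used as color channel characteristics, the averaged region centers and areas used as spatial characteristics, and the nearest-centroid classifier used in place of whatever model the application actually claims.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]              # (R, G, B), each 0-255
Box = Tuple[float, float, float, float]   # (x, y, width, height), normalized to 0-1


@dataclass
class DocumentImage:
    """Hypothetical container for one image-type document."""
    pixels: List[Pixel]   # flattened pixel list
    regions: List[Box]    # regions-of-interest, e.g. logo or header blocks
    is_scanned: bool      # assumed stand-in for the "data file bounds" feature


def build_features(doc: DocumentImage) -> List[float]:
    """Turn a document into a fixed-length numeric feature vector."""
    n = max(len(doc.pixels), 1)  # avoid division by zero on empty images
    # Color channel characteristics: mean of each channel, scaled to 0-1.
    channel_means = [sum(p[c] for p in doc.pixels) / (255.0 * n) for c in range(3)]
    # Spatial characteristics: mean center and mean area of the regions.
    if doc.regions:
        m = len(doc.regions)
        cx = sum(x + w / 2 for x, y, w, h in doc.regions) / m
        cy = sum(y + h / 2 for x, y, w, h in doc.regions) / m
        area = sum(w * h for x, y, w, h in doc.regions) / m
    else:
        cx = cy = area = 0.0
    return [1.0 if doc.is_scanned else 0.0, *channel_means, cx, cy, area]


def fit_centroids(labelled: List[Tuple[str, DocumentImage]]) -> Dict[str, List[float]]:
    """Average the feature vectors of documents with a known supplier."""
    grouped: Dict[str, List[List[float]]] = {}
    for supplier, doc in labelled:
        grouped.setdefault(supplier, []).append(build_features(doc))
    return {
        supplier: [sum(col) / len(rows) for col in zip(*rows)]
        for supplier, rows in grouped.items()
    }


def predict_supplier(doc: DocumentImage, centroids: Dict[str, List[float]]) -> str:
    """Return the supplier whose centroid is closest to the document's features."""
    features = build_features(doc)
    return min(centroids, key=lambda name: dist(features, centroids[name]))
```

In use, `fit_centroids` would be called once on documents whose supplier is already known, and `predict_supplier` only when text recognition fails to yield a supplier name. The predicted name could then select a supplier-specific template for mapping extracted values to defined attributes, which this sketch leaves out.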