18335845. PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS (Oracle International Corporation)
PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS
Organization Name
Oracle International Corporation
Inventor(s)
Vikram Majjiga Reddy of Cupertino CA (US)
This abstract first appeared for US patent application 18335845, titled 'PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS'.
Original Abstract Submitted
Techniques for predicting a missing value in an image-type document are disclosed. A system predicts the identity of a supplier associated with an image-type document in which the supplier's identity may not be extracted by text recognition. When a system determines that the supplier identity cannot be identified using a text recognition application, the system generates a set of machine learning model input features from features extracted from the image-type document to predict the supplier's identity. One input feature is a data file bounds feature indicating whether the image-type document is a scanned document or a non-scanned document. The system predicts a value for the supplier's identity based on the data file bounds value and additional feature values, including color channel characteristics and spatial characteristics of regions-of-interest. The system generates a mapping of values to defined attributes based in part on the predicted value for the supplier's identity.
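The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which model is used or how features are computed, so a simple nearest-centroid classifier stands in for the machine learning model. All names here (ImageDocument, Region, build_features, NearestCentroidModel, resolve_supplier) are hypothetical. The scanned/non-scanned flag, the per-channel color means, and the region-of-interest boxes are assumed to be supplied by an earlier extraction step.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Optional, Tuple


@dataclass
class Region:
    # Region-of-interest bounding box in normalized [0, 1] page coordinates (assumed).
    x: float
    y: float
    width: float
    height: float


@dataclass
class ImageDocument:
    is_scanned: bool                       # basis for the data file bounds feature
    mean_rgb: Tuple[float, float, float]   # per-channel mean intensity (assumed feature)
    regions: List[Region]
    ocr_supplier: Optional[str] = None     # None when text recognition finds no supplier


def build_features(doc: ImageDocument) -> List[float]:
    """Flatten one document into a fixed-length numeric feature vector."""
    bounds_flag = 1.0 if doc.is_scanned else 0.0
    if doc.regions:
        # Spatial summary: centre of the largest region-of-interest.
        largest = max(doc.regions, key=lambda r: r.width * r.height)
        cx = largest.x + largest.width / 2
        cy = largest.y + largest.height / 2
    else:
        cx, cy = 0.5, 0.5  # fall back to the page centre
    return [bounds_flag, *doc.mean_rgb, cx, cy]


class NearestCentroidModel:
    """Stand-in classifier: predicts the label whose mean feature vector is closest."""

    def __init__(self) -> None:
        self.centroids: Dict[str, List[float]] = {}

    def fit(self, vectors: List[List[float]], labels: List[str]) -> "NearestCentroidModel":
        grouped: Dict[str, List[List[float]]] = {}
        for vec, label in zip(vectors, labels):
            grouped.setdefault(label, []).append(vec)
        # Column-wise mean of each supplier's training vectors.
        self.centroids = {
            label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()
        }
        return self

    def predict(self, vector: List[float]) -> str:
        return min(self.centroids, key=lambda label: dist(self.centroids[label], vector))


def resolve_supplier(doc: ImageDocument, model: NearestCentroidModel) -> Dict[str, str]:
    """Map the supplier attribute to a value, predicting it only when OCR found none."""
    if doc.ocr_supplier is not None:
        return {"supplier": doc.ocr_supplier, "source": "ocr"}
    return {"supplier": model.predict(build_features(doc)), "source": "predicted"}
```

In this sketch the model is fitted on feature vectors from documents whose supplier is already known. resolve_supplier then falls back to the prediction only when ocr_supplier is None, mirroring the condition in the abstract that prediction is used when text recognition cannot identify the supplier.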