18792282. DATA EXTRACTION FROM PRINTED DOCUMENTS (Royal Bank of Canada)
Organization Name
Royal Bank of Canada
Inventor(s)
Kashish Trusharkumar Mistry of Toronto (CA)
This abstract first appeared for US patent application 18792282, titled 'DATA EXTRACTION FROM PRINTED DOCUMENTS'.
Original Abstract Submitted
A computer-implemented method for extracting data from printed documents comprises receiving a printed document and identifying the printed document as one of a structured form (including fully structured and semi-structured) and an unstructured form. Where the printed document is identified as a structured form, the method identifies first text features corresponding to keys for key-value pairs and identifies second text features that satisfy a proximity threshold (and optionally one or more key constraints) relative to the respective first text feature as the respective values of the respective key-value pairs, and records the values of the key-value pairs. Where the printed document is identified as a semi-structured form, the method may further comprise identifying at least one unstructured portion of the printed document and applying a trained machine learning model to the unstructured portion of the printed document to obtain additional values for additional key-value pairs.
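The structured-form step described in the abstract can be illustrated with a minimal sketch. This is not the applicant's implementation. It assumes that text features arrive as strings with the centre coordinates of their bounding boxes (for example from an OCR stage), that proximity is measured as Euclidean distance between centres, and that the nearest qualifying feature is chosen as the value. The abstract specifies none of these details. All names here (TextFeature, extract_key_values, known_keys, key_constraints) are hypothetical, and the trained machine learning step for unstructured portions is left out.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, Optional


@dataclass(frozen=True)
class TextFeature:
    """A piece of recognised text and the centre of its bounding box (hypothetical type)."""
    text: str
    x: float
    y: float


def extract_key_values(
    features: list[TextFeature],
    known_keys: set[str],
    proximity_threshold: float,
    key_constraints: Optional[dict[str, Callable[[str], bool]]] = None,
) -> dict[str, str]:
    """Pair each key with the nearest non-key feature inside the threshold.

    Assumptions (not from the abstract): keys are matched by lowercase text,
    distance is Euclidean between centres, and the closest candidate wins.
    """
    key_constraints = key_constraints or {}
    normalized_keys = {k.lower() for k in known_keys}

    # First text features: those whose text matches a known key.
    key_features = [f for f in features if f.text.lower() in normalized_keys]
    # Everything else is a candidate value (second text features).
    candidates = [f for f in features if f.text.lower() not in normalized_keys]

    values: dict[str, str] = {}
    for key in key_features:
        # Optional per-key constraint; accept any text if none is given.
        constraint = key_constraints.get(key.text.lower(), lambda text: True)
        in_range = [
            (distance, c)
            for c in candidates
            if constraint(c.text)
            and (distance := hypot(c.x - key.x, c.y - key.y)) <= proximity_threshold
        ]
        if in_range:
            _, nearest = min(in_range, key=lambda pair: pair[0])
            values[key.text] = nearest.text
    return values


# Usage with made-up coordinates in arbitrary page units.
sample = [
    TextFeature("Name", 10, 10),
    TextFeature("Jane Doe", 60, 10),
    TextFeature("Date", 10, 30),
    TextFeature("2024-01-15", 60, 30),
    TextFeature("Footer note", 10, 300),
]
result = extract_key_values(
    sample,
    known_keys={"name", "date"},
    proximity_threshold=80.0,
    key_constraints={"date": lambda text: text[:1].isdigit()},
)
print(result)
```

The constraint on "date" shows how a key constraint can exclude a feature that is within range but has the wrong shape. A production system would likely also weigh direction (values tend to sit to the right of or below their keys), which this sketch ignores.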