US Patent Application 18313252. INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM FOR EXTRACTING A NAMED ENTITY FROM A DOCUMENT simplified abstract
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM FOR EXTRACTING A NAMED ENTITY FROM A DOCUMENT
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM FOR EXTRACTING A NAMED ENTITY FROM A DOCUMENT - A simplified explanation of the abstract
This abstract first appeared for US patent application 18313252, titled 'INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM FOR EXTRACTING A NAMED ENTITY FROM A DOCUMENT'.
Simplified Explanation
The patent application describes an information processing apparatus that converts text data, obtained from a document image read from a document, into a token string. From that token string it calculates the number of times processing must be performed in a natural language processing model. It then divides the token string into blocks based on that number, so that adjacent blocks overlap at least in part. Each token in an overlap portion receives an estimation result from each block that contains it, and the apparatus selects one of those results for the token.
- Converts text data from a document image, read from a document, into a token string.
- Calculates, from the token string, the number of times processing must be performed in the natural language processing model.
- Divides the token string into blocks, based on that number, so that adjacent blocks overlap at least in part.
- For each token in an overlap portion, selects one of the estimation results obtained from the blocks that contain it.
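The steps above can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the abstract does not say how the number of processing times is computed or how one estimation result is chosen in an overlap. The function names, the `max_len` and `min_overlap` parameters, and the rule of preferring the block in which a token lies farther from the block edge are all assumptions made here for illustration.

```python
import math
from typing import List, Sequence, Tuple


def count_passes(n_tokens: int, max_len: int, min_overlap: int) -> int:
    """Number of model runs needed to cover n_tokens (assumed formula)."""
    if not 0 <= min_overlap < max_len:
        raise ValueError("need 0 <= min_overlap < max_len")
    if n_tokens <= max_len:
        return 1
    stride = max_len - min_overlap  # new tokens covered by each extra block
    return math.ceil((n_tokens - max_len) / stride) + 1


def split_with_overlap(n_tokens: int, max_len: int,
                       min_overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) token ranges; adjacent ranges overlap."""
    passes = count_passes(n_tokens, max_len, min_overlap)
    if passes == 1:
        return [(0, n_tokens)]
    last_start = n_tokens - max_len
    # Spread block starts evenly so the last block ends at the final token.
    starts = [(i * last_start) // (passes - 1) for i in range(passes)]
    return [(s, s + max_len) for s in starts]


def merge_predictions(blocks: Sequence[Tuple[int, int]],
                      block_labels: Sequence[Sequence[str]],
                      n_tokens: int) -> List[str]:
    """Pick one label per token.

    Assumed selection rule: in an overlap, keep the label from the block
    where the token is farther from the nearest block edge. Ties go to
    the earlier block.
    """
    merged: List[str] = [""] * n_tokens
    best_margin = [-1] * n_tokens
    for (start, end), labels in zip(blocks, block_labels):
        for offset, label in enumerate(labels):
            pos = start + offset
            margin = min(offset, end - 1 - pos)
            if margin > best_margin[pos]:
                merged[pos] = label
                best_margin[pos] = margin
    return merged


# Usage: 10 tokens, blocks of 6 tokens, at least 2 tokens of overlap.
blocks = split_with_overlap(10, max_len=6, min_overlap=2)
# Stand-in for per-block model output: block 0 says "A", block 1 says "B".
labels = [["A"] * 6, ["B"] * 6]
merged = merge_predictions(blocks, labels, 10)
```

Here the two blocks share tokens 4 and 5. Under the assumed rule, token 4 keeps the label from the first block and token 5 keeps the label from the second. A confidence-based rule, keeping the result with the higher model score, would be an equally plausible reading of "selects one of estimation results".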
Original Abstract Submitted
An information processing apparatus for converting text data from a document image read from a document into a token string and calculates a number of processing times necessary for performing processing in a natural language processing model based on the token string. Then, at the time of division, the information processing apparatus divides the token string into blocks so that at least a portion overlaps between adjacent blocks based on the calculated number of processing times and for each token belonging to the overlap portion between the adjacent blocks, selects one of estimation results obtained from each block.