18065352. DOCUMENT IMAGE TEMPLATE MATCHING simplified abstract (International Business Machines Corporation)

From WikiPatents

DOCUMENT IMAGE TEMPLATE MATCHING

Organization Name

International Business Machines Corporation

Inventor(s)

Ang Yi of Beijing (CN)

Jing Zhang of Beijing (CN)

Hai Cheng Wang of Beijing (CN)

Jun Hong Zhao of ShangDi (CN)

Rajesh M. Desai of San Jose CA (US)

Yang Zhong Li of Beijing (CN)

Ye Chen of Beijing (CN)

DOCUMENT IMAGE TEMPLATE MATCHING - A simplified explanation of the abstract

This abstract first appeared for US patent application 18065352 titled 'DOCUMENT IMAGE TEMPLATE MATCHING'.

The abstract describes a computer-implemented method in which program code merges a multi-page document into a single document image and processes that image to identify structural elements and textual content. The code compares those structural elements with the structural elements of a group of document templates stored in a database, which narrows the group to a subset of templates that share at least a threshold number of similarities with the image.

  • The program code generates a graph structure from the document image. The graph comprises visual information and connections related to the structural elements and to the concepts in the textual content.
  • It uses this graph structure to identify the document template that is the closest match to the document.
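
The two-stage matching described above (a threshold-based shortlist, then a graph comparison) can be illustrated with a minimal sketch. The abstract does not specify data structures or a similarity measure, so everything here is an assumption for illustration: the `Layout` class, the `shortlist` and `closest_template` functions, the use of shared-element counts for the threshold, and the use of Jaccard overlap between edge sets as the graph comparison.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Layout:
    """Hypothetical stand-in for what is extracted from a document image."""
    elements: set[str]  # structural elements, e.g. "table"
    # Graph connections, here (structural element, concept) pairs.
    edges: set[tuple[str, str]] = field(default_factory=set)


def shortlist(doc: Layout, templates: dict[str, Layout], min_shared: int) -> dict[str, Layout]:
    """Keep templates sharing at least `min_shared` structural elements with the document."""
    return {
        name: tpl
        for name, tpl in templates.items()
        if len(doc.elements & tpl.elements) >= min_shared
    }


def jaccard(a: set, b: set) -> float:
    """Overlap of two sets as |intersection| / |union| (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_template(doc: Layout, templates: dict[str, Layout], min_shared: int = 2) -> str | None:
    """Shortlist by structure, then pick the candidate whose graph edges overlap most."""
    candidates = shortlist(doc, templates, min_shared)
    if not candidates:
        return None
    return max(candidates, key=lambda name: jaccard(doc.edges, candidates[name].edges))


# Illustrative data only; none of these templates come from the patent application.
templates = {
    "invoice": Layout(
        {"header", "table", "footer"},
        {("header", "company name"), ("table", "line items"), ("footer", "total due")},
    ),
    "receipt": Layout(
        {"header", "table", "barcode"},
        {("header", "store name"), ("table", "line items"), ("barcode", "transaction id")},
    ),
    "letter": Layout(
        {"header", "signature"},
        {("header", "company name"), ("signature", "sender")},
    ),
}
doc = Layout(
    {"header", "table", "footer", "stamp"},
    {("header", "company name"), ("table", "line items"), ("footer", "total due"), ("stamp", "paid")},
)

best = closest_template(doc, templates)
```

With this sample data, "letter" shares only one structural element with the document and is dropped by the shortlist, while "invoice" and "receipt" go on to the graph comparison. A real system would likely use a richer graph representation and a learned or weighted similarity score rather than plain set overlap.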

Potential Applications: This technology could be used in document management systems to streamline the process of organizing and categorizing documents. It could also be applied in content analysis and information retrieval systems.

Problems Solved: This technology addresses the challenge of efficiently categorizing and matching documents in a large database. It helps automate the process of identifying the most relevant document template for a given document.

Benefits: The technology saves time and effort in document management by automating the process of matching documents to templates. It improves accuracy and efficiency in document organization and retrieval.

Commercial Applications: This technology could be valuable for companies that deal with large volumes of documents, such as legal firms, financial institutions, and government agencies. It could also be useful for software developers creating document management solutions.

Prior Art: Researchers and developers in the field of document analysis and information retrieval may have explored similar techniques for document categorization and template matching. It would be beneficial to review existing literature and patents in this area.

Frequently Updated Research: Researchers in the field of artificial intelligence and machine learning are constantly exploring new methods for document analysis and content recognition. Staying informed about the latest advancements in this field could provide valuable insights for further improving this technology.

Questions about the Technology:

  1. How does this technology compare to traditional methods of document categorization and template matching?
  2. What are the potential limitations or challenges of implementing this technology in real-world document management systems?


Original Abstract Submitted

Computer implemented methods, systems, and computer program products include program code executing on a processor(s) that merges a document comprising multiple pages into a single document image. The program code processes the single document image to identify structural elements and textual content. The program code compares the structural elements of the single document image to other structural elements of a group of document templates stored in a database to identify a subset of the group of documents templates with a threshold number of similarities to the single document image. The program code generates, from the single document image, a graph structure representing the document, where the graph structure comprises visual information and connections related to the structural elements and concepts comprising the textual content. The program code uses the structure to identify a document template that is a closest match to the document.
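
The first step in the abstract, merging a multi-page document into a single document image, can be sketched as follows. The abstract does not say how the pages are combined, so vertical stacking is an assumption, as are the `merge_pages` function name, the representation of each page as a list of grayscale pixel rows, and the white padding value.

```python
def merge_pages(pages, background=255):
    """Stack grayscale pages top to bottom into one image.

    Each page is a list of pixel rows. Narrower rows are padded on the
    right with `background` so every row of the result has equal width.
    """
    # Width of the widest row across all pages (0 if there are no rows).
    width = max((len(row) for page in pages for row in page), default=0)
    merged = []
    for page in pages:
        for row in page:
            merged.append(list(row) + [background] * (width - len(row)))
    return merged


# Two tiny illustrative pages: 0 is black, 255 is white.
page_one = [[0, 0, 0], [255, 0, 255]]
page_two = [[0, 255]]
merged = merge_pages([page_one, page_two])
```

Here `merged` has three rows of width three, with the single row of the second page padded by one white pixel. In practice an imaging library would handle this step, and pages might be scaled to a common width instead of padded.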