18533685. INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM simplified abstract (CANON KABUSHIKI KAISHA)

From WikiPatents

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

Organization Name

CANON KABUSHIKI KAISHA

Inventor(s)

KEN Achiwa of Kanagawa (JP)

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18533685 titled 'INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM'.

The patent application aims to accurately extract character strings corresponding to extraction-target items, even when the character string ranges of multiple items overlap in named entity recognition tasks.

  • A trained model extracts a character string for each of a plurality of items from an input document image
  • The extracted character string is output for each item
  • For any item with no extracted character string, the string is re-extracted from the character strings output by the first extraction

Potential Applications

  • Named entity recognition systems
  • Document analysis and information extraction tools

Problems Solved

  • Overlapping character string ranges in extraction tasks
  • Ensuring accuracy in named entity recognition
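
To illustrate why overlapping ranges are a problem, the following minimal sketch assumes a labeling scheme in which each character can carry at most one item label. The abstract does not say which scheme the application uses, so this is an assumption for illustration only; the item names, the sample text, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def find_span(text: str, substring: str) -> Span:
    """Locate a substring and return its half-open character range."""
    start = text.index(substring)
    return (start, start + len(substring))


def spans_overlap(a: Span, b: Span) -> bool:
    """True when two half-open ranges share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def tag_characters(text: str, spans: Dict[str, Span]) -> List[Optional[str]]:
    """Give each character at most one label; the first item to claim it keeps it."""
    labels: List[Optional[str]] = [None] * len(text)
    for item, (start, end) in spans.items():
        for i in range(start, end):
            if labels[i] is None:
                labels[i] = item
    return labels


def decode(text: str, labels: List[Optional[str]]) -> Dict[str, str]:
    """Rebuild item -> string from the per-character labels."""
    out: Dict[str, str] = {}
    for ch, label in zip(text, labels):
        if label is not None:
            out[label] = out.get(label, "") + ch
    return out


# Hypothetical document text with two items whose ranges overlap.
text = "Invoice from Acme Corp Tokyo Branch"
spans = {
    "company_name": find_span(text, "Acme Corp Tokyo Branch"),
    "branch_name": find_span(text, "Tokyo Branch"),
}

overlapping = spans_overlap(spans["company_name"], spans["branch_name"])
decoded = decode(text, tag_characters(text, spans))
```

Under this assumed scheme, `decoded` contains the company name but no entry for `branch_name`, because every character of the branch name was already claimed by the enclosing item.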

Benefits

  • Improved accuracy in extracting character strings
  • Enhanced efficiency in document analysis

Commercial Applications

Optimized Character String Extraction Technology for Document Analysis and Named Entity Recognition

Prior Art

Research on named entity recognition systems and document analysis tools could provide insights into similar technologies.

Frequently Updated Research

Stay updated on advancements in named entity recognition and document analysis technologies for potential improvements in character string extraction methods.

Questions about Character String Extraction Technology

1. How does the training model ensure accurate extraction of character strings for each item?
2. What are the key challenges in dealing with overlapping character string ranges in extraction tasks?


Original Abstract Submitted

To make it possible to extract a character string corresponding to each extraction-target item with accuracy even in a case where the character string ranges of a plurality of extraction-target items overlap one another in the task of named entity recognition. By using a training model trained to extract a character string corresponding to each of a plurality of items within a document, a character string corresponding to each of the plurality of items is extracted and output for an input document image. Then, a character string corresponding to an item among the plurality of items, for which a corresponding character string is not extracted, is re-extracted from the character string output by the first extracting.
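
The two-stage flow described in the abstract can be sketched roughly as follows. This is a control-flow illustration under stated assumptions, not the applicant's implementation: the document image is assumed to have already been converted to text, and `first_pass` and `second_pass` are hypothetical stand-ins (here, toy regular-expression functions) for whatever trained model and re-extraction step are actually used.

```python
import re
from typing import Callable, Dict, List

# Hypothetical extractor signature: (text, items) -> {item: extracted string}
Extractor = Callable[[str, List[str]], Dict[str, str]]


def extract_with_reextraction(
    document_text: str,
    items: List[str],
    first_pass: Extractor,
    second_pass: Extractor,
) -> Dict[str, str]:
    """Run a first extraction, then re-extract missing items from its outputs."""
    first_outputs = dict(first_pass(document_text, items))
    results = dict(first_outputs)

    # Items for which the first extraction produced no string.
    missing = [item for item in items if not results.get(item)]

    for item in missing:
        # Search only the strings output by the first extraction,
        # not the whole document.
        for source_text in first_outputs.values():
            found = second_pass(source_text, [item]).get(item)
            if found:
                results[item] = found
                break
    return results


def toy_first_pass(text: str, items: List[str]) -> Dict[str, str]:
    """Stand-in for the trained model: finds only the enclosing company name."""
    match = re.search(r"Acme Corp Tokyo Branch", text)
    if match and "company_name" in items:
        return {"company_name": match.group(0)}
    return {}


def toy_second_pass(text: str, items: List[str]) -> Dict[str, str]:
    """Stand-in for re-extraction: finds a branch name inside a given string."""
    match = re.search(r"\w+ Branch", text)
    if match and "branch_name" in items:
        return {"branch_name": match.group(0)}
    return {}


result = extract_with_reextraction(
    "Invoice from Acme Corp Tokyo Branch",
    ["company_name", "branch_name"],
    toy_first_pass,
    toy_second_pass,
)
```

In this toy run the first pass yields only `company_name`, and `branch_name` is then recovered from that already-extracted string, which mirrors the case where one item's range lies inside another's.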