18074160. INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM simplified abstract (CANON KABUSHIKI KAISHA)

From WikiPatents

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM

Organization Name

CANON KABUSHIKI KAISHA

Inventor(s)

TOMOAKI Higo of Kanagawa (JP)

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18074160 titled 'INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM'.

Simplified Explanation

The disclosure describes a technique for generating a disclosable document image from a document image that contains confidential information, without using the confidential information itself. A scanned document image is separated into character information and background information, and named entity extraction is performed on both to identify the named entities in the document and their attributes. The named entities in the document image are then replaced with attribute tags and superimposable ranges are obtained, which together yield attribute tag document data. The extraction result and the attribute tag document data are registered in a database.

  • Technique for generating a disclosable document image without using confidential information
  • Document image is separated into character information and background information
  • Named entity extraction processing is performed to extract named entities and their attributes
  • Named entities in the document image are replaced with attribute tags
  • Extracted named entities and attribute tag document data are stored in a database
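The replacement step listed above can be illustrated with a minimal sketch that operates on the character information only. The abstract does not say how named entities are detected, so the regular-expression patterns below are a toy stand-in for the extraction unit, and the names `PATTERNS`, `NamedEntity`, `extract_entities`, and `replace_with_tags` are hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical attribute patterns. These are an assumption made for this
# sketch; the abstract does not specify the extraction method.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


@dataclass(frozen=True)
class NamedEntity:
    text: str       # the confidential string found in the document
    attribute: str  # the attribute name used as its tag
    start: int      # character offset where the entity begins
    end: int        # character offset just past the entity


def extract_entities(text):
    """Return all pattern matches as entities, ordered by position."""
    entities = []
    for attribute, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append(
                NamedEntity(match.group(), attribute, match.start(), match.end())
            )
    return sorted(entities, key=lambda e: e.start)


def replace_with_tags(text, entities):
    """Rebuild the text with each entity replaced by an <ATTRIBUTE> tag."""
    parts = []
    cursor = 0
    for entity in entities:
        if entity.start < cursor:
            continue  # skip an entity that overlaps one already replaced
        parts.append(text[cursor:entity.start])
        parts.append(f"<{entity.attribute}>")
        cursor = entity.end
    parts.append(text[cursor:])
    return "".join(parts)


if __name__ == "__main__":
    sample = "Contact alice@example.com by 2023-01-15 or call 555-123-4567."
    print(replace_with_tags(sample, extract_entities(sample)))
```

The tagged text keeps the document's layout and the kind of information at each position, while the confidential strings themselves stay in the separate entity list.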

Potential applications of this technology:

  • Sharing sensitive documents without revealing confidential information
  • Redacting confidential information from documents for public release
  • Protecting privacy by anonymizing personal information in documents

Problems solved by this technology:

  • Ensures confidentiality of sensitive information in document images
  • Facilitates sharing and distribution of documents without compromising privacy
  • Reduces the risk of accidental disclosure of confidential information

Benefits of this technology:

  • Enables efficient and secure sharing of documents containing confidential information
  • Simplifies the process of redacting sensitive information from documents
  • Enhances privacy protection by anonymizing personal data in documents


Original Abstract Submitted

The present disclosure relates to a technique of generating a disclosable document image based on a document image including confidential information, without using the confidential information. A document input unit obtains a document image scanned with a scanner, separates the document image into character information and background information, and then outputs them to an extraction unit. The extraction unit performs named entity extraction processing on the obtained character information and background information to extract named entities in the document and attributes thereof, and output an extraction result to a generation unit. The generation unit replaces the named entities in the document image with attribute tags and obtains superimposable ranges to generate attribute tag document data. A management unit registers the received extraction result of the named entities and the attribute tag document data in a database.
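The registration performed by the management unit might be sketched as follows, using Python's built-in `sqlite3` module. The schema, the function names (`open_store`, `register`, `load_disclosable`), and the representation of superimposable ranges as tag-plus-bounding-box records are all assumptions made for illustration; the abstract does not describe the database layout or the format of the ranges.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a database with one table per kind of registered data."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS tag_documents (
            doc_id      TEXT PRIMARY KEY,
            tagged_text TEXT NOT NULL,   -- text with attribute tags
            ranges_json TEXT NOT NULL    -- assumed range records, as JSON
        );
        CREATE TABLE IF NOT EXISTS named_entities (
            doc_id    TEXT NOT NULL,
            attribute TEXT NOT NULL,
            value     TEXT NOT NULL      -- the confidential string
        );
        """
    )
    return conn


def register(conn, doc_id, tagged_text, ranges, entities):
    """Store the attribute tag document data and the extraction result."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO tag_documents VALUES (?, ?, ?)",
            (doc_id, tagged_text, json.dumps(ranges)),
        )
        conn.executemany(
            "INSERT INTO named_entities VALUES (?, ?, ?)",
            [(doc_id, attribute, value) for attribute, value in entities],
        )


def load_disclosable(conn, doc_id):
    """Read back only the tag document data, never the entity values."""
    row = conn.execute(
        "SELECT tagged_text, ranges_json FROM tag_documents WHERE doc_id = ?",
        (doc_id,),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])
```

Keeping the confidential values in their own table means a later step that builds the disclosable image can query `tag_documents` alone, which matches the stated goal of not using the confidential information.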