International Business Machines Corporation (20240096121). TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION simplified abstract
TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION
Organization Name
International Business Machines Corporation
Inventor(s)
Yi Chen Zhong of Shanghai (CN)
TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240096121, titled 'TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION'.
Simplified Explanation
The abstract describes a computer program product, system, and method for training and using a vector encoder to determine vectors for sub-images of text in an image that is to be subjected to optical character recognition (OCR). The vector encoder is trained to encode images representing text into vectors in a vector space, such that vectors of images representing similar text have a high degree of cohesion and vectors of images representing dissimilar text have a low degree of cohesion. An input image is processed to determine sub-images that bound the text it contains, and these sub-images are input to the vector encoder to produce sub-image vectors. The vector encoder also generates a search vector for search text. OCR is then applied to at least one region of the input image that includes the sub-images whose vectors match the search vector.
- A trained vector encoder encodes images of text into vectors in a vector space.
- Vectors of images representing similar text have high cohesion in the vector space.
- Vectors of images representing dissimilar text have low cohesion in the vector space.
- The input image is processed to determine sub-images that bound text.
- The sub-images are input to the vector encoder, which outputs sub-image vectors.
- The vector encoder generates a search vector for the search text.
- Optical character recognition is applied to at least one region of the input image that includes sub-images whose vectors match the search vector.
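The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not say how a "match" is decided, so cosine similarity with a fixed threshold stands in for it here. The `SubImage` type, the function names, the threshold value, and the toy three-dimensional vectors are all hypothetical, and the encoder and the OCR engine are left out entirely.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class SubImage:
    """A text-bounding sub-image and its (hypothetical) encoder output."""
    box: Box
    vector: List[float]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_sub_images(sub_images: List[SubImage],
                        search_vector: List[float],
                        threshold: float = 0.9) -> List[SubImage]:
    """Keep sub-images whose vectors are close enough to the search vector.

    Assumption: "matching" means cosine similarity >= threshold.
    """
    return [s for s in sub_images
            if cosine_similarity(s.vector, search_vector) >= threshold]


def bounding_region(matches: List[SubImage]) -> Optional[Box]:
    """Smallest box enclosing all matches, or None when nothing matched."""
    if not matches:
        return None
    lefts, tops, rights, bottoms = zip(*(m.box for m in matches))
    return (min(lefts), min(tops), max(rights), max(bottoms))


# Toy data standing in for encoder outputs.
sub_images = [
    SubImage((10, 10, 90, 30), [0.9, 0.1, 0.0]),
    SubImage((10, 40, 120, 60), [0.0, 1.0, 0.0]),
    SubImage((200, 10, 280, 30), [0.8, 0.2, 0.1]),
]
search_vector = [1.0, 0.0, 0.0]

matches = matching_sub_images(sub_images, search_vector)
region = bounding_region(matches)  # region that would be handed to an OCR engine
```

In this sketch only the selected region would be passed on to OCR. Merging all matches into a single enclosing box is one simple design choice; running OCR on each matching sub-image separately would be equally consistent with the abstract's "at least one region".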
Potential Applications
This technology can be applied in document scanning, text recognition in images, and automated data extraction from images.
Problems Solved
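This technology addresses the problem of locating and extracting specific text from images, especially where traditional OCR methods may struggle with variations in text appearance. Because OCR is applied to regions whose sub-image vectors match the search vector, the search text can be located by vector similarity before any characters are recognized.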
Benefits
Potential benefits of this technology include more accurate text recognition, faster data extraction from images, and greater automation of document processing tasks.
Potential Commercial Applications
Potential commercial applications of this technology include document management systems, automated data entry software, and image-based text search engines.
Possible Prior Art
One possible example of prior art is the use of convolutional neural networks for image recognition and text extraction. This technology differs in that it trains a vector encoder on images of text and applies OCR based on vector similarity.
Unanswered Questions
How does this technology compare to existing OCR methods in terms of accuracy and efficiency?
This article does not provide a direct comparison between this technology and existing OCR methods.
What are the potential limitations or challenges in implementing this technology on a large scale?
The article does not address the potential limitations or challenges in implementing this technology on a large scale.
Original Abstract Submitted
Provided are a computer program product, system, and method for training and using a vector encoder to determine vectors for sub-images of text in an image to subject to optical character recognition. A vector encoder is trained to encode images representing text into vectors in a vector space. Vectors of images representing similar text have a high degree of cohesion in the vector space. Vectors of images representing dissimilar text have a low degree of cohesion in the vector space. An input image is processed to determine sub-images of the input image that bound text represented in the input image. The sub-images are inputted to the vector encoder to output sub-image vectors. The vector encoder generates a search vector for search text. Optical character recognition is applied to at least one region of the input image including the sub-images having sub-image vectors matching the search vector.
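The abstract does not say which training objective produces this cohesion property. One widely used option for learning such embeddings is a pairwise contrastive loss, shown below purely as an illustration. It assumes that "cohesion" can be read as closeness in Euclidean distance; the function names and the `margin` parameter are hypothetical and do not come from the application.

```python
import math
from typing import List


def euclidean_distance(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(vec_a: List[float],
                     vec_b: List[float],
                     same_text: bool,
                     margin: float = 1.0) -> float:
    """Pairwise contrastive loss for two encoded text images.

    Pairs showing similar text are penalized by their squared distance,
    which pulls them together. Pairs showing dissimilar text are penalized
    only while they are closer than `margin`, which pushes them apart.
    """
    distance = euclidean_distance(vec_a, vec_b)
    if same_text:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Similar text, identical vectors: nothing to correct.
loss_similar_close = contrastive_loss([0.0, 0.0], [0.0, 0.0], same_text=True)

# Dissimilar text that is already far apart: no penalty.
loss_dissimilar_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_text=False)

# Dissimilar text that sits too close together: penalized.
loss_dissimilar_close = contrastive_loss([0.0, 0.0], [0.5, 0.0], same_text=False)
```

Minimizing a loss of this kind over many labeled pairs would tend to cluster vectors of similar text and separate vectors of dissimilar text, which is one way to obtain the behavior the abstract describes.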