17932639. TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION simplified abstract (International Business Machines Corporation)
TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION
Organization Name
International Business Machines Corporation
Inventor(s)
Yi Chen Zhong of Shanghai (CN)
TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 17932639 titled 'TRAINING AND USING A VECTOR ENCODER TO DETERMINE VECTORS FOR SUB-IMAGES OF TEXT IN AN IMAGE SUBJECT TO OPTICAL CHARACTER RECOGNITION'.
Simplified Explanation
The abstract describes a computer program product, system, and method for training and using a vector encoder to determine vectors for sub-images of text in an image that is subject to optical character recognition (OCR). The vector encoder is trained to encode images representing text into vectors in a vector space, such that vectors of images representing similar text have a high degree of cohesion and vectors of images representing dissimilar text have a low degree of cohesion. An input image is processed to determine sub-images that bound the text it contains, and the vector encoder outputs a sub-image vector for each sub-image. The vector encoder also generates a search vector for search text, and optical character recognition is applied to at least one region of the input image that includes the sub-images whose vectors match the search vector.
- Vector encoder trained to encode text images into vectors in a vector space
- Sub-images of input image with text identified and processed to generate sub-image vectors
- Search vector created for search text
- Optical character recognition applied to regions of input image with matching sub-image vectors
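The matching step summarized above can be illustrated with a minimal sketch. The abstract does not say how a "match" between a sub-image vector and the search vector is decided, so cosine similarity with a fixed threshold is an assumption here, and the vectors, the function names, and the 0.9 threshold are hypothetical stand-ins for real encoder outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def matching_sub_images(sub_image_vectors, search_vector, threshold=0.9):
    """Return indices of sub-images whose vector is close to the search vector.

    The threshold is an illustrative assumption, not a value from the abstract.
    """
    return [
        i
        for i, vector in enumerate(sub_image_vectors)
        if cosine_similarity(vector, search_vector) >= threshold
    ]


# Hypothetical encoder outputs for three sub-images of an input image.
sub_image_vectors = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.8, 0.2, 0.1],
]
# Hypothetical encoder output for the search text.
search_vector = [1.0, 0.1, 0.0]

matches = matching_sub_images(sub_image_vectors, search_vector)
print(matches)  # indices of the sub-images that would be passed on to OCR
```

In this sketch only the sub-images returned by `matching_sub_images` would be considered for optical character recognition; the remaining sub-images are skipped.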
Potential Applications
This technology can be applied in:
- Document scanning and digitization
- Image search engines
- Text extraction from images
Problems Solved
This technology helps in:
- Efficiently identifying and extracting text from images
- Improving accuracy of optical character recognition
- Enhancing search capabilities for text within images
Benefits
The benefits of this technology include:
- Faster and more accurate text extraction from images
- Improved search functionality for text within images
- Enhanced document digitization processes
Potential Commercial Applications
This technology can be utilized in various commercial applications such as:
- Document management systems
- Image editing software
- Automated data entry systems
Possible Prior Art
Possible prior art for this technology includes the use of neural networks for image recognition and text extraction.
What are the limitations of the technology described in the patent application?
The limitations of the technology described in the patent application include:
- Dependency on the accuracy of the vector encoder for text image encoding
- Sensitivity to variations in text fonts and styles
How does this technology compare to existing optical character recognition systems?
According to the abstract, this technology differs from optical character recognition applied to an entire image in that a vector encoder first determines vectors for sub-images of text, and optical character recognition is then applied to the regions of the image containing sub-images whose vectors match a search vector.
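As an illustration of applying optical character recognition to a region rather than to a whole image, the sketch below merges the bounding boxes of matching sub-images into one region. The abstract does not specify how the region is formed, so taking the smallest enclosing rectangle is an assumption, and the box coordinates, the matched indices, and the `union_box` helper are hypothetical.

```python
def union_box(boxes):
    """Smallest (left, top, right, bottom) box covering all given boxes."""
    if not boxes:
        raise ValueError("need at least one box")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


# Hypothetical bounding boxes of sub-images, as (left, top, right, bottom) pixels.
sub_image_boxes = [
    (10, 10, 60, 30),
    (70, 10, 120, 30),
    (10, 200, 90, 220),
]
# Hypothetical indices of sub-images whose vectors matched the search vector.
matched = [0, 1]

ocr_region = union_box([sub_image_boxes[i] for i in matched])
print(ocr_region)  # (10, 10, 120, 30)
```

The resulting region covers the two matching sub-images on the first text line and excludes the third sub-image lower in the image, so an OCR engine would only need to process that part of the input.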
Original Abstract Submitted
Provided are a computer program product, system, and method for training and using a vector encoder to determine vectors for sub-images of text in an image to subject to optical character recognition. A vector encoder is trained to encode images representing text into vectors in a vector space. Vectors of images representing similar text have a high degree of cohesion in the vector space. Vectors of images representing dissimilar text have a low degree of cohesion in the vector space. An input image is processed to determine sub-images of the input image that bound text represented in the input image. The sub-images are inputted to the vector encoder to output sub-image vectors. The vector encoder generates a search vector for search text. Optical character recognition is applied to at least one region of the input image including the sub-images having sub-image vectors matching the search vector.
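The abstract does not state which training objective produces high cohesion for similar text and low cohesion for dissimilar text. One common way to get that property is a triplet margin loss, sketched below purely as an assumption; the vectors, the Euclidean distance, and the margin of 1.0 are hypothetical choices for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero when the positive is closer to the anchor than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical vectors: anchor and positive are images of similar text,
# negative is an image of dissimilar text.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

loss_cohesive = triplet_loss(anchor, positive, negative)
# Swapping the roles simulates an encoder that places dissimilar text closer.
loss_scattered = triplet_loss(anchor, negative, positive)
print(loss_cohesive, loss_scattered)  # 0.0 5.0
```

Minimizing such a loss over many triplets would pull vectors of similar text images together and push vectors of dissimilar text images apart, which is the behavior the abstract describes for the trained encoder.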