18000285. TRAINING MASKED AUTOENCODERS FOR IMAGE INPAINTING simplified abstract (Microsoft Technology Licensing, LLC)
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Dongdong Chen of Redmond WA (US)
TRAINING MASKED AUTOENCODERS FOR IMAGE INPAINTING - A simplified explanation of the abstract
This abstract first appeared for US patent application 18000285, titled 'TRAINING MASKED AUTOENCODERS FOR IMAGE INPAINTING'.
The abstract describes a method for training an encoder network to inpaint images with masked portions. The visible portion of a masked input image is encoded into token data, the token data is decoded into a pixel regression output and a feature prediction output, and a loss computed on each output is used to train the encoding process.
- Encoder network trained to inpaint images with masked portions
- Encoding process encodes visible portion of masked input image into token data
- Encoded token data decoded into pixel regression output and feature prediction output, both containing inpainted image data for the masked portion
- Pixel regression loss calculated using pixel regression output and unmasked image data
- Feature prediction loss calculated using feature prediction output and ground truth encoding output
- Encoding process trained with both losses, so that it learns to encode structural features of input images into token data
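The two losses listed above can be sketched as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not say which loss function is used or how the two losses are combined, so the mean squared error, the weighted sum, and the names `mse`, `combined_loss`, and `feature_weight` are all assumptions made for the example.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(pixel_out, pixel_target, feature_out, feature_target,
                  feature_weight=1.0):
    """Return (total, pixel_loss, feature_loss).

    pixel_target stands for pixel data of the unmasked image, and
    feature_target for the ground truth encoding output of the unmasked
    image. MSE and the weighted sum are assumptions for illustration;
    the abstract does not specify them.
    """
    pixel_loss = mse(pixel_out, pixel_target)
    feature_loss = mse(feature_out, feature_target)
    total = pixel_loss + feature_weight * feature_loss
    return total, pixel_loss, feature_loss


total, pixel_loss, feature_loss = combined_loss(
    pixel_out=[1.0, 2.0], pixel_target=[1.0, 4.0],
    feature_out=[0.0, 0.0], feature_target=[1.0, 1.0],
)
print(total, pixel_loss, feature_loss)
```

In an actual training setup, the total would be backpropagated through the decoders into the encoder, which is how both losses come to shape the encoded token data.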
Potential Applications:
- Image editing software
- Forensic analysis tools
- Medical imaging technology
Problems Solved:
- Efficiently inpainting images with missing portions
- Enhancing image editing capabilities
- Improving image analysis accuracy
Benefits:
- Enhanced image restoration capabilities
- Improved image processing efficiency
- Increased accuracy in image analysis tasks
Commercial Applications:
Advanced Image Inpainting Technology for Various Industries
This technology can be utilized in industries such as:
- Photography
- Graphic design
- Medical imaging
Questions about Image Inpainting Technology:
1. How does this technology improve image editing processes?
Because the encoder is trained to capture structural features of images, the technology aims to fill in missing portions of an image with content that is consistent with the visible surroundings.
2. What are the potential applications of this technology in the medical field?
This technology could be applied in medical imaging to inpaint missing areas in medical images, which may support image analysis.
Original Abstract Submitted
The disclosure herein describes training an encoder network to inpaint images with masked portions. A primary encoding process is used to encode a visible portion of a masked input image into encoded token data. The encoded token data is then decoded into both pixel regression output and feature prediction output, wherein both outputs include inpainted image data associated with the masked portion of the masked input image. A pixel regression loss is determined using the pixel regression output and pixel data of an unmasked version of the masked input image. A feature prediction loss is determined using the feature prediction output and ground truth encoding output of the unmasked version of the masked input image. The primary encoding process is then trained using the pixel regression loss and the feature prediction loss, whereby the primary encoding process is trained to encode structural features of input images into encoded token data.
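The first step of the abstract, encoding only the visible portion of a masked input image, can be illustrated with a short sketch. It assumes the image has already been divided into a sequence of patches and that masking is random at the patch level with a fixed ratio; the abstract specifies none of this, and the names `split_patches`, `select_visible`, and `mask_ratio` are hypothetical.

```python
import random


def split_patches(num_patches, mask_ratio, seed=0):
    """Randomly split patch indices into (visible, masked) index lists.

    Random patch-level masking with a fixed ratio is an assumption
    made for this example, not something stated in the abstract.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


def select_visible(patches, visible):
    """Keep only the visible patches; these are what the encoder would see."""
    return [patches[i] for i in visible]


# Hypothetical 4x4 grid of patches, each represented by a placeholder label.
patches = [f"patch_{i}" for i in range(16)]
visible, masked = split_patches(len(patches), mask_ratio=0.75)
encoder_input = select_visible(patches, visible)
print(len(encoder_input), "visible patches;", len(masked), "masked patches")
```

The decoders would then be asked to produce pixel and feature outputs for the patches in `masked`, which is where the inpainted image data comes from.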