US Patent Application 17733634. GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER simplified abstract

From WikiPatents

GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER

Organization Name

MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor(s)

Dongdong Chen of Redmond WA (US)

Xiyang Dai of Bellevue WA (US)

Yinpeng Chen of Sammamish WA (US)

Mengchen Liu of Redmond WA (US)

Lu Yuan of Redmond WA (US)

GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER - A simplified explanation of the abstract

This abstract first appeared for US patent application 17733634, titled 'GENERATING AN INPAINTED IMAGE FROM A MASKED IMAGE USING A PATCH-BASED ENCODER'.

Simplified Explanation

This patent application describes a method for generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer:

- An image containing both a masked region and an unmasked region is received.
- The received image is divided into patches, some of which are masked patches.
- A patch-based encoder encodes each patch into a feature vector.
- A transformer generates a predicted token for each masked patch from that patch's feature vector.
- The predicted token, together with a masked patch-specific codebook, determines a quantized vector for the masked patch.
- The quantized vector of each masked patch is added to the set of quantized vectors associated with all the patches.
- Finally, a decoder generates the output image from that set of quantized vectors.
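The steps above can be sketched in a minimal NumPy example. Everything here is an illustrative assumption, not the patent's implementation: the shapes, the random linear "encoder" and "decoder", the dot-product "transformer" stand-in, and the codebook are all toy placeholders that only show how the pieces connect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration): image side, patch side,
# feature dimension, codebook size.
IMG, P, D, K = 8, 4, 16, 32

def to_patches(img):
    """Divide an IMG x IMG image into non-overlapping P x P patches."""
    n = IMG // P
    return [img[i*P:(i+1)*P, j*P:(j+1)*P] for i in range(n) for j in range(n)]

# Patch-based encoder: a fixed random linear map (stand-in for a learned network).
W_enc = rng.standard_normal((P * P, D))
def encode(patch):
    return patch.reshape(-1) @ W_enc          # unquantized feature vector

# Masked patch-specific codebook of K quantized vectors.
codebook = rng.standard_normal((K, D))

def predict_token(feature):
    """Stand-in for the transformer: map a feature vector to a token index."""
    return int(np.argmax(codebook @ feature))

# Decoder: another fixed random linear map back to pixel space.
W_dec = rng.standard_normal((D, P * P))
def decode(vectors):
    n = IMG // P
    out = np.zeros((IMG, IMG))
    for idx, v in enumerate(vectors):
        i, j = divmod(idx, n)
        out[i*P:(i+1)*P, j*P:(j+1)*P] = (v @ W_dec).reshape(P, P)
    return out

# 1. Receive an image with a masked region (here, patch 0 is masked).
image = rng.standard_normal((IMG, IMG))
patches = to_patches(image)
masked = {0}

# 2. Encode every patch (masked patches included) to a feature vector.
features = [encode(p) for p in patches]

# 3. For each masked patch, predict a token and look up its quantized
#    vector in the codebook; unmasked patches keep their feature vectors.
quantized = [codebook[predict_token(f)] if i in masked else f
             for i, f in enumerate(features)]

# 4. Decode the full set of quantized vectors into the output image.
output = decode(quantized)
print(output.shape)   # (8, 8)
```

Note the key idea the claims emphasize: the transformer consumes the *unquantized* feature vectors directly, and quantization via the codebook happens only when converting each predicted token back into a vector for decoding.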


Original Abstract Submitted

The disclosure herein describes generating an inpainted image from a masked image using a patch-based encoder and an unquantized transformer. An image including a masked region and an unmasked region is received, and the received image is divided into a plurality of patches including masked patches. The plurality of patches is encoded into a plurality of feature vectors, wherein each patch is encoded to a feature vector. Using a transformer, a predicted token is generated for each masked patch using a feature vector encoded from the masked patch, and a quantized vector of the masked patch is determined using generated predicted token and a masked patch-specific codebook. The determined quantized vector of the masked patch is included into a set of quantized vectors associated with the plurality of patches, and an output image is generated from the set of quantized vectors using a decoder.