Qualcomm Incorporated (20250131606). HARDWARE-AWARE EFFICIENT ARCHITECTURES FOR TEXT-TO-IMAGE DIFFUSION MODELS
Organization Name
Qualcomm Incorporated
Inventor(s)
Shubhankar Mangesh Borse of San Diego, CA, US
Risheek Garrepalli of San Diego, CA, US
Jisoo Jeong of San Diego, CA, US
Shreya Kadambi of San Diego, CA, US
Munawar Hayat of San Diego, CA, US
Fatih Murat Porikli of San Diego, CA, US
This abstract first appeared for US patent application 20250131606, titled 'HARDWARE-AWARE EFFICIENT ARCHITECTURES FOR TEXT-TO-IMAGE DIFFUSION MODELS'.
Original Abstract Submitted
A processor-implemented method includes receiving a text-semantic input at a first stage of a neural network, the first stage including a first convolutional block and no attention layers. The method receives, at a second stage, a first output from the first stage. The second stage comprises a first downsampling block including a first attention layer and a second convolutional block. The method receives, at a third stage, a second output from the second stage. The third stage comprises a first upsampling block including a second attention layer and a first set of convolutional blocks. The method receives, at a fourth stage, the first output from the first stage and a third output from the third stage. The fourth stage comprises a second upsampling block including no attention layers and a second set of convolutional blocks. The method generates an image at the fourth stage, based on the text-semantic input.
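The wiring described in the abstract can be sketched as a small data structure. This is an illustrative reading only, not the applicant's implementation: the names `Stage`, `STAGES`, `skip_connections`, and `attention_stages`, and the stage labels, are hypothetical, and the sketch models only which stage feeds which and where attention appears, not the layers themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str            # hypothetical label, not from the application
    kind: str            # "conv", "down", or "up"
    has_attention: bool  # whether the stage contains an attention layer
    inputs: tuple        # names of the sources this stage receives


SOURCE = "text_semantic_input"

# Wiring as stated in the abstract: attention appears only in stages 2 and 3,
# and stage 4 receives both the stage 1 output and the stage 3 output.
STAGES = (
    Stage("stage1", "conv", False, (SOURCE,)),
    Stage("stage2", "down", True, ("stage1",)),
    Stage("stage3", "up", True, ("stage2",)),
    Stage("stage4", "up", False, ("stage1", "stage3")),
)


def skip_connections(stages, source=SOURCE):
    """Return (producer, consumer) pairs that bypass the sequential chain."""
    skips = []
    available = {source}
    previous = source
    for stage in stages:
        for name in stage.inputs:
            if name not in available:
                raise ValueError(f"{stage.name} reads undefined input {name}")
            if name != previous:
                skips.append((name, stage.name))
        available.add(stage.name)
        previous = stage.name
    return skips


def attention_stages(stages):
    """Return the names of stages that contain an attention layer."""
    return [stage.name for stage in stages if stage.has_attention]


if __name__ == "__main__":
    print("skip connections:", skip_connections(STAGES))
    print("stages with attention:", attention_stages(STAGES))
```

Under this reading, the only connection that bypasses the sequential chain runs from the first stage to the fourth, and attention is confined to the two middle stages, while the first and last stages use convolutional blocks only.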