Intel Corporation (20240185493). NETWORK FOR STRUCTURE-BASED TEXT-TO-IMAGE GENERATION simplified abstract

From WikiPatents

NETWORK FOR STRUCTURE-BASED TEXT-TO-IMAGE GENERATION

Organization Name

Intel Corporation

Inventor(s)

Peixi Xiong of Hillsboro OR (US)

Nilesh Jain of Portland OR (US)

NETWORK FOR STRUCTURE-BASED TEXT-TO-IMAGE GENERATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240185493, titled 'NETWORK FOR STRUCTURE-BASED TEXT-TO-IMAGE GENERATION'.

Simplified Explanation

This patent application describes generating an image with a generator network. The method extracts structural relationship information from a text prompt, generates encoded text features from it, and combines those with encoded image features from an input image canvas to produce the output image.

  • Structural relationship information is extracted from a text prompt, including sentence features and token features.
  • Encoded text features are generated from the sentence features and from relation-related tokens, which are identified by parsing text dependency information in the token features.
  • The output image is generated by combining the encoded text features with encoded image features from an input image canvas, using self-attention and cross-attention layers.
  • In further embodiments, a gating function is applied to modify image features based on text features.
  • The self-attention and cross-attention layers can be applied via a cross-modality network, and the gating function can be applied via a residual gating network.
  • Relation-related tokens are further identified via an attention matrix.
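
The token-selection step in the list above can be illustrated with a small sketch. This is not the patent's method: the hand-written parse, the set of dependency labels treated as relation-bearing (RELATION_LABELS), the attention values, and both function names are assumptions made up for illustration. It only shows the general idea of picking candidate tokens from dependency labels and then refining the choice with an attention matrix.

```python
# Hypothetical sketch, not the patent's algorithm. The labels, the parse,
# and the attention values below are invented for illustration.

RELATION_LABELS = {"prep", "agent"}  # assumed relation-bearing dependency labels


def relation_tokens_from_parse(parse):
    """Return indices of tokens whose dependency label marks a relation.

    `parse` is a list of (token, dependency_label) pairs, roughly what a
    dependency parser might produce.
    """
    return [i for i, (_, label) in enumerate(parse) if label in RELATION_LABELS]


def refine_with_attention(candidates, attention, keep):
    """Keep the `keep` candidates that receive the most total attention.

    `attention[i][j]` is the weight token i places on token j, so the
    column sum for j measures how much attention token j receives.
    """
    received = {j: sum(row[j] for row in attention) for j in candidates}
    ranked = sorted(candidates, key=lambda j: received[j], reverse=True)
    return sorted(ranked[:keep])


# Simplified, hand-written parse of "cat on mat near door".
parse = [("cat", "ROOT"), ("on", "prep"), ("mat", "pobj"),
         ("near", "prep"), ("door", "pobj")]

# Made-up attention matrix; each row sums to 1.
attention = [
    [0.2, 0.5, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.1, 0.5, 0.2, 0.1, 0.1],
    [0.2, 0.3, 0.1, 0.3, 0.1],
    [0.1, 0.4, 0.1, 0.2, 0.2],
]

candidates = relation_tokens_from_parse(parse)          # indices of "on" and "near"
selected = refine_with_attention(candidates, attention, keep=1)
print([parse[i][0] for i in selected])
```

In this toy input the dependency labels nominate "on" and "near", and the attention column sums then favor "on". A real system would take the parse from a dependency parser and the attention matrix from a trained text encoder.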

Potential Applications

This technology could be applied in fields such as image generation, natural language processing, and artificial intelligence.

Problems Solved

This technology addresses generating images from text prompts. It uses the structural relationships expressed in the prompt so that the generated image can reflect those relationships.

Benefits

The benefits of this technology include improved image generation capabilities, enhanced cross-modality processing, and better integration of text and image features.

Potential Commercial Applications

One potential commercial application of this technology could be in the development of automated image generation tools for creative industries.

Possible Prior Art

Possible prior art includes existing artificial intelligence research on generating images from text prompts.

What are the limitations of this technology in terms of scalability and complexity?

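The main limitation is the computational cost of processing text and image features together. With standard attention, self-attention cost grows quadratically with the number of image features, and cross-attention cost grows with the product of the image and text feature counts, so large canvases and long prompts increase compute and memory requirements.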

How does this technology compare to existing image generation methods in terms of accuracy and efficiency?

The abstract gives no comparative results. Its stated approach is to use structural relationship information from the prompt, including relation-related tokens, to guide image generation.


Original Abstract Submitted

Technology as described herein provides for generating an image via a generator network, including extracting structural relationship information from a text prompt, wherein the structural relationship information includes sentence features and token features, generating encoded text features based on the sentence features and on relation-related tokens, wherein the relation-related tokens are identified based on parsing text dependency information in the token features, and generating an output image based on combining, via self-attention and cross-attention layers, the encoded text features and encoded image features from an input image canvas. Embodiments further include applying a gating function to modify image features based on text features. The self-attention and cross-attention layers can be applied via a cross-modality network, the gating function can be applied via a residual gating network, and the relation-related tokens can be further identified via an attention matrix.
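
The combining step in the abstract (self-attention, cross-attention, then a gating function) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented network: it uses standard scaled dot-product attention without learned projections, and the gate in residual_gate is a fixed sigmoid of a dot product chosen for simplicity, where a trained network would learn its parameters. The function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    Self-attention when all three arguments are the same sequence;
    cross-attention when queries are image features and keys/values
    are text features.
    """
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def residual_gate(image_feats, text_context):
    """Add text context to image features, scaled by a sigmoid gate.

    Assumption: the gate is a fixed function of the dot product between
    the two vectors. A trained model would use learned weights instead.
    """
    gated = []
    for img, ctx in zip(image_feats, text_context):
        gate = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(img, ctx))))
        gated.append([i + gate * c for i, c in zip(img, ctx)])
    return gated


# Toy inputs: 2 image patches and 3 text tokens, each a 2-dimensional vector.
image = [[1.0, 0.0], [0.0, 1.0]]
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

image_sa = attend(image, image, image)    # self-attention over image features
context = attend(image_sa, text, text)    # cross-attention: image queries, text keys/values
fused = residual_gate(image_sa, context)  # gated residual update of image features
```

The residual form means the image features pass through unchanged when the text context is zero, and the gate controls how strongly the text modifies them otherwise.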
