Adobe Inc. (20240220530). MULTI-MODAL SOUND EFFECTS RECOMMENDATION simplified abstract
MULTI-MODAL SOUND EFFECTS RECOMMENDATION
Organization Name
Adobe Inc.
Inventor(s)
Julia Lepley Wilkins of Brooklyn, NY (US)
Oriol Nieto-Caballero of Oakland, CA (US)
Justin Salamon of San Francisco, CA (US)
MULTI-MODAL SOUND EFFECTS RECOMMENDATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240220530 titled 'MULTI-MODAL SOUND EFFECTS RECOMMENDATION'.
Simplified Explanation: The patent application describes a system that recommends sound effects based on a multi-modal embedding space that incorporates visuals, text, and audio. An encoder generates a query embedding in this space, identifying relevant sound effect embeddings to provide recommendations.
- Sound effects system built on a multi-modal embedding space for visuals, text, and audio
- Encoder generates a query embedding from a visual (image/video) and/or text input
- Relevant sound effect embeddings are identified in the shared space using the query embedding
- Recommendations are provided for the sound effects corresponding to those embeddings
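The retrieval step above can be sketched as a nearest-neighbor search in a shared embedding space. The snippet below is a minimal illustration, not the patented method: the encoder is stubbed out with random vectors, and the sound-effect names, embedding dimension, and cosine-similarity ranking are all illustrative assumptions.

```python
import numpy as np

def normalize(v):
    # L2-normalize so that a dot product equals cosine similarity
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
dim = 64  # assumed dimensionality of the shared embedding space

# Hypothetical sound-effect library, already projected into the space.
sfx_names = ["rain", "door_creak", "footsteps", "thunder", "crowd_chatter"]
sfx_embeddings = normalize(rng.normal(size=(len(sfx_names), dim)))

# Stand-in for the encoder's output for a query combining an image and text;
# a real system would use trained visual/text encoders here.
query_embedding = normalize(rng.normal(size=dim))

# Rank sound effects by cosine similarity to the query embedding.
scores = sfx_embeddings @ query_embedding
top_k = 3
recommended = [sfx_names[i] for i in np.argsort(scores)[::-1][:top_k]]
print(recommended)
```

Any approximate nearest-neighbor index could replace the brute-force ranking once the library grows large; the key idea is only that queries and sound effects live in the same space.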
Potential Applications: This technology could be applied in various industries such as film production, video game development, virtual reality experiences, and audiovisual content creation.
Problems Solved: This technology streamlines the process of selecting appropriate sound effects for visual and textual content, enhancing the overall audiovisual experience for users.
Benefits:
- Improved user experience through tailored sound effects
- Time saved when selecting sound effects for multimedia projects
- Enhanced creativity and storytelling in audiovisual content
Commercial Applications: The technology could be utilized by multimedia production companies, video game developers, virtual reality experience creators, and content creators on platforms like YouTube and TikTok.
Questions about Sound Effects System:
1. How does the system determine the relevance of sound effect embeddings to the input query?
2. What are the potential limitations of using a multi-modal embedding space for recommending sound effects?
Original Abstract Submitted
A sound effects system recommends sound effects using a multi-modal embedding space for projecting visuals, text, and audio. Given an input query comprising a visual (i.e., an image/video) and/or text, an encoder generates a query embedding in the multi-modal embedding space in which sound effects have been projected into sound effect embeddings. A relevant sound effect embedding in the multi-modal space is identified using the query embedding, and a recommendation is provided for a sound effect corresponding to the sound effect embedding.