US Patent Application 18217745. TRAINING A SOUND EFFECT RECOMMENDATION NETWORK simplified abstract

From WikiPatents

TRAINING A SOUND EFFECT RECOMMENDATION NETWORK

Organization Name

Sony Interactive Entertainment Inc.

Inventor(s)

Sudha Krishnamurthy of Foster City CA (US)

TRAINING A SOUND EFFECT RECOMMENDATION NETWORK - A simplified explanation of the abstract

This abstract first appeared for US patent application 18217745, titled 'TRAINING A SOUND EFFECT RECOMMENDATION NETWORK'.

Simplified Explanation

The patent application describes using a machine learning algorithm to train a network that recommends sound effects based on the visual elements in an image.

  • Training takes a reference image, a positive audio embedding, and a negative audio embedding as inputs.
  • A visual-to-audio correlation neural network is trained to output a smaller distance between the positive audio embedding and the reference image than between the negative audio embedding and the reference image.
  • The neural network is trained to identify visual elements in the reference image and map them to sound categories or subcategories in an audio database.
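
The distance objective described above resembles what is commonly called a triplet margin loss. The abstract does not name a loss function, so the following is only a minimal sketch under that assumption. The function names, the Euclidean distance, the margin value of 0.2, and the two-dimensional toy embeddings are all hypothetical choices for illustration, and the image is assumed to have already been encoded into the same embedding space as the audio.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(image_emb, positive_audio_emb, negative_audio_emb, margin=0.2):
    """Hypothetical triplet margin loss (an assumption, not the patent's stated loss).

    The loss is zero once the positive audio embedding is closer to the
    image embedding than the negative one by at least `margin`.
    """
    d_pos = euclidean_distance(image_emb, positive_audio_emb)
    d_neg = euclidean_distance(image_emb, negative_audio_emb)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings, invented for illustration only.
image = [1.0, 0.0]      # embedding of the reference image
matching = [0.9, 0.1]   # positive audio embedding (sound that fits the image)
unrelated = [0.0, 1.0]  # negative audio embedding (sound that does not fit)

loss_good = triplet_loss(image, matching, unrelated)  # desired ordering
loss_bad = triplet_loss(image, unrelated, matching)   # roles swapped
```

In this sketch `loss_good` is zero because the matching sound is already much closer to the image than the unrelated one, while `loss_bad` is positive. During training, minimizing such a loss would push the network toward the ordering the abstract describes.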


Original Abstract Submitted

A Sound effect recommendation network is trained using a machine learning algorithm with a reference image, a positive audio embedding and a negative audio embedding as inputs to train a visual-to-audio correlation neural network to output a smaller distance between the positive audio embedding and the reference image than the negative audio embedding and the reference image. The visual-to-audio correlation neural network is trained to identify one or more visual elements in the reference image and map the one or more visual elements to one or more sound categories or subcategories within an audio database.
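
Once such a network is trained, mapping an image to sound categories in an audio database could be approximated by a nearest-neighbor lookup over audio embeddings. The abstract does not describe the lookup mechanism, so this is a sketch under assumptions: the database layout, the category names, the `recommend_categories` helper, and the toy embeddings are all hypothetical.

```python
import math


def recommend_categories(image_emb, audio_db, top_k=2):
    """Return up to `top_k` distinct categories, nearest audio embedding first.

    `audio_db` is a hypothetical list of dicts with "category" and
    "embedding" keys; a real audio database would look different.
    """
    ranked = sorted(audio_db, key=lambda e: math.dist(image_emb, e["embedding"]))
    categories = []
    for entry in ranked:
        if entry["category"] not in categories:
            categories.append(entry["category"])
        if len(categories) == top_k:
            break
    return categories


# Toy database, invented for illustration only.
audio_db = [
    {"category": "footsteps", "embedding": [0.9, 0.1]},
    {"category": "rain", "embedding": [0.0, 1.0]},
    {"category": "footsteps", "embedding": [0.8, 0.2]},
    {"category": "engine", "embedding": [0.5, 0.5]},
]

image_embedding = [1.0, 0.0]
recommended = recommend_categories(image_embedding, audio_db, top_k=2)
print(recommended)  # ['footsteps', 'engine']
```

The lookup deduplicates categories so that several clips from the same category do not crowd out other candidates.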