US Patent Application 18350464. Augmentation of Audiographic Images for Improved Machine Learning simplified abstract

From WikiPatents

Augmentation of Audiographic Images for Improved Machine Learning

Organization Name

Google LLC


Inventor(s)

Daniel Sung-Joon Park of Sunnyvale CA (US)

Quoc Le of Sunnyvale CA (US)

William Chan of Toronto (CA)

Ekin Dogus Cubuk of San Francisco CA (US)

Barret Zoph of San Francisco CA (US)

Yu Zhang of Mountain View CA (US)

Chung-Cheng Chiu of Mountain View CA (US)

Augmentation of Audiographic Images for Improved Machine Learning - A simplified explanation of the abstract

This abstract first appeared for US patent application 18350464, titled 'Augmentation of Audiographic Images for Improved Machine Learning'.

Simplified Explanation

The patent application describes systems and methods that generate augmented training data for machine-learned models by applying augmentation techniques to audiographic images, that is, images that visually represent audio signals.

  • The application introduces new augmentation operations for audiographic images, intended to improve model performance.
  • The augmentation operations are performed directly on the audiographic image rather than on the raw audio data.
  • The audiographic images can be, or can include, spectrograms or filter bank sequences.
  • The augmented images serve as additional training data for machine-learned models.
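
To make "audiographic image" concrete, here is a minimal sketch of how a magnitude spectrogram could be computed from raw audio samples. The application does not prescribe how the image is produced; the function name `magnitude_spectrogram`, the frame size, the hop length, and the naive per-frame DFT are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=8, hop=4):
    """Return a list of time frames; each frame lists one magnitude per frequency bin.

    Hypothetical helper: uses a naive O(n^2) DFT per frame, for illustration only.
    """
    n_bins = frame_size // 2 + 1  # non-redundant bins for real-valued input
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        row = []
        for k in range(n_bins):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


# A pure tone with exactly 2 cycles per 8-sample frame (placeholder signal).
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = magnitude_spectrogram(tone)
```

The result is a two-dimensional grid indexed by time frame and frequency bin, which is the kind of image the augmentation operations would act on. For the toy tone above, the energy in each frame concentrates in frequency bin 2.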


Original Abstract Submitted

Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
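
The abstract does not name the specific augmentation operations, so the following is only a sketch of what "augmenting the image directly" could look like. It assumes a masking operation that blanks out a band of time frames or frequency bins; the helper names `mask_band` and `augment`, the fill value, and the maximum mask width are hypothetical.

```python
import random


def mask_band(image, axis, width, start, fill=0.0):
    """Return a copy of `image` with `width` consecutive frames or bins set to `fill`.

    image: list of time frames, each a list of frequency-bin values.
    axis:  "time" masks whole frames; "freq" masks the same bins in every frame.
    """
    out = [row[:] for row in image]  # copy so the original example is preserved
    if axis == "time":
        for t in range(start, min(start + width, len(out))):
            out[t] = [fill] * len(out[t])
    elif axis == "freq":
        for row in out:
            for f in range(start, min(start + width, len(row))):
                row[f] = fill
    else:
        raise ValueError("axis must be 'time' or 'freq'")
    return out


def augment(image, max_width=2, rng=random):
    """Apply one randomly placed time mask and one randomly placed frequency mask."""
    n_frames, n_bins = len(image), len(image[0])
    t_width = rng.randint(0, min(max_width, n_frames))
    f_width = rng.randint(0, min(max_width, n_bins))
    t_start = rng.randint(0, n_frames - t_width)
    f_start = rng.randint(0, n_bins - f_width)
    masked = mask_band(image, "time", t_width, t_start)
    return mask_band(masked, "freq", f_width, f_start)


# Toy 4-frame x 5-bin image; values are placeholders, not real audio.
image = [[1.0] * 5 for _ in range(4)]
augmented = augment(image, rng=random.Random(0))
```

Because the operation works on the image grid alone, the raw audio never has to be re-synthesized or re-analyzed; each call to `augment` yields a new training example with the same shape and label as the original.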