18148226. WATERMARKING FOR SPEECH IN CONVERSATIONAL AI AND COLLABORATIVE SYNTHETIC CONTENT GENERATION SYSTEMS AND APPLICATIONS simplified abstract (NVIDIA Corporation)

From WikiPatents

WATERMARKING FOR SPEECH IN CONVERSATIONAL AI AND COLLABORATIVE SYNTHETIC CONTENT GENERATION SYSTEMS AND APPLICATIONS

Organization Name

NVIDIA Corporation

Inventor(s)

Boris Ginsburg of Sunnyvale CA (US)

WATERMARKING FOR SPEECH IN CONVERSATIONAL AI AND COLLABORATIVE SYNTHETIC CONTENT GENERATION SYSTEMS AND APPLICATIONS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18148226 titled 'WATERMARKING FOR SPEECH IN CONVERSATIONAL AI AND COLLABORATIVE SYNTHETIC CONTENT GENERATION SYSTEMS AND APPLICATIONS'.

The approaches described in this patent application insert watermarks into synthesized content, such as audio containing synthesized speech that is meant to appear to be spoken by a digital avatar in a 3D virtual environment. A Text-to-Speech (TTS) generator, such as a trained neural network, produces the synthetic speech audio, and an audio watermark can then be inserted into it. A process of a collaborative content generation platform, for example, can detect this watermark and indicate that the content contains synthesized speech. The watermark is generally not detectable by the human ear during playback. To make it difficult to remove or modify, the watermark can be generated using a key or other unique piece of data known only to authorized entities.

  • Watermarks inserted into synthesized content
  • Use of Text-to-Speech (TTS) generator for synthetic speech audio
  • Detection of watermarks by collaborative content generation platform
  • Imperceptible watermark during playback
  • Use of unique key for watermark generation

Potential Applications:

  • Enhancing security in virtual environments
  • Verifying authenticity of synthesized content
  • Preventing unauthorized modification of synthesized speech

Problems Solved:

  • Ensuring the integrity of synthesized content
  • Providing a means to detect synthesized speech in virtual environments

Benefits:

  • Improved security for virtual content
  • Enhanced trust in synthesized speech applications
  • Protection against unauthorized alterations

Commercial Applications: Enhancing Security in Virtual Environments with Synthesized Speech Watermarks

This technology could be used in virtual reality applications, online gaming, virtual meetings, and digital storytelling platforms to help verify the authenticity and integrity of synthesized speech content.

Questions about the technology:

  1. How does the unique key used for watermark generation contribute to the security of the synthesized content?
  2. What are the potential implications of this technology for the future of virtual environments and digital communication?


Original Abstract Submitted

Approaches presented herein provide for insertion of watermarks into synthesized content, such as audio content that may include synthesized speech to appear to be spoken by a digital avatar in a 3D virtual environment. A Text-to-Speech (TTS) generator, such as a trained neural network, can be used to produce synthetic speech audio, which can have an audio watermark inserted therein. This watermark can be detected by a process of a collaborative content generation platform, for example, and an indication can be provided that the content contains synthesized speech. The presence of the audio watermark will generally not be detectable by the human ear during presentation. To make it difficult to remove or modify the watermark, the watermark can be generated using a key or other unique piece of data known only to authorized entities.