ASAPP, INC. (20250104692). TEXT-TO-AUDIO CONVERSION WITH BYTE-ENCODING VECTORS
Organization Name
ASAPP, INC.
Inventor(s)
Justin Robert Lovelace of Ithaca NY US
Kilian Quirin Weinberger of Ithaca NY US
Kwangyoun Kim of San Jose CA US
This abstract first appeared for US patent application 20250104692 titled 'TEXT-TO-AUDIO CONVERSION WITH BYTE-ENCODING VECTORS'.
Original Abstract Submitted
A diffusion model may be used to generate an audio signal from text. The diffusion model may process received text and noise vectors to compute encoded audio vectors that correspond to the text. The encoded audio vectors may be decoded to generate an audio signal of a person speaking the text that may be presented to a user. The diffusion model may process a sequence of byte-encoding vectors corresponding to the text, and the use of the byte-encoding vectors may allow for the generation of higher quality audio signals. In some implementations, prompt audio of a person may also be used to generate an audio signal that resembles the speech of that person.
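The abstract does not say how the byte-encoding vectors are computed. As a minimal sketch of one plausible reading, the text could be split into its UTF-8 bytes, with each byte value (0 to 255) selecting one row of a lookup table. Everything named below (`text_to_byte_ids`, `make_byte_table`, `byte_encoding_vectors`, the dimension of 8, and the random table standing in for learned values) is a hypothetical illustration, not the method claimed in the application.

```python
import random


def text_to_byte_ids(text: str) -> list[int]:
    """Return the UTF-8 byte values (each in 0..255) for the text."""
    return list(text.encode("utf-8"))


def make_byte_table(dim: int, seed: int = 0) -> list[list[float]]:
    """Build a 256-row table with one vector per possible byte value.

    Assumption: random values stand in for vectors that a real system
    would learn during training.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(256)]


def byte_encoding_vectors(text: str, table: list[list[float]]) -> list[list[float]]:
    """Map text to a sequence of vectors, one per UTF-8 byte."""
    return [table[b] for b in text_to_byte_ids(text)]


table = make_byte_table(dim=8)
vectors = byte_encoding_vectors("héllo", table)
# "é" occupies two bytes in UTF-8, so five characters yield six vectors.
print(len(vectors), len(vectors[0]))  # prints "6 8"
```

Because the vocabulary is fixed at 256 byte values, a scheme like this needs no tokenizer or pronunciation dictionary, and any input string can be encoded. In the pipeline the abstract describes, a sequence of this kind would be processed by the diffusion model together with noise vectors.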