18539764. MULTI-TIME-SCALE NEURAL AUDIO CODEC STREAMS (Cisco Technology, Inc.)
Organization Name
Cisco Technology, Inc.
Inventor(s)
Amir Salah Abdelsamie Abdelwahed of Edinburgh GB
Hui-Ling Lu of Palo Alto CA US
Yusuf Ziya Isik of Edinburgh GB
David Guoqing Zhang of Fremont CA US
Samer Lutfi Hijazi of San Jose CA US
This abstract first appeared for US patent application 18539764 titled 'MULTI-TIME-SCALE NEURAL AUDIO CODEC STREAMS'.
Original Abstract Submitted
A data-driven audio codec system that involves producing multiple compressed streams comprising encoded information (e.g., codeword indices) at different time scales (time intervals or frequency). This may allow for separation of different properties of speech, such as content and aspects of style (prosody), into the different compressed streams without explicitly enforcing it, i.e., in an unsupervised manner. Speech audio is encoded to produce a plurality of encoded streams comprising encoded information for the speech audio at different time scales. The plurality of encoded streams are decoded to generate output audio.
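The idea of encoding the same audio into streams of codeword indices at different time scales can be illustrated with a toy sketch. This is not the method claimed in the application: the abstract describes a data-driven (learned) codec, whereas the sketch below swaps in fixed, hand-written codebooks and a simple residual scheme purely for illustration. All names (`encode`, `decode`, `SLOW_CODEBOOK`, `FAST_CODEBOOK`, `stride`) are hypothetical. A slow stream emits one index per block of `stride` frames, a fast stream emits one index per frame, and the decoder combines both.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_index(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def dist(cw: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(vec, cw))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def mean_vector(frames: Sequence[Vector]) -> List[float]:
    """Average a block of frames element-wise."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def encode(frames: Sequence[Vector],
           slow_codebook: Sequence[Vector],
           fast_codebook: Sequence[Vector],
           stride: int) -> Tuple[List[int], List[int]]:
    """Produce two index streams: one index per block (slow), one per frame (fast)."""
    slow_stream: List[int] = []
    fast_stream: List[int] = []
    for start in range(0, len(frames), stride):
        block = frames[start:start + stride]
        # Slow stream: quantize the block average (coarse time scale).
        slow_idx = nearest_index(mean_vector(block), slow_codebook)
        slow_stream.append(slow_idx)
        base = slow_codebook[slow_idx]
        # Fast stream: quantize what each frame adds on top of the slow codeword.
        for frame in block:
            residual = [x - b for x, b in zip(frame, base)]
            fast_stream.append(nearest_index(residual, fast_codebook))
    return slow_stream, fast_stream


def decode(slow_stream: Sequence[int],
           fast_stream: Sequence[int],
           slow_codebook: Sequence[Vector],
           fast_codebook: Sequence[Vector],
           stride: int) -> List[List[float]]:
    """Rebuild frames by adding each fast codeword to its block's slow codeword."""
    out: List[List[float]] = []
    for t, fast_idx in enumerate(fast_stream):
        base = slow_codebook[slow_stream[t // stride]]
        detail = fast_codebook[fast_idx]
        out.append([b + d for b, d in zip(base, detail)])
    return out


# Toy data (assumed for illustration): four 2-dimensional "frames".
SLOW_CODEBOOK = [[0.0, 0.0], [1.0, 1.0]]
FAST_CODEBOOK = [[-0.25, 0.0], [0.0, 0.0], [0.25, 0.0]]
frames = [[0.0, 0.0], [0.25, 0.0], [1.0, 1.0], [0.75, 1.0]]

slow, fast = encode(frames, SLOW_CODEBOOK, FAST_CODEBOOK, stride=2)
reconstructed = decode(slow, fast, SLOW_CODEBOOK, FAST_CODEBOOK, stride=2)
print(slow, fast)
print(reconstructed == frames)
```

In this toy setup the slow stream carries half as many indices as the fast stream. In a learned codec of the kind the abstract describes, such a rate difference is what may let slowly varying properties and rapidly varying properties settle into different streams without explicit supervision.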
- Cisco Technology, Inc.
- Rafal Pilarczyk of Plock PL
- Amir Salah Abdelsamie Abdelwahed of Edinburgh GB
- Hui-Ling Lu of Palo Alto CA US
- Ivana Balic of Studen BE CH
- Yusuf Ziya Isik of Edinburgh GB
- David Guoqing Zhang of Fremont CA US
- Xuehong Mao of San Jose CA US
- Samer Lutfi Hijazi of San Jose CA US
- G10L21/043
- G10L19/00
- CPC G10L21/043