18967529. MULTIMODAL DATA GENERATION (BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.)
MULTIMODAL DATA GENERATION
Organization Name
BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
Inventor(s)
MULTIMODAL DATA GENERATION
This abstract first appeared for US patent application 18967529 titled 'MULTIMODAL DATA GENERATION'.
Original Abstract Submitted
A multimodal data generation method is provided. The method includes inputting a query data sequence into a multimodal model to obtain a plurality of tokens in a response data sequence. Each current token is generated by one of two operations, depending on its data modality. In response to determining that the current token belongs to a first data modality, the query data sequence and the current response data sequence are input into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence. In response to determining that the current token belongs to a second data modality, the query data sequence and the current response data sequence are input into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence.
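The two-branch generation loop described in the abstract can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the class `ToyMultimodalModel`, its methods `next_token` and `denoise`, the fixed modality `plan`, the block length, and the arithmetic used for both branches are all hypothetical stand-ins chosen so the control flow is runnable. The abstract does not say how the modality of the current token is determined, so the sketch simply reads it from a predefined plan.

```python
from typing import Callable, List, Sequence

FIRST = "first"    # stand-in for the first data modality (e.g. text); an assumption
SECOND = "second"  # stand-in for the second data modality (e.g. image); an assumption


class ToyMultimodalModel:
    """Hypothetical toy model; only the control flow mirrors the abstract."""

    def __init__(self, plan: Sequence[str], block_len: int = 4, steps: int = 3):
        self.plan = list(plan)      # assumed: modality of each generation step
        self.block_len = block_len  # assumed: length of the initial token sequence
        self.steps = steps          # assumed: number of denoising iterations

    def next_token(self, query: Sequence[int], response: Sequence[int]) -> int:
        # First-modality branch: produce one token conditioned on the query
        # and the current response (toy arithmetic, not a real model).
        return (sum(query) + len(response)) % 100

    def denoise(self, query: Sequence[int], response: Sequence[int],
                initial: Sequence[int]) -> List[int]:
        # Second-modality branch: iteratively refine an initial token sequence,
        # conditioned on the query and the current response.
        target = (sum(query) + len(response)) % 100
        tokens = list(initial)
        for _ in range(self.steps):
            # Move each token halfway toward the context-dependent target.
            tokens = [t + (target - t) // 2 for t in tokens]
        return tokens


def generate(model: ToyMultimodalModel, query: Sequence[int],
             make_initial: Callable[[int], List[int]]) -> List[int]:
    """Build the response sequence, switching branch by token modality."""
    response: List[int] = []
    for modality in model.plan:
        if modality == FIRST:
            # Generate a single current token.
            response.append(model.next_token(query, response))
        else:
            # Denoise an initial sequence into a result token sequence.
            initial = make_initial(model.block_len)
            response.extend(model.denoise(query, response, initial))
    return response
```

In this sketch, `make_initial` supplies the initial token sequence; in practice it would presumably be sampled noise, but a deterministic function such as `lambda n: [0] * n` makes the behavior easy to inspect. Both branches receive the same two inputs, the query and the response generated so far, which is the point the abstract emphasizes.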