20250190811. Method Training Multimodal Large Model Electronic De (BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY ., .)
METHOD FOR TRAINING MULTIMODAL LARGE MODEL AND ELECTRONIC DEVICE
Abstract: A method for training a multimodal large model includes: obtaining first training data and second training data; obtaining an initial multimodal large model, wherein the multimodal large model comprises a backbone network and multiple codec networks corresponding to multiple non-textual modalities, and the multiple codec networks perform encoding and decoding based on a same multimodal word list; performing joint training on the multiple codec networks and the multimodal word list based on the data under the multiple non-textual modalities; and training the backbone network based on the multimodal sample reference data and the sample generation data under the target task in the second training data. Because the multiple codec networks perform encoding and decoding based on the same multimodal word list, the difficulty and cost of model training are reduced.
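The idea of several codec networks sharing one multimodal word list can be illustrated with a toy sketch. This is not the patented implementation: the class names (`SharedWordList`, `Codec`), the fixed `scale` factor standing in for learned codec weights, and the k-means-style update in `joint_update` are all assumptions introduced for illustration. The sketch only updates the word list during the joint step and leaves out the backbone training stage entirely.

```python
import random


class SharedWordList:
    """Hypothetical shared multimodal word list: a table of code vectors."""

    def __init__(self, size, dim, seed=0):
        rng = random.Random(seed)
        self.vectors = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                        for _ in range(size)]

    def nearest(self, vec):
        """Return the index of the entry closest to vec (squared distance)."""
        def sq_dist(i):
            return sum((a - b) ** 2 for a, b in zip(self.vectors[i], vec))
        return min(range(len(self.vectors)), key=sq_dist)


class Codec:
    """Hypothetical per-modality codec that maps features to shared token ids."""

    def __init__(self, word_list, scale):
        self.word_list = word_list  # the same object for every modality
        self.scale = scale          # assumption: stand-in for learned weights

    def project(self, feature):
        return [x * self.scale for x in feature]

    def encode(self, features):
        return [self.word_list.nearest(self.project(f)) for f in features]

    def decode(self, token_ids):
        return [[x / self.scale for x in self.word_list.vectors[t]]
                for t in token_ids]


def joint_update(word_list, codecs_and_data, lr=0.5):
    """One toy joint step: pull each used entry toward the mean of the
    projected features assigned to it, pooled over all modalities."""
    assigned = {}
    for codec, features in codecs_and_data:
        for f in features:
            z = codec.project(f)
            assigned.setdefault(word_list.nearest(z), []).append(z)
    for idx, zs in assigned.items():
        mean = [sum(col) / len(zs) for col in zip(*zs)]
        word_list.vectors[idx] = [(1 - lr) * v + lr * m
                                  for v, m in zip(word_list.vectors[idx], mean)]


def total_quantization_error(word_list, codecs_and_data):
    """Sum of squared distances from projected features to their entries."""
    total = 0.0
    for codec, features in codecs_and_data:
        for f in features:
            z = codec.project(f)
            c = word_list.vectors[word_list.nearest(z)]
            total += sum((a - b) ** 2 for a, b in zip(z, c))
    return total
```

In this sketch, an image codec and an audio codec would both be constructed with the same `SharedWordList` instance, so token ids produced by either one index into a single table. That is the property the abstract credits with reducing training difficulty and cost.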
Inventor(s): Shuohuan Wang, Junyuan Shang, Yekun Chai, Yinqi Yang, Zhenyu Zhang, Yu Sun, Hua Wu, Haifeng Wang
CPC Classification: G06N3/096 (Transfer learning)