
20250190811. Method for Training Multimodal Large Model and Electronic Device (BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.)

From WikiPatents
Revision as of 06:01, 12 June 2025 by Wikipatents (talk | contribs) (Automated patent report)

METHOD FOR TRAINING MULTIMODAL LARGE MODEL AND ELECTRONIC DEVICE

Abstract: A method for training a multimodal large model includes: obtaining first training data and second training data; obtaining an initial multimodal large model, where the model comprises a backbone network and multiple codec networks corresponding to multiple non-textual modalities, and the multiple codec networks perform encoding and decoding based on a same multimodal word list; performing a joint training on the multiple codec networks and the multimodal word list based on the data under the multiple non-textual modalities; and training the backbone network based on the multimodal sample reference data and the sample generation data under the target task in the second training data. Because the multiple codec networks perform encoding and decoding based on the same multimodal word list, the difficulty and cost of model training are reduced.
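The core idea in the abstract, several per-modality codec networks that all quantize into one shared "multimodal word list", can be illustrated with a toy sketch. The code below is an illustrative assumption, not the patent's actual architecture: it models the shared word list as a single discrete codebook, and each modality's encoder as nearest-neighbor quantization into that codebook, so tokens from any modality decode through the same lookup table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared "multimodal word list": one codebook of
# discrete token embeddings used by every modality's codec.
VOCAB_SIZE, DIM = 16, 4
word_list = rng.normal(size=(VOCAB_SIZE, DIM))

def encode(features, codebook):
    """Map each feature vector to the id of its nearest codebook entry."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def decode(token_ids, codebook):
    """Reconstruct features by looking token ids up in the shared list."""
    return codebook[token_ids]

# Two toy "modalities" produce features in the same embedding space;
# both are encoded against the SAME word list.
image_feats = rng.normal(size=(5, DIM))
audio_feats = rng.normal(size=(3, DIM))

image_tokens = encode(image_feats, word_list)
audio_tokens = encode(audio_feats, word_list)

# Because the token space is shared, a backbone network would see one
# unified vocabulary regardless of which modality produced the tokens --
# the property the abstract credits with reducing training difficulty.
recon = decode(image_tokens, word_list)
assert recon.shape == image_feats.shape
```

In an actual system the codebook and encoders would be learned jointly (stage one of the claimed method) before the backbone is trained on task data (stage two); here the codebook is random purely to show the shared-vocabulary mechanics.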

Inventor(s): Shuohuan Wang, Junyuan Shang, Yekun Chai, Yinqi Yang, Zhenyu Zhang, Yu Sun, Hua Wu, Haifeng Wang

CPC Classification: G06N3/096 (Transfer learning)


