20250173561. Tuning Large Language Models for Next Sentence Prediction (QUALCOMM Incorporated)

From WikiPatents

TUNING LARGE LANGUAGE MODELS FOR NEXT SENTENCE PREDICTION

Abstract: A processor-implemented method includes generating a group of single-head attention (SHA) operations based on a number of attention heads in a multi-head attention (MHA) mechanism, each SHA operation corresponding to a respective attention head of a group of attention heads associated with the MHA mechanism. The method also includes executing each of the group of SHA operations independently and in parallel across hardware blocks of a device associated with a neural network model. The method further includes generating an MHA output based on the parallel execution of the group of SHA operations.
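The decomposition described in the abstract can be illustrated with a minimal sketch. This is not Qualcomm's implementation; it assumes standard scaled dot-product attention and uses hypothetical NumPy helpers to show how an MHA computation splits into independent per-head SHA operations whose outputs are concatenated and projected to form the MHA output:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def single_head_attention(q, k, v):
    # Scaled dot-product attention for one head; q, k, v: (seq, d_head).
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ v

def mha_as_sha_group(x, wq, wk, wv, wo, num_heads):
    # Generate one SHA operation per attention head and run each
    # independently; on a real device each call could be dispatched
    # to a separate hardware block in parallel.
    seq, d_model = x.shape
    d_head = d_model // num_heads
    q, k, v = x @ wq, x @ wk, x @ wv
    heads = []
    for h in range(num_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        heads.append(single_head_attention(q[:, s], k[:, s], v[:, s]))
    # Concatenate per-head outputs and apply the output projection
    # to produce the MHA output.
    return np.concatenate(heads, axis=-1) @ wo

rng = np.random.default_rng(0)
seq, d_model, num_heads = 4, 8, 2
x = rng.standard_normal((seq, d_model))
wq, wk, wv, wo = (rng.standard_normal((d_model, d_model)) for _ in range(4))
out = mha_as_sha_group(x, wq, wk, wv, wo, num_heads)
print(out.shape)  # (4, 8)
```

Because each per-head SHA operation reads only its own slice of the projected queries, keys, and values, the loop body carries no cross-head dependencies, which is what makes the per-head dispatch across hardware blocks possible.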

Inventor(s): Carl Alexander BLACKLOCK, Sahil GUPTA, Veluppillai ARULESAN, Jeffrey Baginsky GEHLHAAR, Kanghwan JANG, Pranav SHRESTHA, Hariharan SUKUMAR, Jian WANG

CPC Classification: G06N3/08 (Learning methods)


