Google LLC (20240256965). Instruction Fine-Tuning Machine-Learned Models Using Intermediate Reasoning Steps
Organization Name
Google LLC
Inventor(s)
Hyung Won Chung of Mountain View CA US
Barret Zoph of San Francisco CA US
Dengyong Zhou of Redmond WA US
Le Hou of South Setauket NY US
Jason Weng Wei of Mountain View CA US
Siddhartha Brahma of San Jose CA US
This abstract first appeared for US patent application 20240256965 titled 'Instruction Fine-Tuning Machine-Learned Models Using Intermediate Reasoning Steps'.
Original Abstract Submitted
An example method for training a machine-learned sequence processing model includes obtaining a plurality of training examples for training the machine-learned sequence processing model. For each respective training example of the plurality of training examples, the example method includes: obtaining a respective query associated with the respective training example; inputting the respective query to the machine-learned sequence processing model; obtaining, from the machine-learned sequence processing model, a response to the respective query and a trace of intermediate states from the respective query to the response; evaluating the response using a ground truth response associated with the respective training example; evaluating the trace using a ground truth trace associated with the respective training example; and updating one or more parameters of the machine-learned sequence processing model based on the evaluation of the response and based on the evaluation of the trace.
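The per-example loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a model architecture, a loss function, or how the two evaluations are combined, so everything below is an assumption made for illustration: `ToyModel` is a hypothetical stand-in that stores one row of logits per output position for each query, tokens are plain integer IDs, both the trace and the response are scored with cross-entropy against their ground truth, and the hypothetical `trace_weight` and `response_weight` parameters control how each evaluation contributes to the parameter update.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyModel:
    """Hypothetical stand-in for a sequence model (not from the patent).

    Holds one row of logits per output position for each query. The first
    `trace_len` positions are the trace of intermediate states; the
    remaining `response_len` positions are the response.
    """

    def __init__(self, vocab_size, trace_len, response_len):
        self.vocab_size = vocab_size
        self.trace_len = trace_len
        self.response_len = response_len
        self.logits = {}  # query -> list of per-position logit rows

    def rows(self, query):
        n = self.trace_len + self.response_len
        if query not in self.logits:
            self.logits[query] = [[0.0] * self.vocab_size for _ in range(n)]
        return self.logits[query]

    def generate(self, query):
        """Greedy decode; returns (trace_tokens, response_tokens)."""
        tokens = [max(range(self.vocab_size), key=row.__getitem__)
                  for row in self.rows(query)]
        return tokens[:self.trace_len], tokens[self.trace_len:]


def train_step(model, query, gt_trace, gt_response,
               lr=0.5, trace_weight=1.0, response_weight=1.0):
    """One update for one training example.

    Scores the trace and the response separately against their ground
    truth (assumed here: cross-entropy), then updates the parameters using
    both evaluations. Returns the pre-update (trace_loss, response_loss).
    """
    rows = model.rows(query)
    targets = list(gt_trace) + list(gt_response)
    trace_loss, response_loss = 0.0, 0.0
    for pos, target in enumerate(targets):
        probs = softmax(rows[pos])
        nll = -math.log(probs[target])
        if pos < len(gt_trace):
            trace_loss += nll
            weight = trace_weight
        else:
            response_loss += nll
            weight = response_weight
        # Gradient of cross-entropy w.r.t. logits is (probs - one_hot).
        for v in range(model.vocab_size):
            grad = weight * (probs[v] - (1.0 if v == target else 0.0))
            rows[pos][v] -= lr * grad
    return trace_loss, response_loss


# Usage: one training example with a 3-token trace and a 1-token response.
model = ToyModel(vocab_size=5, trace_len=3, response_len=1)
for _ in range(20):
    losses = train_step(model, "q1", gt_trace=[1, 2, 3], gt_response=[4])
print(losses, model.generate("q1"))
```

Keeping the two losses separate, rather than scoring one concatenated target, is a design choice of this sketch: it mirrors the abstract's two distinct evaluation steps and makes it possible to weight the trace differently from the response.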
- Google LLC
- Hyung Won Chung of Mountain View CA US
- Barret Zoph of San Francisco CA US
- Dengyong Zhou of Redmond WA US
- Liam Fedus of Palo Alto CA US
- Shayne Longpre of Surrey CA
- Le Hou of South Setauket NY US
- Yi Tay of Singapore SG
- Jason Weng Wei of Mountain View CA US
- Siddhartha Brahma of San Jose CA US
- Quoc V. Le of Sunnyvale CA US
- G06N20/00
- CPC G06N20/00