18385408. Reducing Latency by Processing Parts of a Language Model Query in Parallel (Microsoft Technology Licensing, LLC)
Reducing Latency by Processing Parts of a Language Model Query in Parallel
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Sayan Dev Pathak of Kirkland WA US
Osama Abuelsorour of Menlo Park CA US
Christopher Hakan Basoglu of Everett WA US
Harini Kesavamoorthy of Bellevue WA US
Girish Milind Mahajan of Redmond WA US
Salman Mohammad Quazi of Mountain View CA US
Valeriy Viktorovich Kirshin of Kirkland WA US
This abstract first appeared for US patent application 18385408, titled 'Reducing Latency by Processing Parts of a Language Model Query in Parallel'.
Original Abstract Submitted
A technique partitions a user's original query into plural smaller component queries, each of which has a common part and an instance-specific part. The technique distributes the component queries to plural processor instances of a processor. The plural processor instances transform the respective component queries into query-component responses by acting in parallel, independent of each other. The technique generates a final response based on the query-component responses, e.g., by assembling the component-query responses into the final response. The technique reduces latency because the processor instances work on parts of the user's original query at the same time, rather than as a single stream of consecutive tokens. The plural processor instances have access to a shared cache memory, and utilize relevant data that has been computed in response to previous queries.
- G06F16/332
- G06F16/33
- CPC G06F16/3329