18385408. Reducing Latency by Processing Parts of a Language Model Query in Parallel (Microsoft Technology Licensing, LLC)

From WikiPatents
Revision as of 10:23, 2 May 2025 by Wikipatents (Creating a new page)


Reducing Latency by Processing Parts of a Language Model Query in Parallel

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Sayan Dev Pathak of Kirkland WA US

Osama Abuelsorour of Menlo Park CA US

Christopher Hakan Basoglu of Everett WA US

Harini Kesavamoorthy of Bellevue WA US

Girish Milind Mahajan of Redmond WA US

Salman Mohammad Quazi of Mountain View CA US

Valeriy Viktorovich Kirshin of Kirkland WA US

This abstract first appeared for US patent application 18385408, titled 'Reducing Latency by Processing Parts of a Language Model Query in Parallel'.

Original Abstract Submitted

A technique partitions a user's original query into plural smaller component queries, each of which has a common part and an instance-specific part. The technique distributes the component queries to plural processor instances of a processor. The plural processor instances transform the respective component queries into query-component responses by acting in parallel, independent of each other. The technique generates a final response based on the query-component responses, e.g., by assembling the component-query responses into the final response. The technique reduces latency because the processor instances work on parts of the user's original query at the same time, rather than as a single stream of consecutive tokens. The plural processor instances have access to a shared cache memory, and utilize relevant data that has been computed in response to previous queries.
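
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application. Every name here (`partition`, `SharedCache`, `encode_common`, `process_component`, `answer`) is hypothetical, the "common part: item; item" query format is an assumption made only for the example, and a string transformation stands in for the language model. Threads are used purely to show the dispatch pattern; a real system would run separate model instances.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


def partition(query):
    """Split 'common part: item; item; ...' into (common, [specific, ...]).

    The input format is an assumption for this sketch only.
    """
    common, _, rest = query.partition(":")
    specifics = [p.strip() for p in rest.split(";") if p.strip()]
    return common.strip(), specifics


class SharedCache:
    """Hypothetical stand-in for the shared cache memory."""

    def __init__(self):
        self._data = {}
        self._lock = Lock()
        self.computations = 0  # how many times the common part was computed

    def get_or_compute(self, key, compute):
        # The lock ensures the common part is computed once, even when
        # several workers ask for it at the same moment.
        with self._lock:
            if key not in self._data:
                self._data[key] = compute(key)
                self.computations += 1
            return self._data[key]


def encode_common(common):
    """Placeholder for the expensive work done on the common part."""
    return common.upper()


def process_component(common, specific, cache):
    """Turn one component query into one query-component response."""
    state = cache.get_or_compute(common, encode_common)
    return f"[{state}] {specific}"


def answer(query, cache, max_workers=4):
    """Partition the query, process the components concurrently, assemble."""
    common, specifics = partition(query)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, so assembly is a join.
        responses = list(
            pool.map(lambda s: process_component(common, s, cache), specifics)
        )
    return "\n".join(responses)


cache = SharedCache()
final = answer("Summarize each section: Intro; Methods; Results", cache)
print(final)
```

In this sketch the common part is computed once and reused by every component query, and a second call to `answer` with the same cache reuses the stored entry, which mirrors the abstract's point about data computed in response to previous queries.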
