18337316. DYNAMIC SELECTION FROM AMONG MULTIPLE CANDIDATE GENERATIVE MODELS WITH DIFFERING COMPUTATIONAL EFFICIENCIES simplified abstract (Google LLC)

From WikiPatents
Revision as of 11:15, 19 September 2024 by Wikipatents (talk | contribs) (Creating a new page)

DYNAMIC SELECTION FROM AMONG MULTIPLE CANDIDATE GENERATIVE MODELS WITH DIFFERING COMPUTATIONAL EFFICIENCIES

Organization Name

Google LLC

Inventor(s)

Seungyeon Kim of New York NY (US)

Ankit Singh Rawat of Jersey City NJ (US)

Wittawat Jitkrittum of Jersey City NJ (US)

Hari Narasimhan of Mountain View CA (US)

Sashank Reddi of Fort Lee NJ (US)

Neha Gupta of New York NY (US)

Srinadh Bhojanapalli of Maplewood NJ (US)

Aditya Menon of New York NY (US)

Manzil Zaheer of Mountain View CA (US)

Tal Schuster of New York NY (US)

Sanjiv Kumar of Jericho NY (US)

Toby Boyd of Columbus OH (US)

Zhifeng Chen of Sunnyvale CA (US)

Emanuel Taropa of Los Altos CA (US)

Vikram Kasivajhula of San Francisco CA (US)

Trevor Strohman of Sunnyvale CA (US)

Martin Baeuml of Zurich (CH)

Leif Schelin of Zurich (CH)

Yanping Huang of Mountain View CA (US)

DYNAMIC SELECTION FROM AMONG MULTIPLE CANDIDATE GENERATIVE MODELS WITH DIFFERING COMPUTATIONAL EFFICIENCIES - A simplified explanation of the abstract

This abstract first appeared for US patent application 18337316 titled 'DYNAMIC SELECTION FROM AMONG MULTIPLE CANDIDATE GENERATIVE MODELS WITH DIFFERING COMPUTATIONAL EFFICIENCIES'.

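Simplified Explanation: The patent application describes selecting, in response to each request, one generative model from among multiple candidates (for example, multiple large language models) that differ in computational efficiency. A more efficient model is used for various requests to reduce latency and conserve computational resources, while a less efficient model is still used for certain requests to avoid inaccurate or under-specified responses.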

Key Features and Innovation:

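  • Selecting, for each request, a particular generative model from among multiple candidates with differing computational efficiencies, using a more efficient model in lieu of a less efficient one for various requests.
  • Utilizing less efficient generative models selectively, for certain requests, to avoid inaccurate or under-specified responses.
  • Mitigating the computational and network inefficiencies that result when a user issues a follow-up request to cure an inaccurate or under-specified response.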
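
The selection behaviour in the features above can be illustrated with a minimal Python sketch. The abstract does not disclose how the choice is actually made, so everything here is an assumption for illustration: the `CandidateModel` type, the `relative_cost` field, the `sufficiency` scoring callback, the `threshold` value, and the word-count heuristic are all hypothetical and are not taken from the patent application.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CandidateModel:
    """Hypothetical stand-in for one candidate generative model."""
    name: str
    relative_cost: float              # assumed compute cost per request
    generate: Callable[[str], str]    # assumed text-in, text-out interface


def select_model(request: str,
                 candidates: List[CandidateModel],
                 sufficiency: Callable[[str, CandidateModel], float],
                 threshold: float = 0.8) -> CandidateModel:
    """Return the cheapest candidate whose estimated sufficiency meets the threshold.

    Falls back to the most expensive candidate when none qualifies.
    """
    ordered = sorted(candidates, key=lambda m: m.relative_cost)
    for model in ordered:
        if sufficiency(request, model) >= threshold:
            return model
    return ordered[-1]


# Toy models; real systems would call actual LLM endpoints here.
small = CandidateModel("small", 1.0, lambda r: f"[small] {r}")
large = CandidateModel("large", 10.0, lambda r: f"[large] {r}")


def toy_sufficiency(request: str, model: CandidateModel) -> float:
    # Assumed heuristic: the large model is always adequate, and the small
    # model is adequate only for short requests (8 words or fewer).
    if model.name == "large":
        return 1.0
    return 0.9 if len(request.split()) <= 8 else 0.3


if __name__ == "__main__":
    for text in ("What is 2 + 2?",
                 "Compare the tax implications of three retirement account "
                 "types for a self-employed person"):
        chosen = select_model(text, [large, small], toy_sufficiency)
        print(chosen.name, "->", chosen.generate(text))
```

Trying candidates in order of cost and stopping at the first adequate one is only one possible design; it keeps the common case cheap while still allowing escalation to a less efficient model for harder requests.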

Potential Applications: This technology could be applied in various fields such as natural language processing, chatbots, virtual assistants, and automated customer service systems.

Problems Solved: The technology addresses response latency, excessive use of computational resources, inaccurate responses, and under-specified generated content.

Benefits:

  • Reduced latency in generating responses.
  • Conservation of computational resources.
  • Improved accuracy and specificity of generated responses.
  • Mitigation of computational and network inefficiencies.

Commercial Applications: The technology could be utilized in AI-powered customer service platforms, virtual assistants for businesses, and automated content generation systems for various industries.

Prior Art: Prior research in the field of natural language processing, generative models, and AI-driven response systems could provide insights into similar technologies.

Frequently Updated Research: Stay updated on advancements in generative models, natural language processing, and AI technologies to enhance the efficiency and effectiveness of response generation systems.

Questions about the Technology:

  1. How does the technology determine which generative model to use for each request?
  2. What are the potential drawbacks of selectively utilizing less efficient generative models?


Original Abstract Submitted

Implementations disclose selecting, in response to receiving a request and from among multiple candidate generative models (e.g., multiple candidate large language models (LLMs)) with differing computational efficiencies, a particular generative model to utilize in generating a response to the request. Those implementations reduce latency and/or conserve computational resource(s) through selection, for various requests, of a more computationally efficient generative model for utilization in lieu of a less computationally efficient generative model. Further, those implementations seek to achieve such benefits, through utilization of more computationally efficient generative models, while also still selectively utilizing less computationally efficient generative models for certain requests to mitigate occurrences of a generated response being inaccurate and/or under-specified. This, in turn, can mitigate occurrences of computational and/or network inefficiencies that result from a user issuing a follow-up request to cure the inaccuracies and/or under-specification of a generated response.
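
The abstract's last point, that follow-up requests carry their own cost, can be made concrete with a small worked calculation in Python. This is a sketch under stated assumptions, not a formula from the application: the cost model (first response plus probability-weighted follow-up) and every number below are hypothetical and chosen only to show the trade-off.

```python
def expected_cost(first_cost: float,
                  p_follow_up: float,
                  follow_up_cost: float) -> float:
    """Expected compute per request when a follow-up occurs with probability p_follow_up.

    Assumed model: total = first response + P(follow-up) * cost of the follow-up.
    """
    if not 0.0 <= p_follow_up <= 1.0:
        raise ValueError("p_follow_up must be in [0, 1]")
    return first_cost + p_follow_up * follow_up_cost


# Illustrative, made-up costs in arbitrary compute units.
SMALL_COST = 1.0
LARGE_COST = 10.0

# Hard request answered by the small model: assume a follow-up is likely
# and is then handled by the large model.
small_on_hard = expected_cost(SMALL_COST, p_follow_up=0.95, follow_up_cost=LARGE_COST)

# Same hard request sent straight to the large model: assume follow-ups are rare.
large_on_hard = expected_cost(LARGE_COST, p_follow_up=0.02, follow_up_cost=LARGE_COST)

# Easy request answered by the small model: assume follow-ups are rare.
small_on_easy = expected_cost(SMALL_COST, p_follow_up=0.02, follow_up_cost=SMALL_COST)

if __name__ == "__main__":
    print(f"hard request, small model first: {small_on_hard:.2f}")
    print(f"hard request, large model first: {large_on_hard:.2f}")
    print(f"easy request, small model:       {small_on_easy:.2f}")
```

Under these assumed numbers, sending a hard request to the small model first costs slightly more in expectation than using the large model directly, while easy requests remain far cheaper on the small model. That is the kind of trade-off the abstract describes when it pairs efficient models for various requests with selective use of less efficient ones.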