18083300. USING A LOGICAL TREE STRUCTURE TO IDENTIFY A FOUNDATION MODEL INFERENCING SERVER FOR FULFILLING AN INFERENCING REQUEST simplified abstract (International Business Machines Corporation)

From WikiPatents


Organization Name

International Business Machines Corporation

Inventor(s)

Mudhakar Srivatsa of White Plains NY (US)

Satishkumar Sadagopan of Leawood KS (US)

Utpal Mangla of Toronto (CA)

Dinesh C. Verma of New Castle NY (US)

Gerald Coon of Durham NC (US)

Mathews Thomas of Flower Mound TX (US)

USING A LOGICAL TREE STRUCTURE TO IDENTIFY A FOUNDATION MODEL INFERENCING SERVER FOR FULFILLING AN INFERENCING REQUEST - A simplified explanation of the abstract

This abstract first appeared for US patent application 18083300, titled 'USING A LOGICAL TREE STRUCTURE TO IDENTIFY A FOUNDATION MODEL INFERENCING SERVER FOR FULFILLING AN INFERENCING REQUEST'.

The abstract of this patent application describes a computer-implemented method that involves organizing downstream task models of a foundation model into a logical tree structure. This structure is used to identify an inferencing server to fulfill inferencing requests efficiently.

  • The method determines downstream task models and arranges them into a logical tree structure.
  • Each node in the tree represents a sequence of layers of a downstream task model.
  • When a cache miss occurs during inferencing, the logical tree structure helps identify an inferencing server that meets specific prerequisites.
  • The identified server is then tasked with fulfilling the inferencing request.
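The steps above can be sketched in code. This is a hypothetical illustration, not the patented implementation: it assumes downstream task models share foundation-model layer prefixes, that each tree node tracks which servers have that layer sequence cached, and that the "prerequisite" is an arbitrary predicate on servers. All names (`LogicalTree`, `TreeNode`, `find_server`) are invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """A node representing a sequence of layers of a downstream task model."""
    layers: tuple                                  # layer-name prefix, e.g. ("emb", "enc")
    servers: set = field(default_factory=set)      # servers caching this prefix (assumption)
    children: dict = field(default_factory=dict)   # next layer name -> child node


class LogicalTree:
    """Arranges downstream task models of a foundation model into a tree."""

    def __init__(self):
        self.root = TreeNode(layers=())

    def register(self, model_layers, server):
        """Insert a downstream task model's layer sequence, one node per layer."""
        node = self.root
        for layer in model_layers:
            node = node.children.setdefault(
                layer, TreeNode(layers=node.layers + (layer,)))
            node.servers.add(server)

    def find_server(self, model_layers, prerequisite=lambda s: True):
        """On a cache miss, walk the tree along the target model's layers and
        return a server that holds the longest cached prefix and satisfies
        the prerequisite predicate (a stand-in for the patent's
        'first predetermined prerequisite')."""
        node, best = self.root, None
        for layer in model_layers:
            node = node.children.get(layer)
            if node is None:
                break  # no deeper cached prefix exists
            candidates = [s for s in node.servers if prerequisite(s)]
            if candidates:
                best = candidates[0]
        return best
```

In this sketch, two task models that share embedding and encoder layers would share those tree nodes, so a request for either model can be routed to any server already caching the common prefix; only when no node on the path has a qualifying server does the lookup return nothing.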

Potential Applications

  • This technology can be applied in machine learning systems to optimize inferencing processes.
  • It can enhance the efficiency of AI models by streamlining inferencing tasks.

Problems Solved

  • Addresses the issue of cache misses during inferencing in machine learning systems.
  • Improves the overall performance and speed of inferencing processes.

Benefits

  • Increases the speed and efficiency of inferencing tasks.
  • Enhances the performance of AI models by reducing cache misses.

Commercial Applications

Optimizing inferencing processes in industries such as healthcare, finance, and e-commerce can lead to improved decision-making and operational efficiency.

Questions about the Technology

  1. How does this technology improve the efficiency of inferencing tasks in machine learning systems?
  2. What prerequisites must the identified inferencing server satisfy to fulfill inferencing requests efficiently?

Frequently Updated Research

Stay updated on advancements in machine learning algorithms and inferencing optimization techniques to further enhance the performance of this technology.


Original Abstract Submitted

A computer-implemented method, according to one embodiment, includes determining a plurality of downstream task models of a foundation model, and arranging the downstream task models into a logical tree structure. Each node of the logical tree structure represents a sequence of layers of an associated one of the downstream task models. In response to a determination that a request for inferencing on a target model has resulted in a cache miss occurring, the logical tree structure is used to identify an inferencing server that satisfies at least a first predetermined prerequisite for fulfilling the inferencing request. The method further includes causing the identified inferencing server to fulfill the inferencing request. A computer program product, according to one embodiment, includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and/or executable by a computer to cause the computer to perform the foregoing method.