18470772. CAPACITY-BASED LOAD BALANCING IN SHARED RESOURCE POOL (MICROSOFT TECHNOLOGY LICENSING, LLC)
CAPACITY-BASED LOAD BALANCING IN SHARED RESOURCE POOL
Organization Name
MICROSOFT TECHNOLOGY LICENSING, LLC
Inventor(s)
Hemant Kumar of Bellevue WA US
Rakesh Kelkar of Bellevue WA US
Karthik Raman of Sammamish WA US
Sanjay Ramanujan of Sammamish WA US
Kevin Joseph Riehm of Woodbury WA US
Theodore Dragov Todorov of Seattle WA US
This abstract first appeared for US patent application 18470772, titled 'CAPACITY-BASED LOAD BALANCING IN SHARED RESOURCE POOL'.
Original Abstract Submitted
A system provides capacity-based load balancing across model endpoints of a cloud-based artificial intelligence (AI) model. The system includes a consumption determination engine executable to determine a net resource consumption for processing tasks in a workload generated by a client application for input to the trained machine learning model. The system also includes a load balancer that determines a distribution of available resource capacity in a shared resource pool comprising compute resources at each of the multiple model endpoints. The load balancer allocates parallelizable tasks of the workload among the compute resources at the multiple model endpoints based on the net resource consumption of the tasks and on the distribution of available resource capacity in the shared resource pool.
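The allocation step described in the abstract can be illustrated with a minimal sketch. The abstract does not disclose a specific algorithm, so everything below is an assumption made for illustration: the `Endpoint` class, the `estimate_consumption` and `allocate` functions, the token-based cost formula, and the greedy "largest task to the endpoint with the most free capacity" policy are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical model endpoint with a share of the resource pool."""
    name: str
    available: float  # remaining capacity, in assumed units (e.g. tokens)
    assigned: list = field(default_factory=list)


def estimate_consumption(task: dict) -> float:
    """Assumed stand-in for the consumption determination engine.

    Treats net resource consumption as prompt tokens plus the maximum
    number of output tokens; the real measure is not given in the abstract.
    """
    return task["prompt_tokens"] + task["max_output_tokens"]


def allocate(tasks: list, endpoints: list) -> list:
    """Greedily place tasks on endpoints according to available capacity.

    Tasks are considered from largest to smallest estimated consumption.
    Each task goes to the endpoint that currently has the most available
    capacity. Returns the ids of tasks that did not fit anywhere.
    """
    unplaced = []
    for task in sorted(tasks, key=estimate_consumption, reverse=True):
        cost = estimate_consumption(task)
        target = max(endpoints, key=lambda e: e.available)
        if target.available < cost:
            # Even the least-loaded endpoint cannot take this task.
            unplaced.append(task["id"])
            continue
        target.available -= cost
        target.assigned.append(task["id"])
    return unplaced
```

Under these assumptions, a caller would build a list of `Endpoint` objects from the current capacity distribution, pass the parallelizable tasks of a workload to `allocate`, and then retry or queue whatever ids come back as unplaced. A greedy policy is used here only because it is short and easy to follow; it is not guaranteed to find an optimal packing.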