Patent Application 17223859 - DYNAMICALLY SCALABLE MACHINE LEARNING MODEL - Rejection

Title: DYNAMICALLY SCALABLE MACHINE LEARNING MODEL GENERATION AND RETRAINING THROUGH CONTAINERIZATION

Application Information

  • Invention Title: DYNAMICALLY SCALABLE MACHINE LEARNING MODEL GENERATION AND RETRAINING THROUGH CONTAINERIZATION
  • Application Number: 17223859
  • Submission Date: 2025-05-16
  • Effective Filing Date: 2021-04-06
  • Filing Date: 2021-04-06
  • National Class: 706
  • National Sub-Class: 012000
  • Examiner Employee Number: 95014
  • Art Unit: 2124
  • Tech Center: 2100

Rejection Summary

  • 102 Rejections: 0
  • 103 Rejections: 4

Cited Patents

The following patents were cited in the rejection:

  • Faulhaber et al., US 2019/0155633 A1
  • Baghani et al., US 11,397,794 B1
  • Croteau et al., US 2021/0117217 A1
  • Wetherbee et al., US 2022/0284351 A1

Office Action Text


    Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Remarks
	This Office Action is in response to applicant’s amendment filed on April 9, 2025, under which claims 1-20 are pending and under consideration. 

Response to Arguments
	Applicant’s amendments have overcome the previous § 112(b) rejection, which is therefore withdrawn.
	Applicant’s arguments directed to the § 103 rejection have been fully considered but are not deemed to be persuasive. The grounds of rejection have been updated to account for the amended claim language, but the claims remain rejected over the same references as those previously applied.
	Applicant argues:
Specifically, closer examination of the Wang reference reveals that the features in Table II do not include the type of the machine learning algorithm. Rather, the type of the machine learning algorithm in Wang is fixed. Specifically, the type is a neural network, and more specifically a convolutional neural network. The layers described in Table II are not indicative of different types of machine learning algorithms, but instead are different layers within the single, fixed type of machine learning algorithm. More specifically the features themselves are listed in the left column of Table II and do not include any indication of a type of machine learning model. The right column, where the layer types are described, merely indicate which layers of the single, fixed type of machine learning algorithm that the corresponding feature is used for. For example, "Batch Size" is a feature used by all layers of the convolutional neural network, whereas "Kernel Size" is only used by the Convolutional and 2D layers of the convolutional neural network. 
Thus, there is no teaching of using the type of the machine learning model as a feature that is input to another machine learning model. The other cited references fail to remedy this defect.   
(Applicant’s response, pages 10-11).
	These arguments are not persuasive because they rely on limitations not present in the current claim language.
	First, the instant claim language does not require the capability of accounting for different types of machine learning algorithms. Instead, the claim only requires a single instance of “an indication” where this indication is of whether the type “is a neural network or a non-neural network.” Note that “a neural network or a non-neural network” is an alternate expression reciting two possibilities, but the “indication” itself can only be one of those two possibilities, and the claim only requires a single indication to be present. Therefore, the claim limitation of “an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network” is met if there is an indication that the type is a neural network.
	 Applicant’s observation that Wang only uses a “single, fixed type of machine learning algorithm” (as characterized in the quoted remarks above) does not reflect any distinction that is present in the current claim language. The instant claim does not require a system that has the ability to handle both the “neural network” type and a “non-neural network” type. For example, the current claim language does not require the system to be capable of assigning different values to a feature (variable) that differs depending on whether the given machine learning algorithm is a neural network or a non-neural network, nor does the claim language require a feature that is, in the context of the system’s operations, capable of having different values respectively indicating a neural network and a non-neural network. Instead, the current claim language only requires the presence of a single indication of a type that happens to be either a neural network or a non-neural network. Furthermore, the context of the alternate expression “a neural network or a non-neural network” is an “indication,” which can be a fixed piece of information. Thus, “whether…or…” is also not interpreted as reciting different capabilities under different contingencies. Therefore, the fact that Wang only uses a “single, fixed type of machine learning algorithm” does not reflect a distinction of the claims over Wang, since the claim language of “an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network” does not require the type to be variable.
Therefore, in response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., the ability to account for both neural network and non-neural network types) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
	Regarding applicant’s argument that Wang has “no teaching of using the type of the machine learning model as a feature that is input to another machine learning model” (as quoted above), this argument is not persuasive because the claim does not recite the limitation of using the type of machine learning model as an input “feature.” Instead, the claim recites an “indication” of the machine learning type included in a set of features. Here, an “indication” by itself is not a feature, and broadly reads on concepts such as an indication in the form of information that can be understood as having a certain meaning. In Table II of Wang, for example, features such as “Kernel Size,” “Channel In,” and “Channel Out” are an indication that the type of machine learning model is a neural network, because kernels and channels are aspects of a convolutional neural network.
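	For illustration only (a minimal sketch, not part of the claim mapping), the interpretation above can be expressed as reading a type indication off a feature set; the function and feature names below are hypothetical:

    # CNN-specific features per Wang's Table II; their presence in a feature
    # set is read as an indication that the model type is a neural network.
    NN_MARKERS = {"kernel_size", "channel_in", "channel_out"}

    def indicated_type(feature_names):
        """Hypothetical reading of the 'indication' discussed above:
        CNN-specific features indicate a neural network; their absence
        is read here as indicating a non-neural network."""
        if NN_MARKERS & set(feature_names):
            return "neural network"
        return "non-neural network"

    indicated_type({"batch_size", "kernel_size"})  # -> "neural network"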
	The Examiner also submits that different CNN layer configurations (e.g., configurations with respect to the Kernel, channel, and other features listed in Table II) can be regarded as different respective types of CNNs, even though each type is a neural network. The claim language does not specifically define what “type” is other than the limitation that the type must belong to the category of a neural network or a non-neural network. 
	Therefore, the instant claim language lacks the precision to distinguish over Wang, and the claims remain rejected over the previously applied references. To advance prosecution, applicant could amend the claim to more precisely articulate the intended distinctions over the cited art. Applicant is also invited to review the additional references cited in the conclusion section of this action which are pertinent to techniques that take into consideration model architecture for analysis of machine learning tasks. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

1.	Claims 1-3, 5, 7, 9-12, 14-17 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Faulhaber et al. (US 2019/0155633 A1) (“Faulhaber”) in view of Zhang et al., “Finding the Big Data Sweet Spot: Towards Automatically Recommending Configurations for Hadoop Clusters on Docker Containers,” 2015 IEEE International Conference on Cloud Engineering (“Zhang”) and Wang et al., “Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks,” arXiv:2012.00211v1 [cs.LG] 1 Dec 2020 (“Wang”).
As to claim 1, Faulhaber teaches a system comprising:
at least one hardware processor; [[0187]: “Methods and tasks described herein may be performed and fully automated by a computer system. The computer system may, in some cases, include multiple distinct computers or computing devices…Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other non-transitory computer-readable storage medium or device (e.g., solid state storage devices, disk drives, etc.).”] and 
a computer-readable medium storing instructions that, when executed by the at least one hardware processor, cause the at least one hardware processor to perform operations comprising: [[0187]: “Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other non-transitory computer-readable storage medium or device (e.g., solid state storage devices, disk drives, etc.).”] […]
receiving, at an application server in a cloud environment, a request to generate a first inference model for a first entity of a plurality of entities corresponding to the cloud environment; [[0029]: “The user devices 102 can interact with the model training system 120 via frontend 129 of the model training system 120. For example, a user device 102 can provide a training request to the frontend 129…” The training request may include the model’s algorithm and code (see [0031]: “the user device 102 may provide, in the training request, an algorithm written in any programming language”). The request may include the hyperparameters of the model, and also is a request to perform learning, as described in [0040]: “The virtual machine instance 122 executes some or all of the executable instructions according to the hyperparameter values included in the training request. As an illustrative example, the virtual machine instance 122 trains a machine learning model by identifying values…” Therefore, the request for training to be performed constitutes “a request to generate a first inference model” since training a model constitutes generating a trained model. The model is an inference model because it generates predictions (see [0054]: “Execution of the code 156 results in the generation of outputs (e.g., predicted results)”). The request is “for a first entity of a plurality of entities” because the service is for a plurality of users and the user is associated with the training request, as described in [0020]: “users can create or utilize relatively simple containers…, where the containers include code for how a machine learning model is to be trained and/or executed”; [0027]: “The operating environment 100 includes end user devices 102”; [0070]: “the frontend 129 may determine whether the user associated with the training request is authorized to initiate the training process.” The service is implemented at an “application server in a cloud environment,” as described in [0069]: “For example, the model training system 120 and/or the model hosting system 140 or various constituents thereof could implement various Web services components, hosted or “cloud” computing environments.”]
in response to the receiving […] [[0070]: “The frontend 129 processes all training requests received from user devices 102 and provisions virtual machine instances 122.”] 
generating a container […] [A container is created [0038]: “the model training system 120 uses one or more container images included in a training request (or a container image retrieved from the container data store 170 in response to a received training request) to create and initialize a ML training container 130 in a virtual machine instance 122.” [0034]: “Generally, the ML training containers 130 are logical units created within a virtual machine instance using the resources available on that instance, and can be utilized to isolate execution of a task from other processes (e.g., task executions) occurring in the instance.”] and
causing a first version of the first inference model to be generated and trained using the container, the second machine learning algorithm, and a second set of training data. [[0040]: “Thus, the virtual machine instance 122 can execute the executable instructions to initiate a machine learning model training process, where the training process is run using the hyperparameter values included in the training request…Execution of the executable instructions can include the virtual machine instance 122 applying the training data retrieved by the model training system 120 as input parameters to some or all of the instructions being executed.” As noted above, the algorithm is defined based on the request, and corresponds to a “second machine learning algorithm.” See [0031]: “the user device 102 may provide, in the training request, an algorithm written in any programming language.” In general, training data (corresponding to “a second set of training data”) is stored in training data store 160, as described in [0072]: “The training data store 160 stores training data and/or evaluation data. The training data can be data used to train machine learning models and evaluation data can be data used to evaluate the performance of machine learning models.” A particular subset is retrieved for training. See also [0039]: “Prior to beginning the training process, in some embodiments, the model training system 120 retrieves training data from the location indicated in the training request.”]
Faulhaber does not explicitly teach 
(1)	the obtaining of a container assignment machine learning model by:
obtaining a dynamic weighted container assignment machine learned model trained via training using a first machine learning algorithm, the training comprising obtaining a first set of training data and passing the first set of training data through the machine learning algorithm to learn a coefficient for each of a plurality of features of the training data, the dynamic weighted container assignment machine learned model being trained to output a container configuration for a combination of an entity and an inference machine learned model, the container configuration including a category indicating a count of each of a plurality of computing resources to be assigned to the combination of the entity and the inference model.

(2) 	the limitation of: “inputting a set of features corresponding to the entity and to the first inference model to the dynamic weighted container assignment machine learned model to obtain a container configuration for the first inference model” and the related limitation of the container being generated “based on the obtained container configuration”; and
(3)	“the set of features comprising an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network, the container configuration obtained by the dynamic weighted container assignment machine learned model being based on the type of the second machine learning algorithm.”
Zhang teaches limitation (2) above and part of limitation (1) listed above. In general, Zhang pertains to recommending configurations for Hadoop Clusters on Docker containers (see title). 
In particular, Zhang teaches “obtaining a dynamic weighted container assignment machine learned model trained via training using a first machine learning algorithm, [§ I, paragraph 4: “We design a lightweight algorithm based on customized k-nearest neighbor to efficiently recommend Hadoop and container configurations prior to job execution.” Note that “k-nearest neighbor” is a machine learning algorithm as this term is used in the specification (see, e.g., paragraph 66 of this application’s specification). The instant model is trained using historical data as described in § II, paragraph 2: “Lets denote a set of past jobs {J1, …, JN} where each job Ja is associated with three vectors…A job feature vector…A job configuration vector…A job performance vector.” The model’s operations are described in § II, paragraph 3: “We identify the k-nearest neighbors for the new job to form a group Gx …that have k past jobs most similar to Jx in terms of their job feature vectors, with k being a tunable variable.” Note that in the context of a kNN classifier, the inclusion of the past data to which a new data point is compared means that the classifier is “learned” and “trained.” The container is “weighted” by the CPU shares, as listed in Table III (“CPU shares (relative weight)”). The container is “dynamic” because it is part of a cloud computing platform. See § I, paragraph 1: “cloud computing allows such analytics to harness compute, network and storage resources in an inexpensive, dynamic manner that were not possible before.” Furthermore, “dynamic” is already taught in the context of Faulhaber, wherein containers are dynamic in that containers are generated and removed (see Faulhaber, [0114] (“the original ML training container is replaced”); [0115] (“modify the original ML training container”)). Therefore, when the model of Zhang is applied to the context of Faulhaber, the containers being assigned are also dynamic in the manner used in Zhang.] […], the dynamic weighted container assignment machine learned model being trained to output a container configuration for a combination of an entity and an inference machine learned model, [§ I, paragraph 4: “We design a lightweight algorithm based on customized k-nearest neighbor to efficiently recommend Hadoop and container configurations prior to job execution.” Note that the limitation of “for a combination of an entity and an inference machine learned model” is already taught by the context of the primary reference Faulhaber. Therefore, when the method of Zhang is applied to Faulhaber, the container configuration would be “for an entity and an inference machine learned model”] the container configuration including a category indicating a count of each of a plurality of computing resources to be assigned to the combination of the entity and the inference model; [§ III, last paragraph: “Tables II and III combine to instantiate the configuration vector space… Table III summarizes key container parameters specific to Docker, of which “lxc-conf” and “storage-opt” deserve special mentions. …” As shown in Table III, the configuration parameters include counts of “CPU shares (relative weight)” and “container memory limit” which are a plurality of computing resources (CPU and memory). The limitation of “the combination of the entity and the inference model” is already taught by Faulhaber, and Zhang is applicable to this context, as described above.]
Zhang further teaches: “inputting a set of features corresponding to the entity and to the first inference model to the dynamic weighted container assignment machine learned model to obtain a container configuration for the first inference model” [§ II, paragraph 3: “Let Jx denote a new incoming job… Our goal is to determine the job configuration vector Cx for the new job such that Px is desirable. We achieve this by solving two sub problems: …We identify the k-nearest neighbors for the new job to form a group Gx …that have k past jobs most similar to Jx in terms of their job feature vectors, with k being a tunable variable… We rank the performance vectors associated with each job in Gx and return the configuration vectors corresponding to the top k’ performance vectors that meet a performance threshold PT.” Note that a “job feature vector” corresponds to “a set of features corresponding to the entity and to the first inference model,” noting that the entity and the first inference model are already taught by Faulhaber, and Zhang is applicable to this context, since Zhang is applicable to computational jobs in general, including those that use machine learning algorithms (see § IV, paragraph 2 for examples).] and the related limitation of the container being generated “based on the obtained container configuration.” [§ IV, paragraph 3: “We report the average execution time over 3 runs in Figure 2, comparing configurations recommended using our approach with the default configurations shipped with YARN.” That is, the recommended configurations are used to generate actual containers.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of Faulhaber with the teachings of Zhang by implementing the model and recommendation technique of Zhang to determine a configuration of a container used in Faulhaber, so as to arrive at the limitations of “obtaining a dynamic weighted container assignment machine learned model trained via training using a first machine learning algorithm,” “the dynamic weighted container assignment machine learned model being trained to output a container configuration for a combination of an entity and an inference machine learned model, the container configuration including a category indicating a count of each of a plurality of computing resources to be assigned to the combination of the entity and the inference model” and “inputting a set of features corresponding to the entity and to the first inference model to the dynamic weighted container assignment machine learned model to obtain a container configuration for the first inference model” such that the container is generated “based on the obtained container configuration.” The motivation is to apply a model that automatically determines a configuration for a container that enables a given job to be performed well, as suggested by Zhang (abstract: “We propose to alleviate this issue with an engine that recommends configurations for a newly submitted analytics job in an intelligent and timely manner. The engine…finds desirable configurations from similar past jobs that have performed well.”).
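For illustration only, a minimal Python sketch of the Zhang recommendation procedure summarized above (§ II); the function names, Euclidean distance, and scalar performance score are assumptions, since Zhang leaves the similarity measure and the performance vectors tunable:

    import numpy as np

    def recommend_configs(new_features, past_features, past_configs, past_perf,
                          k=5, k_prime=1, perf_threshold=0.0):
        """Sketch of Zhang's kNN recommendation: form the group G_x of the k
        past jobs whose feature vectors are closest to the new job's, rank
        them by performance, and return the configuration vectors of the top
        k' jobs meeting the performance threshold PT. All array arguments
        are numpy arrays."""
        dists = np.linalg.norm(past_features - new_features, axis=1)
        group = np.argsort(dists)[:k]                  # k nearest past jobs (G_x)
        ranked = group[np.argsort(-past_perf[group])]  # best-performing first
        good = [j for j in ranked if past_perf[j] >= perf_threshold]
        return [past_configs[j] for j in good[:k_prime]]

A returned configuration vector would then parameterize container creation (e.g., CPU shares and memory limit, per Zhang’s Table III).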
The combination of references thus far does not teach the limitations of “the training comprising obtaining a first set of training data and passing the first set of training data through the machine learning algorithm to learn a coefficient for each of a plurality of features of the training data” and “the set of features comprising an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network, the container configuration obtained by the dynamic weighted container assignment machine learned model being based on the type of the second machine learning algorithm.”
Wang teaches the above limitations. Wang generally pertains to machine learning applications, specifically prediction of the “inference time and training time of… neural networks” (abstract). 
In particular, Wang teaches “the training comprising obtaining a first set of training data” [§ IV, paragraph 1: “…acquire a reasonably representative training data set in practice in Section IV-B. It took us two weeks to obtain a training data set of 100,000 samples for each type of layer, and the prediction models described in the previous section can be built within a day with the data set.”] “and passing the first set of training data through the machine learning algorithm to learn a coefficient for each of a plurality of features of the training data.” [§ III.E (“Convolutional Regression Network”): “To predict the three phases time for the inference/training time intervals of each layer… we propose a convolutional neural network, named as PerfNetV2… For unseen device… we chose to use PerfNet architecture for the quicker inference results.” That is, Wang teaches training a convolutional neural regression model, which is either PerfNet or PerfNetV2. Since these are neural network models, it is understood by one of ordinary skill in the art that the “training” process learns weight coefficients of the layers shown in FIG. 4. Regarding the limitation of “plurality of features,” § IV.A teaches: “Table II shows the software and hardware features used for training our prediction models.” Note that because training affects the weights of the whole neural network, the weight coefficients are for each of the input features of the model.]
Wang further teaches “the set of features comprising an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network,” [§ IV.A: “Table II shows the software and hardware features used for training our prediction models.” As shown in Table II, features such as “Kernel Size,” “Channel In,” and “Channel Out” indicate that the type of machine learning model is a neural network, because kernels and channels are features of a convolutional neural network. The Examiner notes that the instant claim language only requires the use of a single instance of an “indication,” and that “a neural network or a non-neural network” is an alternate expression that is met by an indication of either one of the two alternatives. The claim does not require a system capability of determining both alternatives of “neural network” and “non-neural network,” since such a capability for determining both alternatives is not recited as a limitation of the instant claim. Instead, the claim only requires the presence of one indication of one of the alternatives. Note that different CNN layer configurations (e.g., configurations with respect to the kernel, channel, and other features listed in Table II) can be regarded as different respective types of CNNs, even though each type is a neural network. The claim language does not specifically define what “type” is other than the limitation that the type must belong to the category of a neural network or a non-neural network.] “the container configuration obtained by the dynamic weighted container assignment machine learned model being based on the type of the second machine learning algorithm” [The feature of “the container configuration” is already taught by the existing combination of references. The limitation of “based on…” is taught by Wang, § IV.A, paragraph 2: “With a total of 21 features, we are able to predict the inference or training performance for most CNN’s on desktop computers and servers with or without GPU’s.” That is, training performance for a hardware platform is computed based on the fact that the model is a CNN and, more specifically, based on the type of CNN (the architectural features of the CNN) as represented by the features of Table II.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of the references combined thus far with the teachings of Wang by implementing the dynamic weighted container assignment machine learned model to further comprise a neural network that predicts training time of a model based on model features, as taught in the technique of Wang, such that the training comprises “obtaining a first set of training data and passing the first set of training data through the machine learning algorithm to learn a coefficient for each of a plurality of features of the training data” and “the set of features comprising an indication of whether a type of a second machine learning algorithm is a neural network or a non-neural network, the container configuration obtained by the dynamic weighted container assignment machine learned model being based on the type of the second machine learning algorithm,” so as to arrive at the claimed invention. The motivation would have been to predict the training time of a machine learning model to be trained, which is relevant to configuring the platform that executes the model to be trained because it is relevant to the selection of a suitable platform (see Wang, § VI, paragraph 1: “Our experimental results suggest that the proposed platform-aware performance model delivers practical estimates which are useful to application developers and system designers in choosing suitable neural networks and/or proper platforms.”) and to do so by taking into account features that indicate execution performance of a neural network, as suggested by Wang (§ IV: “Section IV-A describes the features which may impact the performance in a GPU-based system.”). The Examiner notes that while Wang does not explicitly refer to cloud computing environments, performance prediction is relevant to computing platforms in general, including the one in Faulhaber, which includes accelerators such as GPUs (see Faulhaber, [0029]: “the computing machine on which to train a machine learning model…e.g., a graphical processing unit (GPU) instance type”).
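For illustration only, a minimal sketch of the “coefficient per feature” aspect discussed above; Wang’s PerfNet models are convolutional regression networks, so the ordinary least-squares stand-in and all values below are assumptions used solely to show one learned coefficient per Table II-style feature:

    import numpy as np

    # Hypothetical rows of [batch_size, kernel_size, channel_in, channel_out]
    # (feature names follow Wang's Table II; values are invented), with
    # invented measured layer times in milliseconds as the regression target.
    X = np.array([[32, 3,   3,  64],
                  [32, 5,  64,  64],
                  [64, 3,  64, 128],
                  [64, 5, 128, 128],
                  [16, 3,  32,  64]], dtype=float)
    y = np.array([1.8, 6.5, 9.1, 21.4, 2.9])

    X1 = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)  # one coefficient per feature
    predicted = X1 @ coef                          # predicted per-layer times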

As to claim 2, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, wherein the first machine learning algorithm is a clustering algorithm. [§ I, paragraph 4: “We design a lightweight algorithm based on customized k-nearest neighbor to efficiently recommend Hadoop and container configurations prior to job execution.” Note that “k-nearest neighbor” is a “clustering algorithm” in the manner in which the latter term is used in the specification (see, e.g., paragraph 66 of this application’s specification).]

As to claim 3, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 2, wherein the clustering algorithm is a k-nearest neighbor algorithm. [§ I, paragraph 4: “We design a lightweight algorithm based on customized k-nearest neighbor to efficiently recommend Hadoop and container configurations prior to job execution.” Note that “k-nearest neighbor” is a “clustering algorithm” in the manner in which the latter term is used in the specification (see, e.g., paragraph 66 of this application’s specification).]

As to claim 5, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, wherein the operations further comprise:
receiving, at the application server, training data parameters for the first inference model and wherein the causing the first version of the first inference model to be generated and trained further includes filtering the second set of training data based on the training data parameters. [Faulhaber, [0036]: “For example, the virtual machine instance 122 can identify a type of training data indicated by the training request and select a machine learning model to train (e.g., execute the executable instructions that represent an algorithm that defines the selected machine learning model) that corresponds with the identified type of training data.” Faulhaber, [0039]: “Prior to beginning the training process, in some embodiments, the model training system 120 retrieves training data from the location indicated in the training request. For example, the location indicated in the training request can be a location in the training data store 160.” That is, the training request specifies a training data parameter in the form of a location for the training data to be used, and the training system obtains a subset of the set of training data stored in training data store 160. Selecting this subset constitutes filtering the base set.]

As to claim 7, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, as set forth above.
Wang further teaches “wherein the inputting a set of features is only performed once a threshold amount of historic data of metadata about model generation runs is gathered.” [§ I, paragraph 2, items 1) through 4) teach that the application stage takes place only after the model has been finalized (see FIG. 1: “Finalized Model”), which in turn takes place only after the training stage has been completed. Here, the “application stage” is analogous to the process in which the “inputting” is performed. Furthermore, the training of the model during the training stage reads on the limitation of gathering a threshold amount of historic data (in the form of training data) about model generation runs. The data gathering process is described in § IV.B: “We develop a tool to obtain the training data set automatically. The tool first generates a set of the microbenchmarks by varying the features randomly and then performs the microbenchmarks on the target platforms in parallel. Finally, the time intervals measured from the Tensorflow Profiler are analyzed to extract the preprocessing time (Tpre), the execution time (Texe), and the post-processing time (Tpost), as mentioned in the previous section.” That is, the “features” and the extracted times correspond to “metadata about model generation runs.” The limitation of “threshold” is met because the training process uses the full training set, which constitutes a threshold amount. See § IV.B, second bullet point: “Thus, we reduce the number of samples to 30,000 to keep the profiling process within the two weeks period for the training time predictor.”]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further combined the teachings of Wang and the other references combined thus far by implementing the inputting step such that “the inputting a set of features is only performed once a threshold amount of historic data of metadata about model generation runs is gathered.” The motivation for doing so is the same as the motivation given for the teachings of Wang in the rejection of the parent independent claim, since the parts of Wang cited above for this dependent claim are part of the techniques of Wang discussed in the rejection of the parent independent claim.

As to claim 9, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, as set forth above.
Wang further teaches “wherein the set of features corresponding to the entity and to the first inference model includes information about a number of unique features in the second set of training data.” [Table II teaches the feature of “Matrix Size,” described as “The dimensions of the input data.” Note that in the instant context, the number of dimensions corresponds to the number of unique features, since each dimension is a unique feature. Thus, the dimensionality of the input layer of the model being evaluated (see § 5.5: “in the case of training a full VGG16 DNN with the SGD optimizer”) is the number of features in the training set used to train that model.]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have further combined the teachings of Wang and the other references combined thus far by implementing the set of features to include “information about a number of unique features in the second set of training data.” The motivation would have been to take into account features that indicate execution performance of a neural network, as suggested by Wang (§ IV: “Section IV-A describes the features which may impact the performance in a GPU-based system.”).

As to claims 10-12 and 14, these claims are directed to a method comprising the same or substantially the same operations as those recited in claims 1-3 and 5. Therefore, the rejections made to claims 1-3 and 5 are applied to claims 10-12 and 14, respectively.

As to claims 15-17 and 19, these claims are directed to a machine-readable medium for performing the same or substantially the same operations as those recited in claims 1-3 and 5. Therefore, the rejections made to claims 1-3 and 5 are applied to claims 15-17 and 19, respectively.
	Furthermore, Faulhaber teaches “a non-transitory machine-readable medium storing instructions which, when executed by one or more processors, cause the one or more processors to perform operations” [[0187]: “Each such computing device typically includes a processor (or multiple processors) that executes program instructions or modules stored in a memory or other non-transitory computer-readable storage medium or device (e.g., solid state storage devices, disk drives, etc.).”].

2.	Claims 4, 13, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Faulhaber in view of Zhang and Wang, and further in view of Baghani et al. (US 11,397,794 B1) (“Baghani”).
As to claim 4, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, but does not explicitly teach the further limitation that “wherein the first entity is a group of users.”
Baghani, which generally pertains to access to cloud-based services (see col. 1, lines 6-20), teaches “wherein the first entity is a group of users” [Col. 13, lines 53-58: “In some embodiments, a multi-user account may be created to for access a computing system. For example, in the service provider network 230 described in connection with FIG. 2, an account may be created for a group of users (e.g., all users within a single company or department). In some embodiments, individual users may also be created and individually authenticated.” That is, the account is for accessing a computer system, analogous to use of the service described in Faulhaber. See col. 14, lines 36-40: “In this example, the account has access to two server instances 342 and 344 hosted via a virtual machine service 340, which may be the virtual machine service 275 of FIG. 2.”]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of the references combined thus far with the teachings of Baghani by implementing the first entity as a group of users. The motivation would have been to provide shared access for multiple users in a single company or department, as suggested by Baghani (col. 13, lines 53-58 “…a group of users…e.g., all users within a single company or department…”).


	As to claims 13 and 18, the further limitations recited in these claims are the same or substantially the same as those of claim 4. Therefore, the rejection made to claim 4 is applied to claims 13 and 18.

3.	Claims 6 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Faulhaber in view of Zhang and Wang, and further in view of Croteau et al. (US 2021/0117217 A1) (“Croteau”).
As to claim 6, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, further comprising repeating the […] and generating for a subsequent version of the first inference model, causing a […] to be used for retraining of the first inference model than was used in a prior training of the first inference model. [Faulhaber, [0087]: “The user device 102 can transmit the modified container image as part of a modification request to modify the machine learning model being trained. In response, the virtual machine instance 122 stops execution of the code stored in the original ML training container formed from the original container image at (4) … The virtual machine instance 122 can then form a modified ML training container from the modified container image”].
The combination of references thus far does not teach the limitation of repeating the “inputting” and the limitation that the modified container has a “different container configuration.” 
Croteau teaches or suggests repeating the “inputting” and the modified container having a “different container configuration.” [Abstract: “The tuning engine accesses the application metrics and a rule that specifies tuning of resource configuration for the container. The rule combines variables in the metrics to determine whether an update should be applied to the container. The tuning engine determines a new resource configuration for the tunable container and updates the configuration state information for the container according to the new configuration.” See also [0033] for similar descriptions. That is, determining a new resource configuration based on application metrics is analogous to the “inputting” step of the instant claim, and the use of a “new resource configuration” corresponds to a “different container configuration.”]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of the references combined thus far with the teachings of Croteau by modifying Faulhaber, as modified thus far, to repeat the “inputting” and such that the modified container has a “different container configuration.” The motivation would have been to update container configurations in response to changes to customer applications, as suggested by Croteau (see [0004]: “changes to customer applications have resulted in discrepancies between versions among users running in different containers in pods. Consequently, a need exists for tuning containers in pods, in a high availability environment that runs two or more pods to implement a service, while the containers are running.”).
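For illustration only, a minimal sketch of the rule-based retuning described in Croteau’s abstract; the rule, metric names, and scaling factor are all hypothetical:

    def tune_container(metrics, config):
        """Combine variables from the application metrics to decide whether
        an update applies (per Croteau's abstract), returning a new resource
        configuration when it does."""
        # Hypothetical rule: sustained high CPU plus a deep queue => scale up.
        if metrics["cpu_util"] > 0.85 and metrics["queue_depth"] > 10:
            return {**config, "cpu_shares": config["cpu_shares"] * 2}
        return config  # no update warranted

    new_config = tune_container({"cpu_util": 0.9, "queue_depth": 25},
                                {"cpu_shares": 512, "memory_mb": 2048})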

As to claim 20, the further limitations recited in this claim are the same or substantially the same as those of claim 6. Therefore, the rejection made to claim 6 is applied to claim 20.

4.	Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Faulhaber in view of Zhang and Wang, and further in view of Wetherbee et al. (US 2022/0284351 A1) (“Wetherbee”).
As to claim 8, the combination of Faulhaber, Zhang, and Wang teaches the system of claim 1, as set forth above.
Wetherbee, which generally relates to the use of “cloud containers” ([0002]) for “machine learning software applications” ([0004]), teaches “wherein the set of features corresponding to the entity and to the first inference model includes information about a volume of the second set of training data.” [[0021]: “Further, the business enterprise's expectation on accuracy of the machine learning prognostics also influences the configuration requirements for the cloud container—the number of training vectors directly impacts memory requirements for the container in addition to adding to compute cost overhead for training of the machine learning model.” See also [0038]: “The number of training vectors may also be referred to as the size of the training data set.”]
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of the references combined thus far with the teachings of Wetherbee by implementing the set of features to include “information about a volume of the second set of training data.” The motivation would have been to take into account features that indicate the memory requirements for a container, as suggested by Wetherbee (see part quoted above).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. The following references depict the state of the art and additional techniques related to the feature of a set of features dependent on model type, as disclosed in applicant’s specification.

Gao et al., US11635988B1 teaches determining optimal computing resources (specifically, the number of threads) based on a type of computing task, which includes a type of model. See col. 6, lines 29-60: “Thus, the first indicator describes a computing task that may be to train a machine learning model type… A machine learning model type may be selected from “K-Clustering”, “Decision Tree”, “Factorization Machine”, “Forest”, “Gradient Boosting”, “Neural Network”, “Support Vector Machine”, etc.”

Kobayashi et al., US20180018587A1 teaches prediction of runtime and performance (see FIG. 1) for a variety of machine learning algorithms based on model type. See [0073]: “A combination of the type of a machine learning algorithm and the values of hyperparameters is sometimes referred to as a configuration.”

Yeung et al., “Towards GPU Utilization Prediction for Cloud Deep Learning,” HotCloud'20: Proceedings of the 12th USENIX Conference on Hot Topics in Cloud Computing (2020) teaches the use of features for resource prediction for machine learning workloads.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YAO DAVID HUANG whose telephone number is (571)270-1764. The examiner can normally be reached Monday - Friday 9:00 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Y.D.H./Examiner, Art Unit 2124

/MIRANDA M HUANG/Supervisory Patent Examiner, Art Unit 2124
