Patent Application 17756957 - FEDERATED MIXTURE MODELS - Rejection
Application Information
- Invention Title: FEDERATED MIXTURE MODELS
- Application Number: 17756957
- Submission Date: 2025-05-13T00:00:00.000Z
- Effective Filing Date: 2022-06-06T00:00:00.000Z
- Filing Date: 2022-06-06T00:00:00.000Z
- National Class: 706
- National Sub-Class: 012000
- Examiner Employee Number: 81690
- Art Unit: 2168
- Tech Center: 2100
Rejection Summary
- 102 Rejections: 1
- 103 Rejections: 3
Cited Patents
The following patents were cited in the rejection:
- McMahan et al., U.S. PGPUB 2019/0340534
Office Action Text
DETAILED ACTION

1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

2. Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.

Information Disclosure Statement

3. The information disclosure statements (IDS) submitted on 06/06/2022, 11/08/2024, and 02/05/2025 have been received, entered into the record, and considered. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Response to Amendment

4. Receipt of Applicant's Preliminary Amendment filed on 06/06/2022 is acknowledged. The preliminary amendment amends the specification.

Claim Objections

5. Independent claim 1 is objected to because of the following informalities: the phrase "receiving, at an processing device" is grammatically incoherent and should be replaced with "receiving, at a processing device". Appropriate correction is required. Dependent claims 2-7 are objected to for incorporating the deficiencies of independent claim 1. Independent claim 9 is objected to because of the following informalities: the phrase "process data stored locally on processing device" is grammatically incoherent and should be replaced with "process data stored locally on a processing device". Appropriate correction is required. Dependent claims 10-15 are objected to for incorporating the deficiencies of independent claim 9.

Claim Rejections - 35 USC § 101

6. 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

7. Claims 1-8, 9-15, 16-23, and 24-30 are rejected under 35 U.S.C.
101 because the claimed invention is directed to non-statutory subject matter, namely a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Under the 2019 PEG, when considering subject matter eligibility under 35 U.S.C. § 101, it must be determined whether the claim is directed to one of the four statutory categories of invention, i.e., process, machine, manufacture, or composition of matter (Step 1). If the claim does fall within one of the statutory categories, it must then be determined whether the claim is directed to a judicial exception (i.e., law of nature, natural phenomenon, or abstract idea) (Step 2A Prong 1), and if so, it must additionally be determined whether the judicial exception is integrated into a practical application (Step 2A Prong 2). If an abstract idea is present in the claim without integration into a practical application, any element or combination of elements in the claim must be sufficient to ensure that the claim amounts to significantly more than the abstract idea itself (Step 2B). In the instant case, claims 1-8, 9-15, 16-23, and 24-30 are directed to a method, a processing device, a method, and a processing device, respectively. Thus, each of the claims falls within one of the four statutory categories. However, the claims also fall within the judicial exception of an abstract idea. Under Step 2A Prong 1, the test is to identify whether the claims are "directed to" a judicial exception. The examiner notes that the claimed invention is directed to an abstract idea in that the instant application is directed to mental processes, specifically updating a model.
The examiner further notes that claims 1-8, 9-15, 16-23, and 24-30 are directed to a method, a processing device, a method, and a processing device, respectively, for updating a model, which falls within grouping "c" of the abstract ideas identified in the 2019 PEG in that the claims recite certain mental processes, namely performing the updating of a model. The limitations, which substantially comprise the body of the claim, recite a process of updating a model. Because those limitations closely follow the steps of updating a model, and the steps of the claims involve mental processes, the claim recites an abstract idea consistent with the "mental processes" grouping set forth in the 2019 PEG.

Claim 1: A method of processing data, comprising: receiving, at an processing device, a set of global parameters for each machine learning model of a plurality of machine learning models; for each respective machine learning model of the plurality of machine learning models: processing, at the processing device, data stored locally on the processing device with respective machine learning model according to the set of global parameters to generate a machine learning model output; receiving, at the processing device, user feedback regarding the machine learning model output; performing, at the processing device, an optimization of the respective machine learning model based on the machine learning model output and the user feedback associated with machine learning model output to generate locally updated machine learning model parameters; and sending the locally updated machine learning model parameters to a remote processing device; and receiving, from the remote processing device, a set of globally updated machine learning model parameters for each machine learning model of the plurality of machine learning models; wherein the set of globally updated machine learning model parameters for each respective machine learning model are based at least in part on the locally updated machine learning model parameters.

These limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers the performance of mental processes, specifically updating a model. Updating a model was performed long before the modern computer was invented, and continues to be predominantly a product of human endeavor. The instant application is directed to updating a model. Additionally, the claimed processing of data via a model can be performed by a human via their mind and/or pen & paper. Furthermore, the claimed optimizing of a model can be performed by a human via their mind and/or pen & paper. Additionally, the claimed basing of global parameters on locally updated parameters can be performed by a human via their mind and/or pen & paper. Because the limitations above closely follow the steps of updating a model, and the steps involve human judgments, observations, and evaluations that can be practically or reasonably performed in the human mind and/or with pen & paper, the claim recites an abstract idea consistent with the "mental processes" grouping set forth in the 2019 PEG. The mere nominal recitation of generic computing components such as a processing device and a remote processing device does not take the claim out of the mental processes grouping. Therefore, the limitation is directed to an abstract idea. If the claims are directed to the judicial exception of an abstract idea, it must then be determined under Step 2A Prong 2 whether the judicial exception is integrated into a practical application. The Examiner notes that the considerations under Step 2A Prong 2 comprise most of the considerations previously evaluated in the context of Step 2B.
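The client-side procedure recited in claim 1 (receive global parameters, run each model on locally stored data, collect user feedback, optimize locally, and return updated parameters) can be sketched as follows. This is purely an illustrative reading of the claim language, not the application's actual implementation; all names, the linear-model form, and the least-squares update rule are assumptions.

```python
import numpy as np

def client_update(global_params, local_data, feedback_fn, lr=0.01):
    """One client-side round, per the claim-1 loop (illustrative only):
    for each model, generate an output on local data, obtain user feedback
    as a target signal, take one gradient step, and return the locally
    updated parameters for transmission to the remote processing device."""
    updated = {}
    for model_name, params in global_params.items():
        w, b = params["w"], params["b"]
        x = local_data[model_name]                # data stored locally on the device
        output = x @ w + b                        # machine learning model output
        target = feedback_fn(model_name, output)  # user feedback on the output
        # "Optimization" here: one least-squares gradient-descent step
        err = output - target
        grad_w = x.T @ err / len(x)
        grad_b = err.mean(axis=0)
        updated[model_name] = {"w": w - lr * grad_w, "b": b - lr * grad_b}
    return updated  # locally updated parameters, sent off-device for aggregation
```

Under this reading, the remote processing device would aggregate such per-model updates from many clients into the globally updated parameter sets that the claim then receives back.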
The Examiner submits that the considerations previously evaluated in the context of Step 2B, which determined that the claim does not recite "significantly more", would be evaluated the same under Step 2A Prong 2 and result in the determination that the claim does not integrate the abstract idea into a practical application. The instant application fails to integrate the judicial exception into a practical application because it merely recites the words "apply it" (or an equivalent) with the judicial exception or merely includes instructions to implement an abstract idea. The instant application instructs the reader to implement the identified mental process of updating a model. The elements of the claim do not themselves amount to an improvement to the computer, to a technology, or to another technical field. Moreover, the receiving of a set of global parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Furthermore, the receiving of user feedback is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Additionally, the claimed sending of local parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Moreover, the receiving of a set of updated global parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Here, the claim elements entirely comprise the abstract idea, leaving little if any aspects of the claim for further consideration under Step 2A Prong 2. In short, the claims fail to integrate the abstract idea into a practical application (see at least 84 Fed. Reg. (4) at 55).
Under the 2019 PEG, this supports the conclusion that the claim is directed to an abstract idea, and the analysis proceeds to Step 2B. Many considerations in Step 2A need not be reevaluated in Step 2B because the outcome will be the same. Here, on the basis of the additional elements other than the abstract idea, considered individually and in combination as discussed above, the Examiner respectfully submits that claim 1 does not contain any additional elements that individually or as an ordered combination amount to an inventive concept, and the claims are ineligible. The dependent claims do not recite anything that is found to transform the abstract idea into a patent-eligible invention; they merely recite further embellishments of the abstract idea and do not claim anything that amounts to significantly more than the abstract idea itself. Claims 2-8 are directed to further embellishments of the central theme of the abstract idea in that they further embellish the model-updating steps of claim 1 and do not amount to significantly more. Specifically, claim 2 is directed towards the performing of a number of optimizations, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Furthermore, claim 3 is directed towards the updated parameters being based on another local parameter set, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Additionally, claim 4 is directed towards the defining of the user feedback, which can be performed by the human mind and/or pen & paper and does not amount to significantly more.
Moreover, claim 5 is directed towards the storing of defined data, which is an insignificant extra-solution activity that does not meaningfully limit the claim. Furthermore, claim 6 is directed towards defining the processing device to be one of a smartphone or an IoT device, which is a generic computing component that does not take the claim out of the mental processes grouping. Additionally, claim 7 is directed towards the use of a neural processing unit, which is a generic computing component that does not take the claim out of the mental processes grouping. Moreover, claim 8 is directed towards the use of a neural processing unit, which is a generic computing component that does not take the claim out of the mental processes grouping.

Claim 9: A processing device, comprising: a memory comprising computer-executable instructions; one or more processors configured to execute the computer-executable instructions and cause the processing device to: receive a set of global parameters for each machine learning model of a plurality of machine learning models; for each respective machine learning model of the plurality of machine learning models: process data stored locally on processing device with respective machine learning model according to the set of global parameters to generate a machine learning model output; receive user feedback regarding machine learning model output; perform an optimization of the respective machine learning model based on the machine learning model output and the user feedback associated with machine learning model output to generate locally updated machine learning model parameters; and send the locally updated machine learning model parameters to a remote processing device; and receive, from the remote processing device, a set of globally updated machine learning model parameters for each machine learning model of the plurality of machine learning models; wherein the set of globally updated machine learning model parameters for each respective machine learning model are based at least in part on the locally updated machine learning model parameters.

These limitations, as drafted, recite an apparatus that, under its broadest reasonable interpretation, covers the performance of mental processes, specifically updating a model. Updating a model was performed long before the modern computer was invented, and continues to be predominantly a product of human endeavor. The instant application is directed to updating a model. Additionally, the claimed processing of data via a model can be performed by a human via their mind and/or pen & paper. Furthermore, the claimed optimizing of a model can be performed by a human via their mind and/or pen & paper. Additionally, the claimed basing of global parameters on locally updated parameters can be performed by a human via their mind and/or pen & paper. Because the limitations above closely follow the steps of updating a model, and the steps involve human judgments, observations, and evaluations that can be practically or reasonably performed in the human mind and/or with pen & paper, the claim recites an abstract idea consistent with the "mental processes" grouping set forth in the 2019 PEG. The mere nominal recitation of generic computing components such as a memory, one or more processors, a processing device, and a remote processing device does not take the claim out of the mental processes grouping. Therefore, the limitation is directed to an abstract idea. If the claims are directed to the judicial exception of an abstract idea, it must then be determined under Step 2A Prong 2 whether the judicial exception is integrated into a practical application. The Examiner notes that the considerations under Step 2A Prong 2 comprise most of the considerations previously evaluated in the context of Step 2B.
The Examiner submits that the considerations previously evaluated in the context of Step 2B, which determined that the claim does not recite "significantly more", would be evaluated the same under Step 2A Prong 2 and result in the determination that the claim does not integrate the abstract idea into a practical application. The instant application fails to integrate the judicial exception into a practical application because it merely recites the words "apply it" (or an equivalent) with the judicial exception or merely includes instructions to implement an abstract idea. The instant application instructs the reader to implement the identified mental process of updating a model. The elements of the claim do not themselves amount to an improvement to the computer, to a technology, or to another technical field. Moreover, the receiving of a set of global parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Furthermore, the receiving of user feedback is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Additionally, the claimed sending of local parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Moreover, the receiving of a set of updated global parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Here, the claim elements entirely comprise the abstract idea, leaving little if any aspects of the claim for further consideration under Step 2A Prong 2. In short, the claims fail to integrate the abstract idea into a practical application (see at least 84 Fed. Reg. (4) at 55).
Under the 2019 PEG, this supports the conclusion that the claim is directed to an abstract idea, and the analysis proceeds to Step 2B. Many considerations in Step 2A need not be reevaluated in Step 2B because the outcome will be the same. Here, on the basis of the additional elements other than the abstract idea, considered individually and in combination as discussed above, the Examiner respectfully submits that claim 9 does not contain any additional elements that individually or as an ordered combination amount to an inventive concept, and the claims are ineligible. The dependent claims do not recite anything that is found to transform the abstract idea into a patent-eligible invention; they merely recite further embellishments of the abstract idea and do not claim anything that amounts to significantly more than the abstract idea itself. Claims 10-15 are directed to further embellishments of the central theme of the abstract idea in that they further embellish the model-updating steps of claim 9 and do not amount to significantly more. Specifically, claim 10 is directed towards the performing of a number of optimizations, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Furthermore, claim 11 is directed towards the updated parameters being based on another local parameter set, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Additionally, claim 12 is directed towards the defining of the user feedback, which can be performed by the human mind and/or pen & paper and does not amount to significantly more.
Moreover, claim 13 is directed towards defining the processing device to be one of a smartphone or an IoT device, which is a generic computing component that does not take the claim out of the mental processes grouping. Furthermore, claim 14 is directed towards the use of a neural processing unit, which is a generic computing component that does not take the claim out of the mental processes grouping. Additionally, claim 15 is directed towards the use of a neural processing unit, which is a generic computing component that does not take the claim out of the mental processes grouping.

Claim 16: A method of processing data, comprising: for each respective machine learning model of a plurality of machine learning models: for each respective remote processing device of a plurality of remote processing devices: sending, from a server to the respective remote processing device, an initial set of global model parameters for the respective machine learning model; and receiving, at the server from the respective remote processing device, an updated set of model parameters for the respective machine learning model; and performing, at the server, an optimization of the respective machine learning model based on the updated set of model parameters received from each remote processing device of the plurality of remote processing devices to generate an updated set of global model parameters; and sending, from the server to each remote processing device of the plurality of remote processing devices, the updated set of global model parameters for each machine learning model of the plurality of machine learning models.

These limitations, as drafted, recite a process that, under its broadest reasonable interpretation, covers the performance of mental processes, specifically updating a model. Updating a model was performed long before the modern computer was invented, and continues to be predominantly a product of human endeavor. The instant application is directed to updating a model.
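The server-side procedure recited in claims 16 and 24 (broadcast global parameters per model, collect each device's updated parameters, optimize over the collected updates, and send back globally updated parameters) resembles a federated-averaging round. The sketch below uses a simple elementwise parameter average as the server-side "optimization", which is one common aggregation rule, not necessarily the one claimed; all names are illustrative assumptions.

```python
import numpy as np

def server_round(global_params, devices):
    """One federated round over a plurality of models (illustrative sketch).
    `devices` is a list of callables standing in for remote processing
    devices: each takes (model_name, params) and returns that device's
    locally updated parameter dict for the model."""
    updated_globals = {}
    for model_name, params in global_params.items():
        # Send the initial global parameters to each remote device and
        # receive each device's updated set of model parameters.
        local_updates = [device(model_name, params) for device in devices]
        # Server-side "optimization": elementwise average of the received
        # parameter sets (the federated-averaging aggregation rule).
        updated_globals[model_name] = {
            key: np.mean([u[key] for u in local_updates], axis=0)
            for key in params
        }
    # The updated global parameters would then be sent to every device.
    return updated_globals
```

A design note on this reading: averaging weights all devices equally; weighting each update by the device's local data size is the other standard choice in federated averaging.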
Additionally, the claimed processing of data via a model can be performed by a human via their mind and/or pen & paper. Furthermore, the claimed performing of an optimization of a model can be performed by a human via their mind and/or pen & paper. Because the limitations above closely follow the steps of updating a model, and the steps involve human judgments, observations, and evaluations that can be practically or reasonably performed in the human mind and/or with pen & paper, the claim recites an abstract idea consistent with the "mental processes" grouping set forth in the 2019 PEG. The mere nominal recitation of generic computing components such as a plurality of remote processing devices and a server does not take the claim out of the mental processes grouping. Therefore, the limitation is directed to an abstract idea. If the claims are directed to the judicial exception of an abstract idea, it must then be determined under Step 2A Prong 2 whether the judicial exception is integrated into a practical application. The Examiner notes that the considerations under Step 2A Prong 2 comprise most of the considerations previously evaluated in the context of Step 2B. The Examiner submits that the considerations previously evaluated in the context of Step 2B, which determined that the claim does not recite "significantly more", would be evaluated the same under Step 2A Prong 2 and result in the determination that the claim does not integrate the abstract idea into a practical application. The instant application fails to integrate the judicial exception into a practical application because it merely recites the words "apply it" (or an equivalent) with the judicial exception or merely includes instructions to implement an abstract idea. The instant application instructs the reader to implement the identified mental process of updating a model.
The elements of the claim do not themselves amount to an improvement to the computer, to a technology, or to another technical field. Moreover, the claimed sending of global parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Furthermore, the claimed receiving of updated parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Additionally, the claimed sending of updated global parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Here, the claim elements entirely comprise the abstract idea, leaving little if any aspects of the claim for further consideration under Step 2A Prong 2. In short, the claims fail to integrate the abstract idea into a practical application (see at least 84 Fed. Reg. (4) at 55). Under the 2019 PEG, this supports the conclusion that the claim is directed to an abstract idea, and the analysis proceeds to Step 2B. Many considerations in Step 2A need not be reevaluated in Step 2B because the outcome will be the same. Here, on the basis of the additional elements other than the abstract idea, considered individually and in combination as discussed above, the Examiner respectfully submits that claim 16 does not contain any additional elements that individually or as an ordered combination amount to an inventive concept, and the claims are ineligible. The dependent claims do not recite anything that is found to transform the abstract idea into a patent-eligible invention.
The dependent claims merely recite further embellishments of the abstract idea and do not claim anything that amounts to significantly more than the abstract idea itself. Claims 17-23 are directed to further embellishments of the central theme of the abstract idea in that they further embellish the model-updating steps of claim 16 and do not amount to significantly more. Specifically, claim 17 is directed towards the computing of a gradient, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Furthermore, claim 18 is directed towards the computing of a density estimator, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Additionally, claim 19 is directed towards the determining of weights, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Moreover, claim 20 is directed towards defining the remote processing devices to be smartphones, which are generic computing components that do not take the claim out of the mental processes grouping. Furthermore, claim 21 is directed towards defining the remote processing devices to be IoT devices, which are generic computing components that do not take the claim out of the mental processes grouping. Additionally, claim 22 is directed towards the defining of the model to be a neural network model, which can be performed by the human mind and/or pen & paper and does not amount to significantly more. Moreover, claim 23 is directed towards the defining of the models to have the same network structure, which can be performed by the human mind and/or pen & paper and does not amount to significantly more.
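Claims 17-19 and 25-27 reference computing a gradient, computing a density estimator, and determining weights. In a federated mixture-model setting, one plausible reading (purely an illustrative assumption, not the application's disclosed method) is that each component model's density estimate over the local data determines its mixture weight. A minimal sketch under that assumption, using a one-dimensional Gaussian as the density estimator:

```python
import numpy as np

def mixture_weights(local_data, components):
    """Hypothetical illustration of the 'density estimator' and 'weights'
    limitations: score local data under each component's Gaussian density
    estimator, then normalize the average likelihoods into mixture weights.
    `components` is a list of (mean, variance) pairs."""
    avg_likelihoods = []
    for mean, var in components:
        # Per-point Gaussian density under this component (the density estimator)
        dens = np.exp(-0.5 * (local_data - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)
        avg_likelihoods.append(dens.mean())
    w = np.asarray(avg_likelihoods)
    return w / w.sum()  # mixture weights, normalized to sum to one
```

Under this reading, a component whose density estimator better fits the device's local data receives a larger weight in the mixture.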
Claim 24: A processing device, comprising: a memory comprising computer-executable instructions; one or more processors configured to execute the computer-executable instructions and cause the processing device to: for each respective machine learning model of a plurality of machine learning models: for each respective remote processing device of a plurality of remote processing devices: send to the respective remote processing device, an initial set of global model parameters for the respective machine learning model; and receive from the respective remote processing device, an updated set of model parameters for the respective machine learning model; and perform an optimization of the respective machine learning model based on the updated set of model parameters received from each remote processing device of the plurality of remote processing devices to generate an updated set of global model parameters; and send to each remote processing device of the plurality of remote processing devices the updated set of global model parameters for each machine learning model of the plurality of machine learning models.

These limitations, as drafted, recite an apparatus that, under its broadest reasonable interpretation, covers the performance of mental processes, specifically updating a model. Updating a model was performed long before the modern computer was invented, and continues to be predominantly a product of human endeavor. The instant application is directed to updating a model. Additionally, the claimed processing of data via a model can be performed by a human via their mind and/or pen & paper. Furthermore, the claimed performing of an optimization of a model can be performed by a human via their mind and/or pen & paper.
Because the limitations above closely follow the steps of updating a model, and the steps involve human judgments, observations, and evaluations that can be practically or reasonably performed in the human mind and/or with pen & paper, the claim recites an abstract idea consistent with the "mental processes" grouping set forth in the 2019 PEG. The mere nominal recitation of generic computing components such as a memory, one or more processors, a processing device, and a plurality of remote processing devices does not take the claim out of the mental processes grouping. Therefore, the limitation is directed to an abstract idea. If the claims are directed to the judicial exception of an abstract idea, it must then be determined under Step 2A Prong 2 whether the judicial exception is integrated into a practical application. The Examiner notes that the considerations under Step 2A Prong 2 comprise most of the considerations previously evaluated in the context of Step 2B. The Examiner submits that the considerations previously evaluated in the context of Step 2B, which determined that the claim does not recite "significantly more", would be evaluated the same under Step 2A Prong 2 and result in the determination that the claim does not integrate the abstract idea into a practical application. The instant application fails to integrate the judicial exception into a practical application because it merely recites the words "apply it" (or an equivalent) with the judicial exception or merely includes instructions to implement an abstract idea. The instant application instructs the reader to implement the identified mental process of updating a model. The elements of the claim do not themselves amount to an improvement to the computer, to a technology, or to another technical field.
Moreover, the claimed sending of global parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Furthermore, the claimed receiving of updated parameters is an insignificant extra-solution data gathering operation that does not integrate the abstract idea into a practical application. Additionally, the claimed sending of updated global parameters is an insignificant extra-solution data transmission operation that does not integrate the abstract idea into a practical application (see also Section 2106.05(d)(II) of the MPEP). Here, the claim elements entirely comprise the abstract idea, leaving little if any aspects of the claim for further consideration under Step 2A Prong 2. In short, the claims fail to integrate the abstract idea into a practical application (see at least 84 Fed. Reg. (4) at 55). Under the 2019 PEG, this supports the conclusion that the claim is directed to an abstract idea, and the analysis proceeds to Step 2B. Many considerations in Step 2A need not be reevaluated in Step 2B because the outcome will be the same. Here, on the basis of the additional elements other than the abstract idea, considered individually and in combination as discussed above, the Examiner respectfully submits that claim 24 does not contain any additional elements that individually or as an ordered combination amount to an inventive concept, and the claims are ineligible. The dependent claims do not recite anything that is found to transform the abstract idea into a patent-eligible invention; they merely recite further embellishments of the abstract idea and do not claim anything that amounts to significantly more than the abstract idea itself.
With respect to the dependent claims, they have been considered and are not found to recite anything that amounts to significantly more than the abstract idea. Claims 25-30 are directed to further embellishments of the central theme of the abstract idea in that they further embellish the model-updating steps of claim 24 and do not amount to significantly more. Specifically, claim 25 is directed toward the computing of a gradient, which can be performed by the human mind and/or with pen and paper and does not amount to significantly more. Furthermore, claim 26 is directed toward the computing of a density estimator, which can be performed by the human mind and/or with pen and paper and does not amount to significantly more. Additionally, claim 27 is directed toward the determining of weights, which can be performed by the human mind and/or with pen and paper and does not amount to significantly more. Moreover, claim 28 is directed toward defining the remote processing devices to comprise a smartphone, which is a generic computing device and does not take the claim out of the mental processes grouping. Additionally, claim 29 is directed toward defining the model to be a neural network model, which can be performed by the human mind and/or with pen and paper and does not amount to significantly more. Moreover, claim 30 is directed toward defining the models to have the same network structure, which can be performed by the human mind and/or with pen and paper and does not amount to significantly more. Claim Rejections - 35 USC § 102 8. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 
102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 9. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. 10. Claims 16, 18-24, and 26-30 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by McMahan et al. (U.S. PGPUB 2019/0340534). 11. Regarding claims 16 and 24, McMahan teaches a method and processing device comprising: A) for each respective machine learning model of a plurality of machine learning models: for each respective remote processing device of a plurality of remote processing devices: sending, from a server to the respective remote processing device, an initial set of global model parameters for the respective machine learning model (Paragraphs 28 and 83); B) receiving, at the server from the respective remote processing device, an updated set of model parameters for the respective machine learning model (Paragraphs 28-29 and 88); C) performing, at the server, an optimization of the respective machine learning model based on the updated set of model parameters received from each remote processing device of the plurality of remote processing devices to generate an updated set of global model parameters (Paragraphs 29 and 89); and D) sending, from the server to each remote processing device of the plurality of remote processing devices, the updated set of global model parameters for each machine learning model of the plurality of machine learning models 
(Paragraphs 28, 83, and 90). The examiner notes that McMahan teaches "for each respective machine learning model of a plurality of machine learning models: for each respective remote processing device of a plurality of remote processing devices: sending, from a server to the respective remote processing device, an initial set of global model parameters for the respective machine learning model" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28) and "At (310), method (300) can include providing the global model to each client device, and at (312), method (300) can include receiving the global model" (Paragraph 83). The examiner further notes that the initial distribution of an initial global model (which entails an initial set of parameters) from a server to multiple clients (i.e., remote processing devices) teaches the claimed sending. The examiner further notes that McMahan teaches "receiving, at the server from the respective remote processing device, an updated set of model parameters for the respective machine learning model" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data. The updated local models are W.sub.t.sup.1, W.sub.t.sup.2, . . . , W.sub.t.sup.n" (Paragraph 28), "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29), and "At (318), method (300) can include receiving, by the server, the local update. 
In particular, the server can receive a plurality of local updates from a plurality of client devices" (Paragraph 88). The examiner further notes that the server obtaining subsequent transmitted client updates (i.e., an updated set of model parameters) of its local model from each client teaches the claimed receiving. The examiner further notes that McMahan teaches "performing, at the server, an optimization of the respective machine learning model based on the updated set of model parameters received from each remote processing device of the plurality of remote processing devices to generate an updated set of global model parameters" as "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29) and "At (320), method (300) can include again determining the global model. In particular, the global model can be determined based at least in part on the received local update(s). For instance, the received local updates can be aggregated to determine the global model. The aggregation can be an additive aggregation and/or an averaging aggregation. In particular implementations, the aggregation of the local updates can be proportional to the partition sizes of the data examples on the client devices" (Paragraph 89). The examiner further notes that the aggregation of the received client updates results in the generation of an updated global model. The examiner further notes that McMahan teaches "sending, from the server to each remote processing device of the plurality of remote processing devices, the updated set of global model parameters for each machine learning model of the plurality of machine learning models" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). 
Some or all of these clients independently update the model based on their local data" (Paragraph 28), "At (310), method (300) can include providing the global model to each client device, and at (312), method (300) can include receiving the global model" (Paragraph 83), and "Any number of iterations of local and global updates can be performed. That is, method (300) can be performed iteratively to update the global model based on locally stored training data over time" (Paragraph 90). The examiner further notes that the iterative distribution of a global model (which entails a set of updated parameters) from a server to multiple clients (i.e., remote processing devices) teaches the claimed sending. Regarding claims 18 and 26, McMahan further teaches a method and processing device comprising: A) for each respective machine learning model of the plurality of machine learning models, determining a corresponding density estimator parameterized by weighting parameters for the respective machine learning model (Paragraph 46). The examiner notes that McMahan teaches "for each respective machine learning model of the plurality of machine learning models, determining a corresponding density estimator parameterized by weighting parameters for the respective machine learning model" as "Another way of encoding the updates is by quantizing the weights. For example, the weights can be probabilistically quantized" (Paragraph 46). The examiner further notes that the specification merely mentions the claimed density estimator without defining what it constitutes (see Paragraph 78). Thus, the claimed calculated probabilistic updates for the models (which include weights) teach the claimed density estimator under the broadest reasonable interpretation. Regarding claims 19 and 27, McMahan further teaches a method and processing device comprising: A) determining prior mixture weights for the respective machine learning model (Paragraphs 29-30). 
The examiner notes that McMahan teaches "determining prior mixture weights for the respective machine learning model" as "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates… in some implementations, a weighted sum might be used to replace the average based on desired performance" (Paragraphs 29-30). The examiner further notes that calculating a weighted sum entails determining weights for the models. Regarding claims 20 and 28, McMahan further teaches a method and processing device comprising: A) wherein the plurality of remote processing devices comprises a smartphone (Paragraph 69). The examiner notes that McMahan teaches "wherein the plurality of remote processing devices comprises a smartphone" as "The server 210 can exchange data with one or more client devices 230 over the network 242. Any number of client devices 230 can be connected to the server 210 over the network 242. Each of the client devices 230 can be any suitable type of computing device, such as a general purpose computer, special purpose computer, laptop, desktop, mobile device, navigation system, smartphone, tablet, wearable computing device, gaming console, a display with one or more processors, or other suitable computing device" (Paragraph 69). The examiner further notes that the client devices of McMahan can include smartphones. Regarding claim 21, McMahan further teaches a method comprising: A) wherein the plurality of remote processing devices comprise an internet of things device (Paragraph 69). The examiner notes that McMahan teaches "wherein the plurality of remote processing devices comprise an internet of things device" as "The server 210 can exchange data with one or more client devices 230 over the network 242. Any number of client devices 230 can be connected to the server 210 over the network 242. 
Each of the client devices 230 can be any suitable type of computing device, such as a general purpose computer, special purpose computer, laptop, desktop, mobile device, navigation system, smartphone, tablet, wearable computing device, gaming console, a display with one or more processors, or other suitable computing device" (Paragraph 69). The examiner further notes that the client devices of McMahan can include wearable computing devices (i.e., IoT devices under the broadest reasonable interpretation). Regarding claims 22 and 29, McMahan further teaches a method and processing device comprising: A) wherein each respective machine learning model of the plurality of machine learning models is a neural network model (Paragraph 55). The examiner notes that McMahan teaches "wherein each respective machine learning model of the plurality of machine learning models is a neural network model" as "FIG. 1 depicts an example system 100 for training one or more global machine learning models 106 using respective training data 108 stored locally on a plurality of client devices 102. System 100 can include a server device 104. Server 104 can be configured to access machine learning model 106, and to provide model 106 to a plurality of client devices 102. Model 106 can be, for instance, a linear regression model, logistic regression model, a support vector machine model, a neural network (e.g. convolutional neural network, recurrent neural network, etc.), or other suitable model" (Paragraph 55). The examiner further notes that the models distributed to the client devices of McMahan can be neural network models. Regarding claims 23 and 30, McMahan further teaches a method and processing device comprising: A) wherein each respective machine learning model of the plurality of machine learning models comprises a same network structure (Paragraphs 28, 55, and 83). 
The examiner notes that McMahan teaches "wherein each respective machine learning model of the plurality of machine learning models comprises a same network structure" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28), "FIG. 1 depicts an example system 100 for training one or more global machine learning models 106 using respective training data 108 stored locally on a plurality of client devices 102. System 100 can include a server device 104. Server 104 can be configured to access machine learning model 106, and to provide model 106 to a plurality of client devices 102. Model 106 can be, for instance, a linear regression model, logistic regression model, a support vector machine model, a neural network (e.g. convolutional neural network, recurrent neural network, etc.), or other suitable model" (Paragraph 55), and "At (310), method (300) can include providing the global model to each client device, and at (312), method (300) can include receiving the global model" (Paragraph 83). The examiner further notes that the initial distribution of an initial global model (which entails an initial set of parameters) from a server to multiple clients (i.e., remote processing devices) entails that each local model has the same network structure. Claim Rejections - 35 USC § 103 12. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. 
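For context, the federated averaging round that the anticipation rejection above maps onto claims 16 and 24 (the server distributes the current model W.sub.t, clients compute local updates W.sub.t.sup.i, and the server aggregates them; McMahan Paragraphs 28-29, 83, and 88-90) can be sketched roughly as follows. All function and variable names, the toy one-parameter model, and the stand-in local training step are illustrative assumptions, not code from McMahan or the instant application.

```python
# Illustrative sketch only: toy federated averaging round.
# The "model" is a plain list of floats; local_update is a stand-in
# for retraining on locally stored training data.

def local_update(global_params, local_data, lr=0.1):
    """Client side: nudge each parameter toward the local data mean
    (stand-in for independently updating the model on local data)."""
    target = sum(local_data) / len(local_data)
    return [w + lr * (target - w) for w in global_params]

def aggregate(client_params, client_sizes):
    """Server side: averaging aggregation weighted by each client's
    data partition size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

def federated_round(global_params, client_datasets):
    """One round t: distribute W_t, collect local updates, aggregate."""
    updates = [local_update(global_params, d) for d in client_datasets]
    sizes = [len(d) for d in client_datasets]
    return aggregate(updates, sizes)

clients = [[1.0, 1.2], [3.0, 3.4, 3.8], [2.0]]  # local data per client
w = [0.0]                                       # toy one-parameter model W_0
for t in range(3):                              # any number of iterations
    w = federated_round(w, clients)
```

Weighting each client's update by `len(d)` mirrors the quoted disclosure that the aggregation can be proportional to the partition sizes of the data examples on the client devices, and the outer loop mirrors the quoted teaching that any number of iterations of local and global updates can be performed.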
13. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. 14. This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. 15. Claims 1-6 and 9-13 are rejected under 35 U.S.C. 103 as being unpatentable over McMahan et al. (U.S. PGPUB 2019/0340534) as applied to claims 16, 18-24, and 26-30 above, and further in view of Wood et al. (U.S. PGPUB 2018/0285759). 16. 
Regarding claims 1 and 9, McMahan teaches a method and processing device comprising: A) receiving, at an processing device, a set of global parameters for each machine learning model of a plurality of machine learning models (Paragraphs 22, 27, 28, and 83); B) for each respective machine learning model of the plurality of machine learning models: processing, at the processing device, data stored locally on the processing device with respective machine learning model according to the set of global parameters to generate a machine learning model output (Paragraphs 22, 27, 28, and 84); D) performing, at the processing device, an optimization of the respective machine learning model based on the machine learning model output to generate locally updated machine learning model parameters (Paragraphs 28 and 84); and E) sending the locally updated machine learning model parameters to a remote processing device (Paragraphs 28-29 and 88); and F) receiving, from the remote processing device, a set of globally updated machine learning model parameters for each machine learning model of the plurality of machine learning models (Paragraphs 22, 27, 28-29, and 88); G) wherein the set of globally updated machine learning model parameters for each respective machine learning model are based at least in part on the locally updated machine learning model parameters (Paragraphs 22, 27, 28-29, 88, and 90). 
The examiner notes that McMahan teaches "receiving, at an processing device, a set of global parameters for each machine learning model of a plurality of machine learning models" as "systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server" (Paragraph 22), "the model can include one or more neural networks (e.g., deep neural networks, recurrent neural networks, convolutional neural networks, etc.) or other machine-learned models" (Paragraph 27), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28) and "At (310), method (300) can include providing the global model to each client device, and at (312), method (300) can include receiving the global model" (Paragraph 83). The examiner further notes that the initial distribution of an initial global model (which entails an initial set of parameters) from a server to multiple clients (i.e., remote processing devices) teaches the claimed receiving. Moreover, a client can include one or more models (each of which can house one or more models) that the global parameters are applied to from the server. 
The examiner further notes that McMahan teaches "for each respective machine learning model of the plurality of machine learning models: processing, at the processing device, data stored locally on the processing device with respective machine learning model according to the set of global parameters to generate a machine learning model output" as "systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server" (Paragraph 22), "the model can include one or more neural networks (e.g., deep neural networks, recurrent neural networks, convolutional neural networks, etc.) or other machine-learned models" (Paragraph 27), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28), and "method (300) can include determining, by the client device, a local update. In a particular implementation, the local update can be determined by retraining or otherwise updating the global model based on the locally stored training data" (Paragraph 84). The examiner further notes that a client can include one or more models (each of which can house one or more models) that the global parameters are applied to from the server, which are then updated via the use of local client data. 
The examiner further notes that McMahan teaches "performing, at the processing device, an optimization of the respective machine learning model based on the machine learning model output to generate locally updated machine learning model parameters" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28) and "method (300) can include determining, by the client device, a local update. In a particular implementation, the local update can be determined by retraining or otherwise updating the global model based on the locally stored training data" (Paragraph 84). The examiner further notes that the determined local updates are based on retraining (i.e., an example of the undefined claimed optimization under the broadest reasonable interpretation). The examiner further notes that McMahan teaches "sending the locally updated machine learning model parameters to a remote processing device" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data. The updated local models are W.sub.t.sup.1, W.sub.t.sup.2, . . . , W.sub.t.sup.n" (Paragraph 28), "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29), and "At (318), method (300) can include receiving, by the server, the local update. In particular, the server can receive a plurality of local updates from a plurality of client devices" (Paragraph 88). 
The examiner further notes that the transmitted local updates to the server teach the claimed sending. The examiner further notes that McMahan teaches "receiving, at an processing device, a set of global parameters for each machine learning model of a plurality of machine learning models" as "systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server" (Paragraph 22), "the model can include one or more neural networks (e.g., deep neural networks, recurrent neural networks, convolutional neural networks, etc.) or other machine-learned models" (Paragraph 27), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data. The updated local models are W.sub.t.sup.1, W.sub.t.sup.2, . . . , W.sub.t.sup.n" (Paragraph 28), "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29), and "At (318), method (300) can include receiving, by the server, the local update. In particular, the server can receive a plurality of local updates from a plurality of client devices" (Paragraph 88). The examiner further notes that the server obtaining subsequent transmitted client updates (i.e., an updated set of model parameters) of its local model(s) (each of which can include one or more models) from each client teaches the claimed receiving. 
The examiner further notes that McMahan teaches "wherein the set of globally updated machine learning model parameters for each respective machine learning model are based at least in part on the locally updated machine learning model parameters" as "systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server" (Paragraph 22), "the model can include one or more neural networks (e.g., deep neural networks, recurrent neural networks, convolutional neural networks, etc.) or other machine-learned models" (Paragraph 27), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data. The updated local models are W.sub.t.sup.1, W.sub.t.sup.2, . . . , W.sub.t.sup.n" (Paragraph 28), "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29), "At (318), method (300) can include receiving, by the server, the local update. In particular, the server can receive a plurality of local updates from a plurality of client devices" (Paragraph 88), and "Any number of iterations of local and global updates can be performed. That is, method (300) can be performed iteratively to update the global model based on locally stored training data over time" (Paragraph 90). The examiner further notes that the server obtaining subsequent transmitted client updates (i.e., an updated set of model parameters) of its local model(s) results in updating the global model for that iteration. 
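The client-side flow mapped onto claims 1 and 9 above (receive a set of global parameters for each of a plurality of machine learning models, update each model on data stored locally, send the local updates to a remote processing device, and receive the globally updated parameter sets) can likewise be sketched. The per-model dictionary, the one-step local "optimization", and the unweighted server average below are hypothetical simplifications, not code taken from either reference.

```python
# Illustrative sketch only: a client updating a PLURALITY of models on
# local data, with a server averaging the returned updates per model.

def client_round(global_params_per_model, local_data, lr=0.1):
    """Return locally updated parameters for each model (stand-in for
    processing local data and optimizing each model)."""
    target = sum(local_data) / len(local_data)
    return {
        model_id: [w + lr * (target - w) for w in params]
        for model_id, params in global_params_per_model.items()
    }

def server_aggregate(updates_per_client):
    """Average the clients' locally updated parameters, per model, to
    produce the globally updated parameter sets."""
    n = len(updates_per_client)
    return {
        m: [sum(u[m][i] for u in updates_per_client) / n
            for i in range(len(updates_per_client[0][m]))]
        for m in updates_per_client[0]
    }

globals_ = {"model_a": [0.0], "model_b": [1.0]}  # plurality of models
client_data = [[2.0, 4.0], [6.0]]                # locally stored data
updates = [client_round(globals_, d) for d in client_data]
globals_ = server_aggregate(updates)             # globally updated sets
```

The globally updated set for each model depends on every client's locally updated parameters, which is the relationship the examiner reads onto limitation G.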
McMahan does not explicitly teach: C) receiving, at the processing device, user feedback regarding the machine learning model output; D) performing, at the processing device, an optimization of the respective machine learning model based on the machine learning model output and the user feedback associated with machine learning model output. Wood, however, teaches "receiving, at the processing device, user feedback regarding the machine learning model output" as "statistical model 108 may be trained and/or adapted to new data received on the trainers. For example, the trainers may execute on electronic devices (e.g., personal computers, laptop computers, mobile phones, tablet computers, portable media players, digital cameras, etc.) that produce updates 114-116 to statistical model 108 based on user feedback from users of the electronic devices" (Paragraph 18), "the statistical model may have multiple local versions 202 and one or more global versions 204. Individual local versions 202 may be personalized to specific users, recommendations, job listings, advertisements, content items, and/or other types of entities 218. Output 212 from each local version may be displayed and/or otherwise presented to one or more users, and user feedback 206 and/or other input data related to output 212 may be collected and/or tracked" (Paragraph 37), and "User feedback 206 related to output 212 may additionally be collected during the user session as clicks, views, searches, likes, dislikes, comments, shares, applications to job listings, and/or other interaction with the online professional network. Each piece of user feedback 206 may be included in training data that is applied to parameters 224 of the local version to generate an update (e.g., updates 222) to the local version. 
Consequently, the output of the local version may be adapted to the user's real-time behavior or preferences during the user session" (Paragraph 39) and "performing, at the processing device, an optimization of the respective machine learning model based on the machine learning model output and the user feedback associated with machine learning model output" as "statistical model 108 may be trained and/or adapted to new data received on the trainers. For example, the trainers may execute on electronic devices (e.g., personal computers, laptop computers, mobile phones, tablet computers, portable media players, digital cameras, etc.) that produce updates 114-116 to statistical model 108 based on user feedback from users of the electronic devices" (Paragraph 18), "the statistical model may have multiple local versions 202 and one or more global versions 204. Individual local versions 202 may be personalized to specific users, recommendations, job listings, advertisements, content items, and/or other types of entities 218. Output 212 from each local version may be displayed and/or otherwise presented to one or more users, and user feedback 206 and/or other input data related to output 212 may be collected and/or tracked" (Paragraph 37), and "User feedback 206 related to output 212 may additionally be collected during the user session as clicks, views, searches, likes, dislikes, comments, shares, applications to job listings, and/or other interaction with the online professional network. Each piece of user feedback 206 may be included in training data that is applied to parameters 224 of the local version to generate an update (e.g., updates 222) to the local version. Consequently, the output of the local version may be adapted to the user's real-time behavior or preferences during the user session" (Paragraph 39). 
The examiner further notes that the secondary reference of Wood teaches the concept of using user feedback (which is based on the output of a local model) as a basis for training (i.e., an example of the claimed undefined optimizing under the broadest reasonable interpretation) the local model. The combination would result in using user feedback as a basis to generate the local updates to the local models of McMahan. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Wood's teachings would have allowed McMahan's system to provide a method for tuning based on a pre-specified amount of user feedback, as noted by Wood (Paragraph 32). Regarding claims 2 and 10, McMahan further teaches a method and processing device comprising: A) performing, at the processing device, a number of optimizations before sending the locally updated machine learning model parameters to the remote processing device (Paragraphs 41, 42, and 49). The examiner notes that McMahan teaches "performing at the processing device, a number of optimizations before sending the locally updated machine learning model parameters to the remote processing device" as "A second type of communication efficient update provided by the present disclosure is a sketched update in which the client encodes the update H.sub.t.sup.i in a compressed form prior to sending to the server. The client device can compute the full update H.sub.t.sup.i and then encode the update or can compute the update H.sub.t.sup.i according to a structured technique and then encode such structured update" (Paragraph 41), "Many different types of encoding or compression are envisioned by the present disclosure. For example, the compression can be lossless compression or lossy compression. 
Two example encoding techniques are described in further detail below: a subsampling technique and a quantization technique" (Paragraph 42), and "the above can be generalized to more than 1 bit for each scalar. For example, for b-bit quantization, [h.sub.min, h.sub.max] can be equally divided into 2.sup.b intervals. Suppose h.sub.i falls in the interval bounded by h′ and h″. The quantization can operate by replacing h.sub.min and h.sub.max in the above equation with h′ and h″, respectively" (Paragraph 49). The examiner further notes that encoding the local updates before sending such updates back to the server entails performing a number of "optimizations" (which are not defined in the claims). Regarding claims 3 and 11, McMahan further teaches a method and processing device comprising: A) wherein the set of globally updated machine learning model parameters for each respective machine learning model of the plurality of machine learning models are based at least in part on locally updated machine learning model parameters of a second processing device (Paragraphs 29 and 88). The examiner notes that McMahan teaches "wherein the set of globally updated machine learning model parameters for each respective machine learning model of the plurality of machine learning models are based at least in part on locally updated machine learning model parameters of a second processing device" as "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data. The updated local models are W.sub.t.sup.1, W.sub.t.sup.2, . . . , W.sub.t.sup.n" (Paragraph 28), "Each client then sends the update back to the server, where the global update is computed by aggregating all the client-side updates" (Paragraph 29), and "At (318), method (300) can include receiving, by the server, the local update. 
In particular, the server can receive a plurality of local updates from a plurality of client devices" (Paragraph 88). The examiner further notes that the server obtaining subsequently transmitted client updates (i.e., locally updated parameters) of its local model from each client entails a second processing device sending its local updates (i.e., multiple clients include at least a first processing device and a second processing device). Regarding claims 4 and 12, McMahan does not explicitly teach a method and processing device comprising: A) wherein the user feedback comprises an indication of a correctness of the machine learning model output. Wood, however, teaches "wherein the user feedback comprises an indication of a correctness of the machine learning model output" as "statistical model 108 may be trained and/or adapted to new data received on the trainers. For example, the trainers may execute on electronic devices (e.g., personal computers, laptop computers, mobile phones, tablet computers, portable media players, digital cameras, etc.) that produce updates 114-116 to statistical model 108 based on user feedback from users of the electronic devices" (Paragraph 18), "the statistical model may have multiple local versions 202 and one or more global versions 204. Individual local versions 202 may be personalized to specific users, recommendations, job listings, advertisements, content items, and/or other types of entities 218. Output 212 from each local version may be displayed and/or otherwise presented to one or more users, and user feedback 206 and/or other input data related to output 212 may be collected and/or tracked" (Paragraph 37), and "User feedback 206 related to output 212 may additionally be collected during the user session as clicks, views, searches, likes, dislikes, comments, shares, applications to job listings, and/or other interaction with the online professional network. 
Each piece of user feedback 206 may be included in training data that is applied to parameters 224 of the local version to generate an update (e.g., updates 222) to the local version. Consequently, the output of the local version may be adapted to the user's real-time behavior or preferences during the user session" (Paragraph 39). The examiner further notes that the secondary reference of Wood teaches the concept of the use of user feedback (which is based on the output of a local model). Such user feedback data includes "likes" (i.e., the claimed indication of correctness under the broadest reasonable interpretation). The combination would result in using user feedback as a basis to generate the local updates to the local models of McMahan. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Wood's teachings would have allowed McMahan's method to provide tuning based on a pre-specified amount of user feedback, as noted by Wood (Paragraph 32). Regarding claim 5, McMahan further teaches a method comprising: A) wherein the data stored locally on the processing device is one of: image data, audio data, or video data (Paragraph 56). The examiner notes that McMahan teaches "wherein the data stored locally on the processing device is one of: image data, audio data, or video data" as "Client devices 102 can each be configured to determine one or more local updates associated with model 106 based at least in part on training data 108. For instance, training data 108 can be data that is respectively stored locally on the client devices 106. The training data 108 can include audio files, image files, video files, a typing history, location history, and/or various other suitable data" (Paragraph 56). The examiner further notes that the client local data includes audio, image, and video data. 
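The b-bit quantization passage quoted earlier from McMahan (Paragraph 49) divides [h.sub.min, h.sub.max] into 2.sup.b equal intervals and replaces each scalar with an endpoint (h′ or h″) of its containing interval before the update is sent to the server. The following minimal sketch illustrates that idea; the probabilistic endpoint choice (which keeps the quantized update unbiased in expectation) and the function name `quantize` are assumptions for illustration, not the reference's verbatim scheme.

```python
import random

# Sketch of b-bit quantization of a local update vector (cf. McMahan,
# Para. 49): split [h_min, h_max] into 2**b equal intervals and replace
# each scalar by an endpoint (h' or h'') of the interval containing it.
# The probabilistic endpoint choice below makes the quantized value equal
# to the original in expectation; this is an assumed detail.

def quantize(update, b, rng=random):
    h_min, h_max = min(update), max(update)
    if h_max == h_min:
        return list(update)          # constant vector: nothing to quantize
    n_intervals = 2 ** b
    width = (h_max - h_min) / n_intervals
    out = []
    for h in update:
        # Index of the interval containing h (clamped for h == h_max).
        k = min(int((h - h_min) / width), n_intervals - 1)
        lo = h_min + k * width       # h'  in McMahan's notation
        hi = lo + width              # h''
        p_hi = (h - lo) / width      # P(pick h'') so that E[q] == h
        out.append(hi if rng.random() < p_hi else lo)
    return out

rng = random.Random(0)
q = quantize([0.1, -0.4, 0.9, 0.25], b=2, rng=rng)
```

Each quantized scalar then needs only b bits (as an interval-endpoint index) plus the shared h_min and h_max, which is the communication saving the quoted passage is directed to.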
Regarding claims 6 and 13, McMahan further teaches a method and processing device comprising: A) wherein the processing device is one of a smartphone or an internet of things device (Paragraph 69). The examiner notes that McMahan teaches "wherein the processing device is one of a smartphone or an internet of things device" as "The server 210 can exchange data with one or more client devices 230 over the network 242. Any number of client devices 230 can be connected to the server 210 over the network 242. Each of the client devices 230 can be any suitable type of computing device, such as a general purpose computer, special purpose computer, laptop, desktop, mobile device, navigation system, smartphone, tablet, wearable computing device, gaming console, a display with one or more processors, or other suitable computing device" (Paragraph 69). The examiner further notes that the client devices of McMahan can include smartphones. 17. Claims 7-8 and 14-15 are rejected under 35 U.S.C. 103 as being unpatentable over McMahan et al. (U.S. PGPUB 2019/0340534) as applied to claims 16, 18-24, and 26-30 above, and further in view of Wood et al. (U.S. PGPUB 2018/0285759) as applied to claims 1-6 and 9-13 above, and further in view of Feng et al. (Article entitled "Joint Service Pricing and Cooperative Relay Communication for Federated Learning", dated 29 November 2018). 18. Regarding claims 7 and 14, McMahan further teaches a method and processing device comprising: A) wherein processing, at the processing device, the data stored locally on the processing device with the machine learning model is performed at least in part by one or more processing units (Paragraphs 5, 22, 27, 28, and 84). 
The examiner notes that McMahan teaches "wherein processing, at the processing device, the data stored locally on the processing device with the machine learning model is performed at least in part by one or more processing units" as "[t]he client device includes at least one processor and at least one non-transitory computer-readable medium that stores instructions that, when executed by the at least one processor, cause the client computing device to perform operations" (Paragraph 5), "systems implementing federated learning can perform the following actions in each of a plurality of rounds of model optimization: a subset of clients are selected; each client in the subset updates the model based on their local data; the updated models or model updates are sent by each client to the server" (Paragraph 22), "the model can include one or more neural networks (e.g., deep neural networks, recurrent neural networks, convolutional neural networks, etc.) or other machine-learned models" (Paragraph 27), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28), and "method (300) can include determining, by the client device, a local update. In a particular implementation, the local update can be determined by retraining or otherwise updating the global model based on the locally stored training data" (Paragraph 84). The examiner further notes that a client (which includes one or more processors) can include one or more models (each of which can comprise one or more neural networks) to which the global parameters from the server are applied and which are then updated via the use of local client data. McMahan and Wood do not explicitly teach: A) one or more neural processing units. 
Feng, however, teaches "one or more neural processing units" as "For the sake of protecting data privacy and due to the rapid development of mobile devices, e.g., powerful central processing unit (CPU) and nascent neural processing unit (NPU), collaborative machine learning on mobile devices, e.g., federated learning, has been envisioned as a new AI approach with broad application prospects" (Abstract). The examiner further notes that although McMahan and Wood clearly teach client devices with processors for federated learning, there is no explicit teaching that such clients include a neural processor. Nevertheless, the secondary reference of Feng teaches that clients in a federated learning environment can include neural processors. The combination would result in the clients of McMahan and Wood also including neural processors. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Feng's teachings would have allowed McMahan's and Wood's methods to provide a method for handling massive data in a secure manner, as noted by Feng (Section 1). Regarding claims 8 and 15, McMahan further teaches a method and processing device comprising: A) wherein performing, at the processing device, the optimization of the machine learning model is performed at least in part by one or more processing units (Paragraphs 5, 28, and 84). 
The examiner notes that McMahan teaches "wherein performing, at the processing device, the optimization of the machine learning model is performed at least in part by one or more processing units" as "[t]he client device includes at least one processor and at least one non-transitory computer-readable medium that stores instructions that, when executed by the at least one processor, cause the client computing device to perform operations" (Paragraph 5), "In round t≥0, the server distributes the current model W.sub.t to a subset S.sub.t of n.sub.t clients (for example, to a selected subset of clients whose devices are plugged into power, have access to broadband, and are idle). Some or all of these clients independently update the model based on their local data" (Paragraph 28), and "method (300) can include determining, by the client device, a local update. In a particular implementation, the local update can be determined by retraining or otherwise updating the global model based on the locally stored training data" (Paragraph 84). The examiner further notes that the determined local updates at the clients (which have one or more processors) are based on retraining (i.e., an example of the undefined claimed optimization under the broadest reasonable interpretation). McMahan and Wood do not explicitly teach: A) one or more neural processing units. Feng, however, teaches "one or more neural processing units" as "For the sake of protecting data privacy and due to the rapid development of mobile devices, e.g., powerful central processing unit (CPU) and nascent neural processing unit (NPU), collaborative machine learning on mobile devices, e.g., federated learning, has been envisioned as a new AI approach with broad application prospects" (Abstract). 
The examiner further notes that although McMahan and Wood clearly teach client devices with processors for federated learning, there is no explicit teaching that such clients include a neural processor. Nevertheless, the secondary reference of Feng teaches that clients in a federated learning environment can include neural processors. The combination would result in the clients of McMahan and Wood also including neural processors. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Feng's teachings would have allowed McMahan's and Wood's methods to provide a method for handling massive data in a secure manner, as noted by Feng (Section 1). 19. Claims 17 and 25 are rejected under 35 U.S.C. 103 as being unpatentable over McMahan et al. (U.S. PGPUB 2019/0340534) as applied to claims 16, 18-24, and 26-30 above, and further in view of McMahan et al. (U.S. PGPUB 2017/0109322) (herein referred to as "Konecny"). 20. Regarding claims 17 and 25, McMahan does not explicitly teach a method and processing device comprising: A) wherein performing, at the server, an optimization of the respective machine learning model comprises computing an effective gradient for each model parameter of the initial set of global model parameters for the respective machine learning model. Konecny, however, teaches "wherein performing, at the server, an optimization of the respective machine learning model comprises computing an effective gradient for each model parameter of the initial set of global model parameters for the respective machine learning model" as "stochastic gradient descent techniques can be naively applied to the optimization problem, wherein one or more 'minibatch' gradient calculations (e.g. using one or more randomly selected use devices) are performed per round of communication. 
For instance, the minibatch can include at least a subset of the training data stored locally on the user devices. In such implementations, one or more user devices can be configured to determine the average gradient associated with the local training data respectively stored on the user devices for a current version of a model. The user devices can be configured to provide the determined gradients to the server, as part of the local updates. The server can then aggregate the gradients to determine a global model update" (Paragraph 29). The examiner further notes that the secondary reference of Konecny teaches the concept of computing a gradient of a set of parameters for optimization at a server. The combination would result in performing a gradient computation for the optimization of McMahan. It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to combine the teachings of the cited references because Konecny's teachings would have allowed McMahan's method to provide a method for solving optimization issues in distributed systems, as noted by Konecny (Paragraph 5). Conclusion 21. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. U.S. PGPUB 2020/0285980 issued to Sharad et al. on 10 September 2020. The subject matter disclosed therein is pertinent to that of claims 1-30 (e.g., methods to perform federated training). U.S. PGPUB 2021/0287080 issued to Moloney et al. on 16 September 2021. The subject matter disclosed therein is pertinent to that of claims 1-30 (e.g., methods to perform federated training). Contact Information 22. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Mahesh Dwivedi whose telephone number is (571) 272-2731. The examiner can normally be reached on Monday to Friday, 8:20 am – 4:40 pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Charles Rones, can be reached at (571) 272-4085. The fax number for the organization where this application or proceeding is assigned is (571) 273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see 20. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). Mahesh Dwivedi Primary Examiner Art Unit 2168 May 08, 2025 /MAHESH H DWIVEDI/Primary Examiner, Art Unit 2168