Patent Application 15491950 - DISTRIBUTED DEEP LEARNING USING A DISTRIBUTED DEEP NEURAL NETWORK - Rejection
Title: DISTRIBUTED DEEP LEARNING USING A DISTRIBUTED DEEP NEURAL NETWORK
Application Information
- Invention Title: DISTRIBUTED DEEP LEARNING USING A DISTRIBUTED DEEP NEURAL NETWORK
- Application Number: 15491950
- Submission Date: 2025-05-13T00:00:00.000Z
- Effective Filing Date: 2017-04-19T00:00:00.000Z
- Filing Date: 2017-04-19T00:00:00.000Z
- National Class: 706
- National Sub-Class: 025000
- Examiner Employee Number: 94552
- Art Unit: 2125
- Tech Center: 2100
Rejection Summary
- 102 Rejections: 2
- 103 Rejections: 2
Cited Patents
The following patents were cited in the rejection:
Office Action Text
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This action is in response to the amendment filed 10/23/2024 concurrently with a petition to revive an abandoned application on the grounds that the failure to reply was unintentional (37 CFR 1.137(a)), which was granted on 1/02/2025.1 In the amendment, claims 1, 4, 8, 10 and 16-20 were amended, and no claims were cancelled or added. Thus, claims 1-20 are pending. The objections to the drawings and specification, set forth in the previous Office Action, are withdrawn in view of the 10/23/2024 amendments to the specification. The previous objections to claims 10 and 16-20 are withdrawn in view of the amendments to the claims. However, objections to claims 8-20 remain, as documented below. The previous rejections of claims 4, 6 and 16-20 under 35 U.S.C. 112(b) are withdrawn in view of the amendments to the claims. However, rejections of claims 1-20 under 35 U.S.C. 112(b) remain, as documented below.
Response to Arguments
Applicant's arguments filed 10/23/2024 with respect to the objections to the specification and drawings set forth in the previous Office Action have been fully considered and are persuasive. Applicant's arguments with respect to the objections to claims 10 and 16-20, set forth in the previous Office Action, have been fully considered and are persuasive. However, objections to claims 8-20 remain, as documented below. Applicant's arguments with respect to the rejections of claims 4, 6 and 16-20 under 35 U.S.C. 112(b), set forth in the previous Office Action, have been fully considered and are persuasive. However, rejections of claims 1-20 under 35 U.S.C. 112(b) remain, as documented below. Applicant's arguments with respect to the rejections of claims 1-20 under 35 U.S.C. 103 have been fully considered but are moot because the arguments do not apply to the combination of references used in the current rejections. In particular, as discussed in detail below, a new combination of references (i.e., Hammond in view of Vu and further in view of the newly-cited reference Gopalan (U.S. Patent Application Pub. No. 2018/0278543 A1, hereinafter “Gopalan”)) is applied to reject amended independent claims 1, 8 and 16 as well as dependent claims 2-6, 9-11, 13-15 and 17-20. Also, another new combination of references (i.e., Hammond in view of Vu and the newly-cited Gopalan reference, and further in view of Ellenbogen) is applied to reject amended dependent claims 7 and 12. With reference to amended claim 1, Applicant states “Currently amended independent claim 1 recites: transmitting, from the first host system to the second host system, a new model trained on data received from one or more remote locations; training, at the second host system, a new model, customized as a function of historical data at the second host system; evaluating, at the second host system, the new model; and installing the new model at the second host system applied to live event stream data. 
Applicant respectfully submits that Hammond and Vu, whether taken alone, in combination, or in combination with any of the other referenced prior art, teaches the aforementioned amendments to Claim 1.” before asserting “nothing in the prior art discusses” the above-noted limitations (applicant’s remarks, pages 8-9). With continued reference to amended claim 1 Applicant states “the Examiner points to numerous paragraphs in Hammond … for the showing of “receiving, by a first host system, training event data filtered from an event data stream by a second host system; training, by the first host system, a neural network, based on the filtered training event data", as required by Claim 1. However, "filter[ing]" is only mentioned as one word in paragraph [0080] of Hammond.” Id. Applicant then asserts, which examiner does not concede, that “In light of the foregoing, Applicant submits that Claim 1 should be in condition for allowance. Additionally, Claims 2-7 that flow from independent Claim 1 should also therefore be allowable. Finally, Claim 8 and Claims 9-15 that flow therefrom, and Claim 16, and Claims 17-20 that flow therefrom, should also be in condition for allowance, as they contain amendments that are similar to those in Claim 1” (applicant’s remarks, page 9). Accordingly, applicant appears to argue that the claim limitations recited, using respective similar language, in amended independent claims 1 and 16, i.e., “transmitting, from the first host system to the second host system, a new model trained on data received from one or more remote locations; training, at the second host system, a new model, customized as a function of historical data at the second host system; evaluating, at the second host system, the new model; and installing the new model at the second host system applied to live event stream data”2, are not taught in the portions of the Hammond and Vu references cited to reject independent claims 1 and 16 in the previous Office Action. Applicant also appears to argue that the claim limitations recited, using respective similar language, in amended independent claims 1 and 16, i.e., “receiving, by a first host system, training event data filtered from an event data stream by a second host system; training, by the first host system, a neural network, based on the filtered training event data”, are not taught in the portions of the Hammond and Vu references cited to reject independent claims 1 and 16 in the previous Office Action. The examiner respectfully disagrees with applicant’s assertions and points applicant to the below discussion of Hammond, Vu and the newly-cited Gopalan reference. 
Regarding the new limitation “transmitting, from the first host system to the second host system, a new model trained on data received from one or more remote locations” added, using respective similar language, to claims 1 and 16, the examiner points to paragraphs 32, 34, 38, 58 and 80 of Hammond, which explicitly disclose “an AI engine hosted on one or more remote servers … The one or more AI engine modules can include an instructor module and a learner module configured to train an AI model”, “train the neural network 104 to provide a trained neural network 106, and deploy the trained neural network 106 as a deployed neural network 108”, “the training data source 214 can send the training data to the AI generator”, “[t]he AI engine can be a cloud-hosted platform-as-a-service … [t]hus, the AI engine can be accessible with one or more client-side interfaces … let the online AI engine build and generate a trained intelligence model for one or more of the third parties”, “the learning system to use for training … can run on a local client machine and stream data to the remote AI engine for training” [i.e., transmitting/deploying new model 108 trained on data received from remote locations and the new model from a 1st host system/server to a 2nd host system via interface]. With regard to the newly-added limitation “training, at the second host system, a new model, customized as a function of historical data at the second host system” recited, using respective similar language, in claims 1 and 16, the examiner points to paragraphs 34, 58 and 80 of Hammond, which explicitly disclose “deploy[ing] the trained neural network 106 as a deployed neural network 108”, “build and generate a trained intelligence model for one or more of the third parties” [i.e., the 2nd host system], “the data can be … passed to the learning system … for the learning system to use for training” [i.e., training the new model 108 as a function of training data at the 2nd host system]. With continued reference to the above-noted new model, customized as a function of historical data at the second host system, the examiner also points to paragraphs 23, 26 and 36 of Gopalan, which explicitly disclose “prior to implementing machine learning application on actual/current network video traffic, the machine learning application is trained with historical network video traffic.”, “the neural network is trained with historical network video traffic. For example, a node may model a communication link … input modeling video traffic from media server 124 may be weighted”, “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training” [i.e., customizing/training the new neural network model as function of historical data at a host system/server]. Regarding the new limitation “evaluating, at the second host system, the new model” added, using respective similar language, to claims 1 and 16, the examiner points to paragraphs 27, 77 and 128 of Hammond, which explicitly disclose “An ‘AI model’ as used herein includes … neural networks”, “The AI engine or a predictor module thereof can … instantiate a number of trained neural networks [i.e., including the new AI model] based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) 
among the number of trained neural networks” [i.e., evaluating the new model based on prediction accuracy], “the one or more client systems 210 of FIGS. 2A and 3A … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., the new model can be deployed to and evaluated by the second host system 210]. With regard to the newly-added “installing the new model at the second host system applied to live event stream data” limitation recited, using respective similar language, in claims 1 and 16, the examiner points to paragraphs 34, 36, 80, 82 and 91 of Hammond, which explicitly disclose “deploy[ing] the trained neural network 106 as a deployed neural network 108 … the trained AI model or the trained neural network 106 can be deployed in … a hardware-based system”, “included in the one or more server systems 220, or the training data source 214 can be include[d] in both the one or more client systems 210”, “run on a local client machine and stream data to the remote AI engine for training”, “Data can be streamed into the BRAIN server … the data can flow through the nodes in the BRAIN model”, “server can take a trained BRAIN model, enable API endpoints so that data can be streamed to and from the model” [i.e., deploying/installing the new model 108 at the 2nd host system 210, where the model is applied to stream data]. With continued reference to the above-noted the new model at the second host system applied to live event stream data limitation, the examiner also points to paragraphs 38-39 of Gopalan, which explicitly disclose that “media server 122 may provide media content of a live event … media server 122 can receive real-time, live media content of a live concert to be streamed to users”, “media server 122 can increase its level 402, of network video traffic … the network manager 102 re-provisions the network resources … according to the re-trained machine learning application. 
Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” [i.e., the new, re-trained neural network model at the 2nd host system/server is applied to live event stream data]. With reference to the above-noted “receiving, by a first host system, training event data filtered from an event data stream by a second host system” limitation recited, using respective similar language, in claims 1 and 16, the examiner points to paragraphs 6, 32, 36, 46, 61 and 80 of Hammond, which explicitly disclose that “The AI system can further include one or more training data sources configured to provide training data, wherein the one or more training data sources includes at least one server-side training data source or at least one client-side training data source configured to provide the training data”, “an AI engine hosted on one or more remote servers” [i.e., including a first host system/server], “lessons for training the AI model … configured to optionally use a different flow of the training data”, “AI system 200 includes one or more client systems 210 [i.e., including a second host system] and one or more server systems 220 … client systems 210 can further include a training data source 214” [i.e., receive, by 1st host system, training data from a second host system/training data source 214], “one or more data transformation streams”, “AI system 500 can include a training data loader 521 configured to load training data … and a streaming data server 523. The training data can be … streamed training data”, “the data can be optionally filtered/augmented in the lessons before being passed to the learning system … subsequently produce a piece of data for the learning system to use for training” [i.e., filtered training data from a data stream]. Regarding the above-noted “training, by the first host system, a neural network, based on the filtered training event data” limitation recited, using respective similar language, in claims 1 and 16, the examiner points to paragraphs 34, 36 and 38 of Hammond, which explicitly disclose “train[ing] the neural network 104 to provide a trained neural network 106”, “one or more server systems 220 can be remote server systems and include … an AI generator 223 for generating the trained neural network 106”, “The AI generator can request training data from the training data source 214, and the training data source 214 can send the training data to the AI generator 223 … AI generator 223 can subsequently train the neural network 104 on the training data … to provide a trained state of the neural network or the trained neural network 106” [i.e., train a neural network based on the filtered training data]. The examiner respectfully disagrees with applicant’s above-noted assertions and characterizations of the prior art references. As detailed below, the combination of Hammond, Vu and the newly-cited reference Gopalan teaches all of the limitations of amended independent claims 1, 8 and 16 as well as dependent claims 2-6, 9-11, 13-15 and 17-20. As further detailed below, another new combination of references (i.e., Hammond in view of Vu and Gopalan, and further in view of Ellenbogen) teaches the features of dependent claims 7 and 12. 
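For the reader's convenience only, the following is a minimal, purely illustrative Python sketch of the two-host flow recited in claims 1 and 16 as characterized above: the second host system filters an event data stream and provides training event data, the first host system trains and evaluates a model and transmits a new model, and the second host system customizes that model as a function of its historical data, evaluates it, and installs it for application to live event stream data. All function names, data fields and the stand-in threshold "model" are hypothetical and are not drawn from Hammond, Vu, Gopalan or applicant's specification.

```python
import random

def filter_events(event_stream):
    """Second host system: keep only events deemed useful for training (hypothetical rule)."""
    return [e for e in event_stream if e["confidence"] < 0.9]

def train_model(events):
    """First host system: fit a stand-in 'model' (a single decision threshold) to labeled events."""
    pos = [e["score"] for e in events if e["label"] == 1]
    neg = [e["score"] for e in events if e["label"] == 0]
    return {"threshold": (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2}

def customize(model, historical_events):
    """Second host system: re-train the transmitted model as a function of local historical data."""
    local = train_model(historical_events)
    return {"threshold": (model["threshold"] + local["threshold"]) / 2}

def accuracy(model, events):
    """Evaluate a model as the fraction of events it labels correctly."""
    return sum((e["score"] >= model["threshold"]) == bool(e["label"]) for e in events) / len(events)

random.seed(0)
stream = [{"score": random.random(), "label": random.randint(0, 1),
           "confidence": random.random()} for _ in range(200)]

training_events = filter_events(stream)        # filtered from the event stream by the second host
new_model = train_model(training_events)       # trained and evaluated at the first host system
print("first-host evaluation:", accuracy(new_model, training_events))

historical = stream[:100]                      # stand-in for the second host's historical data
customized = customize(new_model, historical)  # "customized as a function of historical data"
print("second-host evaluation:", accuracy(customized, historical))

installed = customized                         # installed at the second host system
live_event = {"score": 0.7}                    # applied to live event stream data
print("alert" if live_event["score"] >= installed["threshold"] else "no alert")
```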
Claim Objections
Claims 8-20 are objected to because of the following informalities: The last step of independent claim 8 recites “installing the second neural network on said second host system upon validation the second neural network performs more accurately than the baseline neural network.” This recitation is grammatically incorrect and appears to be missing punctuation and/or one or more words. In particular, it appears that one or more words are missing between “system” and “upon”, and commas “,” are missing between “host” and “system” and between “validation” and “the”. If supported by the original specification, the examiner suggests that one way to overcome this objection is to amend the last two lines of claim 8 to recite “installing the second neural network on said second host system, wherein upon validation, the second neural network performs more accurately than the baseline neural network.” Appropriate correction is required. Independent claim 16 includes a superfluous “and” between “event data;” and “evaluate” in lines 9-10. Appropriate correction is required. Claim 16 recites “a new model” in lines 12 and 14. It appears the second recitation of “a new model” should recite “[[a]] the new model” to refer to the previously-introduced “a new model” (see, e.g., the subsequent recitation of “the new model” in line 16 of the claim). Appropriate correction is required. Also, claims 9-15 and 17-20, which each depend directly from claims 8 and 16, respectively, are objected to based on their respective dependencies from claims 8 and 16.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. Claims 1-20 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention. The last line of amended claim 1 recites “installing the new model at the second host system applied to live event stream data.” This recitation is grammatically incorrect and unclear. In particular, it is unclear what is being “applied to live event stream data.” That is, it is unclear if the “new model” or the “second host system” is to be “applied to live event stream data.” For the purposes of determining patent eligibility and comparison with the prior art, the examiner is interpreting the term “installing the new model at the second host system applied to live event stream data” as “installing the new model at the second host system, wherein the new model is applied to live event stream data.” Appropriate correction is required. The last two lines of amended independent claim 8 recite “upon validation the second neural network performs more accurately than the baseline neural network”. This recitation appears to be missing one or more words and punctuation, and is unclear. In particular, the term “the second neural network performs more accurately than the baseline neural network” in independent claim 8 is a relative term which renders the claim indefinite. The term “more accurately” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. 
Aside from merely repeating the claim language (See, e.g., paragraphs 6 and 13-14) and providing examples by stating “a model once built can be customized to be more accurate for a specific location.” and “In some designs, this process is typically repeated 10 or more times (called epochs) to increase the accuracy of the resulting neural network” (See, e.g., paragraphs 47-48), the specification does not provide a standard for ascertaining the requisite degree of accuracy for the recited “more accurately”. As such, the specification does not provide a standard, measurement or metric for determining the requisite degree of accuracy for the claimed “more accurately” in claim 8. For examination purposes, “upon validation the second neural network performs more accurately than the baseline neural network” is being interpreted as “wherein, upon validation, the second neural network produces output or results that have an accuracy that is greater than an accuracy of the baseline neural network.” The last line of amended independent claim 16 recites “install the new model at the second host system for application to live event stream data.” This recitation is grammatically incorrect and unclear. In particular, it is unclear what object or entity is “for application to live event stream data.” That is, it is unclear if the “new model” or the “second host system” is “for application to live event stream data.” For examination purposes, “install the new model at the second host system for application to live event stream data” is being interpreted as “install the new model at the second host system, wherein the new model is for application to live event stream data.” Appropriate correction is required. Also, claims 2-7, 9-15 and 17-20, which each depend directly or indirectly from claims 1, 8 and 16, respectively, are rejected under 35 U.S.C. 112(b) as being indefinite under the same rationale as claims 1, 8 and 16.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4.
Considering objective evidence present in the application indicating obviousness or nonobviousness. This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1-6, 8-11 and 13-20 are rejected under 35 U.S.C. 103 as being unpatentable over Hammond et al. (U.S. Patent Application Pub. No. 2017/0213128, hereinafter “Hammond”) in view of non-patent literature Vu, Van-Thinh et al. (“Audio-Video Event Recognition System for Public Transport Security.” (31 Jan 2006), hereinafter “Vu”) and further in view of Gopalan (U.S. Patent Application Pub. No. 2018/0278543 A1, hereinafter “Gopalan”). Hammond was filed on January 26, 2017 and claims priority to U.S. Provisional application No. 62/287,861, filed on January 27, 2016, and both of these dates are before the effective filing date of this application, i.e., April 19, 2017. Therefore, Hammond constitutes prior art under 35 U.S.C. 102(a)(2). Gopalan was filed on March 22, 2017, and this date is before the effective filing date of this application, i.e., April 19, 2017. Therefore, Gopalan constitutes prior art under 35 U.S.C. 102(a)(2). Regarding claim 1, Hammond discloses the invention as claimed including a method to train a neural network (see, e.g., paragraphs 27, 32, and 34, “An ‘AI model’ as used herein includes, but is not limited to, neural networks”, “AI-engine modules can include … a learner module configured to train an AI model”, “methods provided herein … train the AI model to provide a trained AI model”), comprising: receiving, by a first host system, training … data filtered from … data stream by a second host system (see, e.g., paragraphs 6, 32, 36, 46, 61 and 80, “The AI system can further include one or more training data sources configured to provide training data, wherein the one or more training data sources includes at least one server-side training data source or at least one client-side training data source configured to provide the training data”, “an AI engine hosted on one or more remote servers” [i.e., including a first host system/server], “lessons for training the AI model … configured to optionally use a different flow of the training data”, “AI system 200 includes one or more client systems 210 [i.e., including a second host system] and one or more server systems 220 … client systems 210 can further include a training data source 214” [i.e., receiving training data from a second host system/training data source 214], “one or more data transformation streams”, “AI system 500 can include a training data loader 521 configured to load training data … and a streaming data server 523. 
The training data can be … streamed training data”, “the data can be optionally filtered/augmented in the lessons before being passed to the learning system … subsequently produce a piece of data for the learning system to use for training” [i.e., filtered training data from a data stream]); training, by the first host system, a neural network, based on the filtered training … data (see, e.g., paragraphs 34, 36 and 38, “train the neural network 104 to provide a trained neural network 106”, “one or more server systems 220 can be remote server systems and include … an AI generator 223 for generating the trained neural network 106”, “The AI generator can request training data from the training data source 214, and the training data source 214 can send the training data to the AI generator 223 … AI generator 223 can subsequently train the neural network 104 on the training data … to provide a trained state of the neural network or the trained neural network 106” [i.e., training a neural network based on the filtered training data]); evaluating, by the first host system, the neural network (see, e.g., paragraphs 38 and 77, “The AI generator 223 can elicit a prediction from the trained neural network 106 and send the prediction”, “The AI engine or a predictor module thereof can … instantiate a number of trained neural networks based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks” [i.e., evaluating the neural network based on prediction accuracy]); transmitting, from the first host system to the second host system, a new model trained on data received from one or more remote locations (see, e.g., paragraphs 32, 34, 38, 58 and 80, “an AI engine hosted on one or more remote servers … The one or more AI engine modules can include an instructor module and a learner module configured to train an AI model”, “train the neural network 104 to provide a trained neural network 106, and deploy the trained neural network 106 as a deployed neural network 108”, “the training data source 214 can send the training data to the AI generator”, “[t]he AI engine can be a cloud-hosted platform-as-a-service … [t]hus, the AI engine can be accessible with one or more client-side interfaces … let the online AI engine build and generate a trained intelligence model for one or more of the third parties”, “the learning system to use for training … can run on a local client machine and stream data to the remote AI engine for training” [i.e., transmitting/deploying new model 108 trained on data received from remote locations and the new model from a 1st host system/server to a 2nd host system via interface]); training, at the second host system, the new model, customized as a function of … data at the second host system (see, e.g., paragraphs 34, 58 and 80, “deploy the trained neural network 106 as a deployed neural network 108”, “build and generate a trained intelligence model for one or more of the third parties” [i.e., the 2nd host system], “the data can be … passed to the learning system … for the learning system to use for training” [i.e., training the new model 108 as a function of training data at the 2nd host system]); evaluating, at the second host system, the new model (see, e.g., paragraphs 27, 77 and 128, “An ‘AI model’ as used herein includes … neural networks”, “The AI 
engine or a predictor module thereof can … instantiate a number of trained neural networks [i.e., including the new AI model] based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks” [i.e., evaluating the new model based on prediction accuracy], “the one or more client systems 210 of FIGS. 2A and 3A … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., the new model can be deployed to and evaluated by the second host system 210]); and installing the new model at the second host system applied to … stream data3 (see, e.g., paragraphs 34, 36, 80, 82 and 91 “deploy the trained neural network 106 as a deployed neural network 108 … the trained AI model or the trained neural network 106 can be deployed in … a hardware-based system”, “included in the one or more server systems 220, or the training data source 214 can be include[d] in both the one or more client systems 210”, “run on a local client machine and stream data to the remote AI engine for training”, “Data can be streamed into the BRAIN server … the data can flow through the nodes in the BRAIN model”, “server can take a trained BRAIN model, enable API endpoints so that data can be streamed to and from the model” [i.e., deploying/installing the new model 108 at the 2nd host system 210, where the model is applied to stream data]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose event data … from an event data stream, training a neural network, based on the … event data and evaluating … based on cross-validation as a function of data accessible only to the first host system. In the same field, analogous art Vu teaches event data … from an event data stream (see, e.g., FIG. 1 - “An intelligent audio-video surveillance system … using a priori knowledge and processing audio-video streams” and including “Video Event Detection” module, sections 1 and 2 “solutions for the automatic surveillance … by analyzing human behaviors based on audio-video stream interpretation”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of a knowledge base containing a priori knowledge and … modules: …Temporal Multi-Camera Analysis, … Video Event Detection and … Audio-Video Event Recognition”, “models are then used by the SAMSIT platform for interpreting audio-video streams” [i.e., event data from an event data stream]), training … based on the … event data (see, e.g., sections 4.2, 6.2 and 8, “face descriptors are used as classifiers … discriminative descriptors are searched by training … on a large database containing face and non-face samples”, “[e]ach model that appears as a component of the tree is computed during a training step … based on … parameters extracted from a training data corpus”, “audio-video event recognition aims at recognizing complex temporal events that combine both audio and video events. 
Those events are defined in the knowledge base” [i.e., training based on event data]), and evaluating … based on cross-validation as a function of data accessible only to the first host system (see, e.g., section 6.3 – “Results of Cross-Validation”, “Cross-validation aims at estimating how well the model we have learned from some training data [i.e., a trained model/neural network] is going to perform on future unknown data [i.e., as a function of data only accessible to the first host system] … the model is trained on all the training data except for one … the learned model is evaluated on the remaining data [i.e., evaluating the learned model/neural network]. Both steps are repeated such that each data is used once as the validation data. The evaluation process we achieved focuses not only on the good or bad detection of events, but also on the precision on the time scale of the detection”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Although Hammond in view of Vu substantially teaches the claimed invention, Hammond in view of Vu does not explicitly teach the new model, customized as a function of historical data at the second host system; and the new model at the second host system applied to live event stream data. In the same field, analogous art Gopalan teaches the new model, customized as a function of historical data at the second host system (see, e.g., paragraphs 23, 26 and 36, “prior to implementing machine learning application on actual/current network video traffic, the machine learning application is trained with historical network video traffic.”, “the neural network is trained with historical network video traffic. For example, a node may model a communication link … input modeling video traffic from media server 124 may be weighted”, “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training” [i.e., customizing/training the new neural network model as function of historical data at a host system/server]); and the new model at the second host system applied to live event stream data4 (see, e.g., paragraphs 38-39, “media server 122 may provide media content of a live event … media server 122 can receive real-time, live media content of a live concert to be streamed to users”, “media server 122 can increase its level 402, of network video traffic … the network manager 102 re-provisions the network resources … according to the re-trained machine learning application. 
Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” [i.e., the new, re-trained neural network model at the 2nd host system/server is applied to live event stream data]). Additionally or alternatively, in the same field, analogous art Gopalan also teaches event data … from an event data stream (see, e.g., paragraphs 3 and 38, “content producers may stream video of live events to users. Thus, communication networks manage and support streaming/downloading of stored video content from media content providers as well as support streaming video of live events.”, “media server 122 may provide media content of a live event some time in the future. For example, media server 122 can receive real-time, live media content of a live concert to be streamed to users”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Gopalan with Hammond in view of Vu to provide a “network manager 102 [that] re-trains the machine learning application using the current network video traffic” where “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” (See, e.g., Gopalan, paragraph 36). Doing so would have enabled Hammond in view of Vu to use Gopalan’s network manager and neural network of the machine learning application where “the network manager 102 observes or otherwise detects current network video traffic that does not conform to the historical network video traffic or data used in training the machine learning application” and “the network manager can adjust, as part of re-training the machine learning, one or more Markov logic state machines to include the one or more new statistical states or adjust the machine learning application” so that “Incorporating the Markov logic state machine with a neural network allows for a network manager to more robustly allocate network resources to efficiently carry the current network video traffic across communication networks”, as suggested by Gopalan (See, e.g., Gopalan, paragraphs 37 and 34). Regarding claim 4, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 1. Hammond further discloses the training … data being filtered from the … data stream … in the neural network (see, e.g., paragraphs 61, 80 and 87, “The training data can be … streamed training data”, “the data can be optionally filtered/augmented in the lessons before being passed to the learning system. The simulator can use this data to configure itself, and the simulator can subsequently produce a piece of data for the learning system to use for training … simulation can run on a local client machine and stream data to the remote AI engine for training” [i.e., training data being filtered from the data stream], “provide one or more training status updates on training a neural network” [i.e., in the neural network]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose event data being filtered from the event data stream as a function of a prediction of a contribution of each event in the event data stream to the reduction of error. 
In the same field, analogous art Vu teaches event data being filtered from the event data stream (see, e.g., FIG. 1 - “An intelligent audio-video surveillance system … using a priori knowledge and processing audio-video streams” and including “Video Event Detection” module, sections 1, 2 and 4.4-4.5 “solutions for the automatic surveillance … by analyzing human behaviors based on audio-video stream interpretation”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of a knowledge base containing a priori knowledge and … modules: …Temporal Multi-Camera Analysis, … Video Event Detection and … Audio-Video Event Recognition”, “models are then used by the SAMSIT platform for interpreting audio-video streams”, “to temporally associate the detection results, a tracking module filter … was implemented”, “false positive detections can be partly filtered out with the tracking algorithm” [i.e., event data filtered from an event data stream]) as a function of a prediction of a contribution of each event in the event data stream to the reduction of error (see, e.g., sections 4.5 and 6.1, “At each stage of the training process, 10000 negative samples were randomly selected from classification errors at the previous stage … The false positive rate after training was 1.1e-9. On a test sequence of 500 frames, we obtained a false positive rate equal to 1.7e-7 and a detection rate superior to 94%. Moreover, the false positive detections can be partly filtered out with the tracking algorithm”, “detecting the changes in the autoregressive models through the prediction errors computed on two analysis windows … [t]o … reduce the … false detection rate, we merge quasi-adjacent segments.” [i.e., filtered as a function of a prediction of contribution of each event to reduction of error/reduction of classification errors/false positives/false detections]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Regarding claim 6, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 4. Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose the prediction of the contribution of each event in the event data stream being based on a determination of whether the prediction is correct or incorrect. In the same field, analogous art Vu teaches the prediction of the contribution of each event in the event data stream being based on a determination of whether the prediction is correct or incorrect (see, e.g., sections 4.2 and 6.1, “training … on a large database containing face and non-face samples. 
In the training algorithm, a decision stump is associated to each histogram component, and the classifiers are cascaded … [e]ach stage of the cascade was trained to perform 99.9% of positive detection and 50% of false alarm” [i.e., predict contribution of events based on whether a prediction is correct/positive detection or incorrect/false alarm], “detecting the changes in the autoregressive models through the prediction errors computed on two analysis windows”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Regarding independent claim 8, Hammond discloses the invention as claimed including a method to train a neural network (see, e.g., paragraphs 27, 32, and 34, “An ‘AI model’ as used herein includes, but is not limited to, neural networks”, “AI-engine modules can include … a learner module configured to train an AI model”, “methods provided herein … train the AI model to provide a trained AI model”), comprising: sending from a first host system, a first neural network to a second host system (see, e.g., paragraphs 32, 36, 38 and 58, “an AI engine hosted on one or more remote servers” [i.e., including a first host system/server], “AI system 200 includes one or more client systems 210” [i.e., including a second host system], “compiler 222 can send the compiled code or assembly code to the AI generator 223, which … builds a neural network such as the neural network 104” [i.e., a first neural network], “[t]he AI engine can be a cloud-hosted platform-as-a-service … [t]hus, the AI engine can be accessible with one or more client-side interfaces … let the online AI engine build and generate a trained intelligence model for one or more of the third parties [i.e., send first neural network 104 to a third party/second host system via client-side interface]); training, by the second host system, a second neural network as a function of the first neural network and … data stream accessible to the second host system (see, e.g., paragraphs 42, 61, 77 and 80, “Training the neural network 104 can take place in one or more training cycles to yield a trained state of the neural network or the trained neural network 106” [i.e., a second neural network 106 trained as a function of the first neural network 104], “[t]he AI engine … can … instantiate a number of trained neural networks [i.e., including a second neural network] based on the concepts learned by the number of neural networks in the one or more training cycles” [i.e., training the second neural network as a function of the first neural network], “AI system 500 can include a training data loader 521 configured to load training data … and a streaming data server 523. 
The training data can be … streamed training data”, “the data can be … passed to the learning system … for the learning system to use for training” [i.e., training as a function of a data stream accessible to the second host system]); evaluating, at the second host system, the second neural network against a former baseline neural network utilized by the second host system (see, e.g., paragraphs 34, 38, 77 and 128, “train the neural network 104 to provide a trained neural network 106” [i.e., a baseline neural network 104 is trained], “AI system 200 includes one or more client systems 210” [i.e., 2nd host system], “The AI generator 223 can elicit a prediction from the trained neural network 106 and send the prediction”, “The AI engine or a predictor module thereof can … instantiate a number of trained neural networks [i.e., including the 2nd neural network] based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks”, “the one or more client systems 210 … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., evaluating the 2nd neural network 106 at 2nd host system 210 based on prediction accuracy); and installing the second neural network on said second host system … the second neural network performs more accurately than the baseline neural network5 (see, e.g., paragraphs 128, 77 and 102, “the one or more client systems 210 … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., installing/deploying the 2nd neural network 106 on the 2nd host system/210], “the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks.”, “the AI engine can include a meta-learning module configured to keep a record … The record can include … how quickly the trained neural networks were trained to a sufficient level of accuracy, and vi) how accurate the trained neural networks became in making predictions on the training data.” [i.e., the identified/trained 2nd neural network performs more accurately than the baseline neural network 104]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose an event data stream and upon validation the second neural network performs more accurately than the baseline neural network. In the same field, analogous art Vu teaches an event data stream (see, e.g., FIG. 
1 - “An intelligent audio-video surveillance system … using a priori knowledge and processing audio-video streams” and including “Video Event Detection” module, sections 1 and 2 “solutions for the automatic surveillance … by analyzing human behaviors based on audio-video stream interpretation”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of a knowledge base containing a priori knowledge and … modules: …Temporal Multi-Camera Analysis, … Video Event Detection and … Audio-Video Event Recognition”, “models are then used by the SAMSIT platform for interpreting audio-video streams” [i.e., an event data stream]) and upon validation the second neural network performs more accurately than the baseline neural network6 (see, e.g., section 6.3 – “Results of Cross-Validation”, “Cross-validation aims at estimating how well the model we have learned from some training data [i.e., estimate accuracy of the trained 2nd model/neural network] is going to perform on future unknown data … the model is trained on all the training data except for one … the learned model is evaluated on the remaining data [i.e., validation of the 2nd learned model/neural network]. Both steps are repeated such that each data is used once as the validation data. The evaluation process we achieved focuses not only on the good or bad detection of events, but also on the precision on the time scale of the detection. The results are then expressed as correctly identified durations and misidentified durations.” [i.e., the validated, 2nd network performs with more precision/more accurately, with more correct identifications than the baseline network]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Additionally or alternatively, in the same field, analogous art Gopalan also teaches an event data stream (see, e.g., paragraphs 3 and 38, “content producers may stream video of live events to users. Thus, communication networks manage and support streaming/downloading of stored video content from media content providers as well as support streaming video of live events.”, “media server 122 may provide media content of a live event some time in the future. For example, media server 122 can receive real-time, live media content of a live concert to be streamed to users”). 
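As an illustrative aside only, and consistent with the examiner's interpretation (in the § 112(b) discussion above) of the claim 8 language “upon validation the second neural network performs more accurately than the baseline neural network” as requiring a validation accuracy greater than that of the baseline, the following is a minimal Python sketch of such an install-upon-validation step. The names validation_accuracy and install_if_more_accurate, and the toy predictors, are hypothetical and are not taken from Hammond, Vu, Gopalan or applicant's specification.

```python
def validation_accuracy(predict, validation_events):
    """Fraction of validation events for which the given predictor returns the correct label."""
    return sum(predict(e["features"]) == e["label"] for e in validation_events) / len(validation_events)

def install_if_more_accurate(baseline_predict, candidate_predict, validation_events, install):
    """Install the candidate network only if, upon validation, it is more accurate than the baseline."""
    if validation_accuracy(candidate_predict, validation_events) > \
            validation_accuracy(baseline_predict, validation_events):
        install(candidate_predict)
        return True
    return False

if __name__ == "__main__":
    events = [{"features": x, "label": int(x > 0)} for x in (-2.0, -1.0, 1.0, 2.0)]
    baseline = lambda x: 1                # always predicts the positive class (50% accurate here)
    candidate = lambda x: int(x > 0)      # simple thresholded predictor (100% accurate here)
    installed = []
    install_if_more_accurate(baseline, candidate, events, installed.append)
    print("candidate installed:", bool(installed))   # True
```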
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Gopalan with Hammond in view of Vu to provide a “network manager 102 [that] re-trains the machine learning application using the current network video traffic” where “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” (See, e.g., Gopalan, paragraph 36). Doing so would have enabled Hammond in view of Vu to use Gopalan’s network manager and neural network of the machine learning application where “the network manager 102 observes or otherwise detects current network video traffic that does not conform to the historical network video traffic or data used in training the machine learning application” and “the network manager can adjust, as part of re-training the machine learning, one or more Markov logic state machines to include the one or more new statistical states or adjust the machine learning application” so that “Incorporating the Markov logic state machine with a neural network allows for a network manager to more robustly allocate network resources to efficiently carry the current network video traffic across communication networks”, as suggested by Gopalan (See, e.g., Gopalan, paragraphs 37 and 34). Regarding claim 9, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Hammond further discloses the first host system being centralized to send a first neural network to a plurality of homogeneous second host systems (see, e.g., FIG. 2A – central sever system 220 receives training data from client systems 210 with training data sources 214 and paragraphs 36, 38 and 58, “AI system 200 includes one or more client systems 210 [i.e., homogeneous second host systems] and one or more server systems 220” [i.e., a centralized first host system], “compiler 222 can send the compiled code or assembly code to the AI generator 223, which … builds a neural network such as the neural network 104” [i.e., a first neural network], “[t]he AI engine can be a cloud-hosted platform-as-a-service … [t]hus, the AI engine can be accessible with one or more client-side interfaces … let the online AI engine build and generate a trained intelligence model for one or more of the third parties [i.e., send first neural network 104 to third parties/second host systems via client-side interfaces]). Regarding claim 10, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Hammond further discloses the first neural network is a baseline neural network trained … based on data received from a plurality of homogeneous second host systems (see, e.g., FIG. 2A –system 220 receives training data from client systems 210 with training data sources 214 and paragraphs 34, 36 and 77, “train the neural network 104 to provide a trained neural network 106” [i.e., a baseline neural network is trained], “AI system 200 includes one or more client systems 210 [i.e., homogeneous second host systems] … [t]he one or more client systems 210 can further include a training data source 214”, and “neural networks can be trained in one or more training cycles with the training data from one or more training data sources.” [i.e., data received from homogenous second host systems/client systems 210]). 
Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose the first neural network is … cross-validated based on data received. In the same field, analogous art Vu teaches the first neural network is … cross-validated based on data received (see, e.g., section 6.3 – “Results of Cross-Validation”, “Cross-validation aims at estimating how well the model we have learned from some training data [i.e., a trained model/neural network] is going to perform on future unknown data … the model is trained on all the training data [i.e., data received] except for one … the learned model is evaluated on the remaining data [i.e., the learned model/neural network being cross-validated]. Both steps are repeated such that each data is used once as the validation data. The evaluation process we achieved focuses not only on the good or bad detection of events, but also on the precision on the time scale of the detection”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Regarding claim 13, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Hammond further discloses training, by the second host system, a third neural network as a function of the second neural network (see, e.g., paragraphs 34 and 77, “build a neural network 104, train the neural network 104 to provide a trained neural network 106 [i.e., the second neural network], and deploy the trained neural network 106 as a deployed neural network 108” [i.e., the third neural network deployed as a function of the second neural network 106], “[t]he number of neural networks can be trained [i.e., including the third neural network] in one or more training cycles with the training data from one or more training data sources” [i.e., the third neural network trained as a function of the second neural network]). Regarding claim 14, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Hammond further discloses evaluating, by the second host system, the second neural network (see, e.g., paragraphs 77 and 128, “The AI engine or a predictor module thereof can … instantiate a number of trained neural networks [i.e., including the second neural network] based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks” [i.e., evaluating the second neural network based on prediction accuracy], “the one or more client systems 210 of FIGS. 
2A and 3A … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., the second neural network 106 can be deployed to and evaluated by the second host system 210]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose evaluating ... based on cross-validation as a function of data accessible only to the second host system. In the same field, analogous art Vu teaches evaluating ... based on cross-validation as a function of data accessible only to the second host system (see, e.g., section 6.3 – “Results of Cross-Validation”, “Cross-validation aims at estimating how well the model we have learned from some training data [i.e., a trained model/neural network] is going to perform on future unknown data … the model is trained on all the training data except for one [i.e., as a function of data only accessible to the second host system] … the learned model is evaluated on the remaining data [i.e., evaluating the learned model/neural network]. Both steps are repeated such that each data is used once as the validation data. The evaluation process we achieved focuses not only on the good or bad detection of events, but also on the precision on the time scale of the detection”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Regarding claim 15, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Hammond further discloses the second host system generating a prediction determined as a function of the second neural network and input data (see, e.g., paragraph 36 and 38, “client systems 210 can further include a training data source 214” [i.e., second host system 210], “training data source 214 can send the training data to the AI generator 223 [i.e., input data] … AI generator 223 can subsequently train the neural network 104 on the training data in one or more training cycles to provide a trained state of the neural network or the trained neural network 106 [i.e., second neural network]. The AI generator 223 can elicit a prediction from the trained neural network 106” [i.e., generate a prediction as a function of the second neural network 106 and input data]). 
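Again purely for illustration, the leave-one-out cross-validation procedure quoted above from Vu section 6.3 (train on all training data except one sample, evaluate on the held-out sample, and repeat so that each sample is used exactly once as validation data) might be sketched as follows; the Python code, toy event data, and classifier choice are hypothetical and not drawn from any cited reference.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def leave_one_out_cv(X, y):
    """Train on all samples except one, evaluate on the held-out sample,
    and repeat until every sample has served once as the validation data."""
    correct = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000)
        model.fit(X[mask], y[mask])
        correct += int(model.predict(X[i:i + 1])[0] == y[i])
    return correct / len(X)  # estimated accuracy on future unknown data

# Toy event data local to one host (features, binary event label).
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=30) > 0).astype(int)
print("leave-one-out accuracy:", leave_one_out_cv(X, y))
```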
Regarding independent claim 16, Hammond discloses the invention as claimed including a computerized system for training a neural network (see, e.g., paragraphs 27, 32, and 34, “An ‘AI model’ as used herein includes, but is not limited to, neural networks”, “AI-engine modules can include … a learner module configured to train an AI model”, “systems … provided herein … train the AI model to provide a trained AI model”), said system comprising: a first host system, comprising a processor, a communications means and a training event module stored in non-transitory memory and wherein the first host system is configured to instruct said processor (see, e.g., FIG. 9 and paragraphs 32, 142 and 145, “an AI engine hosted on one or more remote servers” [i.e., including a first host system/server], “computing system 900 can include a processor 920 [i.e., a processor], a memory (e.g., ROM 931, RAM 932, etc.) [i.e., non-transitory memory] … built-in Wi-Fi circuitry to wirelessly communicate with a remote computing device connected to network” [i.e., communication means], “software used to facilitate algorithms discussed herein can be embodied onto a nontransitory machine-readable medium [i.e., module stored in non-transitory memory] … that stores information in a form readable by a machine (e.g., a computer) … or any type of media suitable for storing electronic instructions” [i.e., 1st host system is configured to instruct the processor]) to: receive, by said first host system, training … data filtered from … data stream by a second host system (see, e.g., paragraphs 6, 32, 36, 46, 61 and 80, “The AI system can further include one or more training data sources configured to provide training data, wherein the one or more training data sources includes at least one server-side training data source or at least one client-side training data source configured to provide the training data”, “an AI engine hosted on one or more remote servers” [i.e., including a first host system/server], “lessons for training the AI model … configured to optionally use a different flow of the training data”, “AI system 200 includes one or more client systems 210 [i.e., including a second host system] and one or more server systems 220 … client systems 210 can further include a training data source 214” [i.e., receive, by 1st host system, training data from a second host system/training data source 214], “one or more data transformation streams”, “AI system 500 can include a training data loader 521 configured to load training data … and a streaming data server 523. 
The training data can be … streamed training data”, “the data can be optionally filtered/augmented in the lessons before being passed to the learning system … subsequently produce a piece of data for the learning system to use for training” [i.e., filtered training data from a data stream]); train, by said first host system, a neural network, based on the filtered training … data (see, e.g., paragraphs 34, 36 and 38, “train the neural network 104 to provide a trained neural network 106”, “one or more server systems 220 can be remote server systems and include … an AI generator 223 for generating the trained neural network 106”, “The AI generator can request training data from the training data source 214, and the training data source 214 can send the training data to the AI generator 223 … AI generator 223 can subsequently train the neural network 104 on the training data … to provide a trained state of the neural network or the trained neural network 106” [i.e., train a neural network based on the filtered training data]); and evaluate, by said first host system, the neural network (see, e.g., paragraphs 38 and 77, “The AI generator 223 can elicit a prediction from the trained neural network 106 and send the prediction”, “The AI engine or a predictor module thereof can … instantiate a number of trained neural networks based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks” [i.e., evaluate the neural network based on prediction accuracy]); transmit, from the first host system to the second host system, a new model trained on data received from one or more remote locations (see, e.g., paragraphs 32, 34, 38, 58 and 80, “an AI engine hosted on one or more remote servers … The one or more AI engine modules can include an instructor module and a learner module configured to train an AI model”, “train the neural network 104 to provide a trained neural network 106, and deploy the trained neural network 106 as a deployed neural network 108”, “the training data source 214 can send the training data to the AI generator”, “[t]he AI engine can be a cloud-hosted platform-as-a-service … [t]hus, the AI engine can be accessible with one or more client-side interfaces … let the online AI engine build and generate a trained intelligence model for one or more of the third parties”, “the learning system to use for training … can run on a local client machine and stream data to the remote AI engine for training” [i.e., transmit/deploy new model 108 trained on data received from remote locations and the new model from a 1st host system/server to a 2nd host system via interface]); train, at the second host system, the new model, customized as a function of … data at the second host system (see, e.g., paragraphs 34, 58 and 80, “deploy the trained neural network 106 as a deployed neural network 108”, “build and generate a trained intelligence model for one or more of the third parties” [i.e., the 2nd host system], “the data can be … passed to the learning system … for the learning system to use for training” [i.e., train the new model 108 as a function of training data at the 2nd host system]); evaluate, at the second host system, the new model (see, e.g., paragraphs 27, 77 and 128, “An ‘AI model’ as used herein includes … neural networks”, “The AI engine or a predictor 
module thereof can … instantiate a number of trained neural networks [i.e., including the new AI model] based on the concepts learned by the number of neural networks in the one or more training cycles, and the AI engine can identify a best trained neural network (e.g., by means of optimal results based on factors such as performance time, accuracy, etc.) among the number of trained neural networks” [i.e., evaluating the new model based on prediction accuracy], “the one or more client systems 210 of FIGS. 2A and 3A … can include … the software application or the hardware-based system in which the trained neural network 106 can be deployed.” [i.e., the new model can be deployed to and evaluated by the second host system 210]); and install the new model at the second host system for application to … stream data7 (see, e.g., paragraphs 34, 36, 80, 82 and 91 “deploy the trained neural network 106 as a deployed neural network 108 … the trained AI model or the trained neural network 106 can be deployed in … a hardware-based system”, “included in the one or more server systems 220, or the training data source 214 can be include[d] in both the one or more client systems 210”, “run on a local client machine and stream data to the remote AI engine for training”, “Data can be streamed into the BRAIN server … the data can flow through the nodes in the BRAIN model”, “server can take a trained BRAIN model, enable API endpoints so that data can be streamed to and from the model” [i.e., deploy/install the new model 108 at the 2nd host system 210, where the model is applied to stream data]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose event data … from an event data stream, training a neural network, based on the … event data and evaluate … based on cross-validation as a function of data accessible only to the first host system. In the same field, analogous art Vu teaches event data … from an event data stream (see, e.g., FIG. 1 - “An intelligent audio-video surveillance system … using a priori knowledge and processing audio-video streams” and including “Video Event Detection” module, sections 1 and 2 “solutions for the automatic surveillance … by analyzing human behaviors based on audio-video stream interpretation”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of a knowledge base containing a priori knowledge and … modules: …Temporal Multi-Camera Analysis, … Video Event Detection and … Audio-Video Event Recognition”, “models are then used by the SAMSIT platform for interpreting audio-video streams” [i.e., event data from an event data stream]), training … based on the … event data (see, e.g., sections 4.2, 6.2 and 8, “face descriptors are used as classifiers … discriminative descriptors are searched by training … on a large database containing face and non-face samples”, “[e]ach model that appears as a component of the tree is computed during a training step … based on … parameters extracted from a training data corpus”, “audio-video event recognition aims at recognizing complex temporal events that combine both audio and video events. 
Those events are defined in the knowledge base” [i.e., training based on event data]), and evaluate … based on cross-validation as a function of data accessible only to the first host system (see, e.g., section 6.3 – “Results of Cross-Validation”, “Cross-validation aims at estimating how well the model we have learned from some training data [i.e., a trained model/neural network] is going to perform on future unknown data [i.e., as a function of data only accessible to the first host system] … the model is trained on all the training data except for one … the learned model is evaluated on the remaining data [i.e., evaluate the learned model/neural network]. Both steps are repeated such that each data is used once as the validation data. The evaluation process we achieved focuses not only on the good or bad detection of events, but also on the precision on the time scale of the detection”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Although Hammond in view of Vu substantially teaches the claimed invention, Hammond in view of Vu does not explicitly teach a new model, customized as a function of historical data at the second host system; and the new model at the second host system for application to live event stream data. In the same field, analogous art Gopalan teaches a new model, customized as a function of historical data at the second host system (see, e.g., paragraphs 23, 26 and 36, “prior to implementing machine learning application on actual/current network video traffic, the machine learning application is trained with historical network video traffic.”, “the neural network is trained with historical network video traffic. For example, a node may model a communication link … input modeling video traffic from media server 124 may be weighted”, “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training” [i.e., customizing/training the new neural network model as function of historical data at a host system/server]); and the new model at the second host system for application to live event stream data8 (see, e.g., paragraphs 38-39, “media server 122 may provide media content of a live event … media server 122 can receive real-time, live media content of a live concert to be streamed to users”, “media server 122 can increase its level 402, of network video traffic … the network manager 102 re-provisions the network resources … according to the re-trained machine learning application. 
Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” [i.e., the new, re-trained neural network model at the 2nd host system/server is applied to live event stream data]). Additionally or alternatively, in the same field, analogous art Gopalan also teaches event data … from an event data stream (see, e.g., paragraphs 3 and 38, “content producers may stream video of live events to users. Thus, communication networks manage and support streaming/downloading of stored video content from media content providers as well as support streaming video of live events.”, “media server 122 may provide media content of a live event some time in the future. For example, media server 122 can receive real-time, live media content of a live concert to be streamed to users”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Gopalan with Hammond in view of Vu to provide a “network manager 102 [that] re-trains the machine learning application using the current network video traffic” where “Retraining the machine learning application can include determining the multiple layers of the neural network of the machine learning application according to the re-training based on the current network video traffic.” (See, e.g., Gopalan, paragraph 36). Doing so would have enabled Hammond in view of Vu to use Gopalan’s network manager and neural network of the machine learning application where “the network manager 102 observes or otherwise detects current network video traffic that does not conform to the historical network video traffic or data used in training the machine learning application” and “the network manager can adjust, as part of re-training the machine learning, one or more Markov logic state machines to include the one or more new statistical states or adjust the machine learning application” so that “Incorporating the Markov logic state machine with a neural network allows for a network manager to more robustly allocate network resources to efficiently carry the current network video traffic across communication networks”, as suggested by Gopalan (See, e.g., Gopalan, paragraphs 37 and 34). Regarding claims 2 and 17, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 1 and the system of claim 16. Hammond further discloses the first host system being centralized to receive training … data from a plurality of homogeneous second host systems (see, e.g., FIG. 2A – central server system 220 receives training data from client systems 210 with training data sources 214 and paragraphs 36 and 77, “AI system 200 includes one or more client systems 210 [i.e., homogeneous second host systems] and one or more server systems 220 [i.e., a centralized first host system] … [t]he one or more client systems 210 can further include a training data source 214”, “neural networks can be trained in one or more training cycles with the training data from one or more training data sources.” [i.e., receiving training data from homogeneous second host systems/client systems 210]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose event data. In the same field, analogous art Vu teaches event data (see, e.g., FIG.
1 - “An intelligent audio-video surveillance system” including “Video Event Detection” module, Abstract and section 1, “The Audio-Video Scenario Recognition module performs high level interpretation of the observed objects by combining audio and video events” [i.e., event data], “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of … modules: … Video Event Detection and … Audio-Video Event Recognition”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). Regarding claims 3 and 18, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 1 and the system of claim 16. Hammond further discloses the second host system being remote from the first host system (see, e.g., FIG. 2A - client system 210 is remote from server system 220 and paragraphs 32, 36 and 38, “an AI engine hosted on one or more remote servers”, “one or more server systems 220 can be remote server systems”, “a local client of the one or more clients 210 can send code … to the compiler 222 on a server such as a remote server of the one or more server systems 220” [i.e., second host system 210 is remote from first host system 220]). Regarding claim 19, Hammond in view of Vu and Gopalan teaches the system of claim 16. Hammond further discloses filter the training … data from the … data stream ... in the neural network (see, e.g., paragraphs 61, 80 and 87, “The training data can be … streamed training data”, “the data can be optionally filtered/augmented in the lessons before being passed to the learning system. The simulator can use this data to configure itself, and the simulator can subsequently produce a piece of data for the learning system to use for training … simulation can run on a local client machine and stream data to the remote AI engine for training” [i.e., filter the training data from the data stream], “provide one or more training status updates on training a neural network” [i.e., in the neural network]). Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose filter the training event data from the event data stream as a function of a prediction of the contribution of each event in the event data stream to the reduction of error … ; and wherein the prediction of the contribution of each event in the event data stream is based at least in part on a determination of whether the prediction is correct or incorrect. In the same field, analogous art Vu teaches filter the training event data from the event data stream (see, e.g., FIG.
1 - “An intelligent audio-video surveillance system … using a priori knowledge and processing audio-video streams” and including “Video Event Detection” module, sections 1, 2 and 4.4-4.5 “solutions for the automatic surveillance … by analyzing human behaviors based on audio-video stream interpretation”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of a knowledge base containing a priori knowledge and … modules: …Temporal Multi-Camera Analysis, … Video Event Detection and … Audio-Video Event Recognition”, “models are then used by the SAMSIT platform for interpreting audio-video streams”, “to temporally associate the detection results, a tracking module filter … was implemented”, “false positive detections can be partly filtered out with the tracking algorithm” [i.e., event data filtered from an event data stream]) as a function of a prediction of the contribution of each event in the event data stream to the reduction of error (see, e.g., sections 4.5 and 6.1, “At each stage of the training process, 10000 negative samples were randomly selected from classification errors at the previous stage … The false positive rate after training was 1.1e-9. On a test sequence of 500 frames, we obtained a false positive rate equal to 1.7e-7 and a detection rate superior to 94%. Moreover, the false positive detections can be partly filtered out with the tracking algorithm”, “detecting the changes in the autoregressive models through the prediction errors computed on two analysis windows … [t]o … reduce the … false detection rate, we merge quasi-adjacent segments.” [i.e., filtered as a function of a prediction of contribution of each event to reduction of error/reduction of classification errors/false positives/false detections]) … ; and wherein the prediction of the contribution of each event in the event data stream is based at least in part on a determination of whether the prediction is correct or incorrect (see, e.g., sections 4.2 and 6.1, “training … on a large database containing face and non-face samples. In the training algorithm, a decision stump is associated to each histogram component, and the classifiers are cascaded … [e]ach stage of the cascade was trained to perform 99.9% of positive detection and 50% of false alarm” [i.e., prediction of contribution of events is based at least in part on whether a prediction is correct/positive detection or incorrect/false alarm], “detecting the changes in the autoregressive models through the prediction errors computed on two analysis windows”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Hammond to incorporate the teachings of Vu to provide an audio-video surveillance system including modules for Face Detection and Tracking, Audio Event Detection and Audio-Video Scenario Recognition for automatic surveillance (See, e.g., Vu, Abstract). Doing so would have allowed Hammond to detect abnormal events which are precursors for detecting scenarios which have been predefined and to perform high level interpretation of observed objects by combining audio and video events based on spatio-temporal reasoning and to evaluate performance of the system for a series of pre-defined audio, video and audio-video events (i.e., training based on event data and evaluating based on cross-validation), as suggested by Vu (See, e.g., Vu, Abstract and section 6). 
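For illustration only, the kind of error-driven selection attributed to Vu for claim 19 (retaining events whose current prediction is incorrect, since those are the events expected to contribute most to the reduction of error, while discarding most correctly predicted events) could be sketched as below; the helper names, toy data, and the simple incremental linear classifier standing in for a neural network are hypothetical and not taken from any cited reference.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def filter_training_events(model, events, labels, keep_correct_fraction=0.1):
    """Keep events the current model mispredicts (informative events) plus a
    small random sample of correctly predicted events from the stream."""
    preds = model.predict(events)
    wrong = preds != labels
    rng = np.random.default_rng(0)
    keep = wrong | (rng.random(len(events)) < keep_correct_fraction)
    return events[keep], labels[keep]

# Toy event stream: an initial model, then one filtered update round.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))
y = (X[:, 0] - X[:, 1] > 0).astype(int)
clf = SGDClassifier().partial_fit(X[:100], y[:100], classes=[0, 1])
X_f, y_f = filter_training_events(clf, X[100:], y[100:])
clf.partial_fit(X_f, y_f)  # continue training only on the filtered events
print("kept", len(X_f), "of", len(X) - 100, "streamed events")
```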
Regarding claims 5, 11, and 20, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 1, the method of claim 8, and the system of claim 16. Although Hammond substantially discloses the claimed invention, Hammond does not explicitly disclose the event data stream being based on data from a camera. In the same field, analogous art Vu teaches the event data stream being based on data from a camera (see, e.g., FIG. 1 - “An intelligent audio-video surveillance system” including “Face Detection and Tracking” and “Temporal Multi-Camera Analysis” modules, Abstract, sections 1-2 and 4-5, “Face Detection and Tracking module is responsible for detecting and tracking faces of people in front of cameras”, “Figure 1 shows a near real-time intelligent audio-video surveillance system … composed of … modules: … Face Detection and Tracking … Temporal Multi-Camera Analysis, … Video Event Detection”, “models are then used by the SAMSIT platform for interpreting audio-video streams” [i.e., event data stream], “calibration matrices of the cameras allow the SAMSIT platform to calculate for all detected mobile objects their 3D positions in the real world from their 2D positions in the images”, “faces can be seen in the field of view of one of the cameras”, “information coming from several cameras” [i.e., based on data from a camera]). Claims 7 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Hammond in view of Vu and Gopalan as applied to claim 4 above and further in view of Ellenbogen et al. (U.S. Patent Application Pub. No. 2017/0099200 A1, hereinafter “Ellenbogen”). Ellenbogen was filed on October 6, 2016 and claims priority to U.S. Provisional application No. 62/237,733, filed on October 6, 2015, and both of these dates are before the effective filing date of this application, i.e., April 19, 2017. Therefore, Ellenbogen constitutes prior art under 35 U.S.C. 102(a)(2). Regarding claim 7, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 4. Although Hammond in view of Vu and Gopalan substantially teaches the claimed invention, Hammond in view of Vu and Gopalan does not explicitly teach the error in the neural network being determined based on a logistic regression. In the same field, analogous art Ellenbogen teaches the error in the neural network being determined based on a logistic regression (see, e.g., paragraph 148, “where the machine computation component is a convolutional neural net, the convolutional neural network's last layer can be a logistic regression layer [i.e., a logistic regression], which classifies image patches into labels. During the training phase this value can be set to 1 for positive examples and to 0 for negative examples” [i.e., the error in the neural network]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Ellenbogen with Hammond in view of Vu and Gopalan to provide an analysis platform including predictive models built using a machine learning algorithm such as a deep learning neural network where the analysis platform can be applied to deployment types including closed circuit television, surveillance cameras and retail cameras (i.e., event data streams) (See, e.g., Ellenbogen, paragraphs 53-54).
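Purely as an illustration of the mechanism Ellenbogen is cited for (a logistic-regression last layer whose targets are 1 for positive examples and 0 for negative examples, with the weights found by backpropagation of the error), a minimal NumPy sketch follows; it is not Ellenbogen's convolutional network, and all layer sizes, data, and hyperparameters are hypothetical.

```python
import numpy as np

# One hidden layer with a sigmoid (logistic-regression) output layer; the
# weights are found by backpropagating the cross-entropy error, with targets
# set to 1 for positive examples and 0 for negative examples.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

W1, b1 = rng.normal(scale=0.1, size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)
lr = 0.1
for _ in range(500):
    # forward propagation
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)            # logistic-regression last layer
    # backward propagation of the error (gradient of cross-entropy loss)
    d2 = (p - y) / len(X)
    dW2, db2 = h.T @ d2, d2.sum(axis=0)
    d1 = (d2 @ W2.T) * (1 - h ** 2)
    dW1, db1 = X.T @ d1, d1.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("training accuracy:", ((p > 0.5) == y).mean())
```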
Doing so would have enabled Hammond in view of Vu and Gopalan to improve machine learning performance and machine decision making by reducing false alarms, reducing or eliminating false-negative or false-positive alerts (i.e., reduction of error in the neural network), as suggested by Ellenbogen (See, e.g., Ellenbogen, paragraphs 53 and 56). Regarding claim 12, as discussed above, Hammond in view of Vu and Gopalan teaches the method of claim 8. Although Hammond in view of Vu and Gopalan substantially teaches the claimed invention, Hammond in view of Vu and Gopalan does not explicitly teach training, by the second host system, the second neural network, being based on a technique selected from the group consisting of forward propagation, and backward propagation. In the same field, analogous art Ellenbogen teaches training, by the second host system, the second neural network, being based on a technique selected from the group consisting of forward propagation, and backward propagation (see, e.g., paragraphs 99 and 167, “Back propagation, expectation-maximization, and other machine learning type classifiers are possible. Features can include a back propagation like weighting system to rate agents”, “labeled bounding boxes and object labels are used to guide the process of finding the correct weights using a backpropagation algorithm” [i.e., training the second neural network based on backward propagation/back propagation]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine Ellenbogen with Hammond in view of Vu and Gopalan to provide an analysis platform including predictive models built using a machine learning algorithm such as a deep learning neural network where the analysis platform can be applied to deployment types including closed circuit television, surveillance cameras and retail cameras (i.e., event data streams) (See, e.g., Ellenbogen, paragraphs 53-54). Doing so would have enabled Hammond in view of Vu and Gopalan to improve machine learning performance and machine decision making by reducing false alarms, reducing or eliminating false-negative or false-positive alerts (i.e., reduction of error in the neural network), as suggested by Ellenbogen (See, e.g., Ellenbogen, paragraphs 53 and 56). Conclusion Applicant's amendment necessitated the new grounds of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. The prior art made of record, listed on the accompanying PTO-892 Notice of References Cited form, and not relied upon is considered pertinent to applicant's disclosure. For example, Roy et al. (U.S. Patent Application Pub. No.
2021/0089829 A1, hereinafter “Roy”) discloses “streaming data, often times received and processed in real-time, or near real-time, is processed sequentially and incrementally on a record-by-record basis, or over sliding time windows, and used for a wide variety of analytics including correlations, aggregations, filtering … applying machine learning algorithms, and extract deeper insights from the data. Stored data, by contrast, is historical” and “the method 500 receives a volume of data, for example, a stream of high dimensional data at 505, or from a store of historical data, and trains the data to create training examples” [i.e., real-time/live event data stream, filtering data, and training a model as a function of historical data] (see, e.g., paragraphs 11 and 15). The examiner requests, in response to this office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application. When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the reference cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c). Any inquiry concerning this communication or earlier communications from the examiner should be directed to RANDY K BALDWIN whose telephone number is (571)270-5222. The examiner can normally be reached on Mon - Fri 9:00-6:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar can be reached on 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /RANDALL K. BALDWIN/Primary Examiner, Art Unit 2125 1 As noted in the petition to revive decision dated January 2, 2025, the present application became abandoned April 24, 2020 for failure to timely submit a proper reply to the non-final Office action issued January 23, 2020 (hereinafter “the previous Office Action”). The previous Office Action set a three-month shortened statutory period of time for reply. No extensions of time under 37 CFR 1.136(a) were procured. The Notice of Abandonment was mailed August 24, 2020.
2 Examiner notes that, contrary to applicant’s assertions vis-à-vis independent claim 8 containing “amendments that are similar to those in Claim 1.” (applicant’s remarks, page 9), while claim 8 is amended, it does not recite the above-noted limitations relied upon in applicant’s arguments vis-à-vis claim 1. 3 As noted above in the section 112(b) rejection of this claim, “installing the new model at the second host system applied to … stream data” has been interpreted as “installing the new model at the second host system, wherein the new model is applied to … stream data.” 4 As noted above in the section 112(b) rejection of this claim, “the new model … applied to live event stream data” has been interpreted as “the new model … , wherein the new model is applied to live event stream data.” 5 As noted above in the section 112(b) rejection of this claim, “upon validation the second neural network performs more accurately than the baseline neural network” has been interpreted as “wherein, upon validation, the second neural network produces output or results that have an accuracy that is greater than an accuracy of the baseline neural network.” 6 As noted above in the section 112(b) rejection of this claim, “upon validation the second neural network performs more accurately than the baseline neural network” has been interpreted as “wherein, upon validation, the second neural network produces output or results that have an accuracy that is greater than an accuracy of the baseline neural network.” 7 As noted above in the section 112(b) rejection of this claim, “install the new model at the second host system for application to … stream data” has been interpreted as “install the new model at the second host system wherein the new model is for application to … stream data.” 8 As noted above in the section 112(b) rejection of this claim, “install the new model at the second host system for application to live event stream data” has been interpreted as “install the new model at the second host system wherein the new model is for application to live event stream data