Patent Application 17501003 - METHOD DEVICE AND STORAGE MEDIUM FOR TRAINING A - Rejection
Title: METHOD, DEVICE AND STORAGE MEDIUM FOR TRAINING A DEEP LEARNING FRAMEWORK
Application Information
- Invention Title: METHOD, DEVICE AND STORAGE MEDIUM FOR TRAINING A DEEP LEARNING FRAMEWORK
- Application Number: 17501003
- Submission Date: 2025-05-22
- Effective Filing Date: 2021-10-14
- Filing Date: 2021-10-14
- National Class: 706
- National Sub-Class: 012000
- Examiner Employee Number: 83246
- Art Unit: 2145
- Tech Center: 2100
Rejection Summary
- 102 Rejections: 0
- 103 Rejections: 2
Cited Patents
The following patents and publications were cited in the rejection:
- US 11,514,309 B2 (Liao et al.)
- US 12,050,577 B1 (Hollister)
- US 2020/0031583 A1 (Manamohan et al.)
- US 8,417,715 B1 (Bruckhaus et al.)
Office Action Text
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to the amendment filed 20 March 2025. Claims 1, 8, 15 have been amended. Claims 7 and 14 have been canceled. Claims 1-6, 8-13, 15-20 are pending and have been considered below.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claim(s) 1-3, 5, 7-10, 12, 14-17, 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liao et al. (US 11,514,309 B2) in view of Hollister (US 12,050,577 B1) and further in view of Manamohan et al. (US 2020/0031583 A1).
Claim 1.

Liao discloses a method for training a deep learning framework, comprising:

acquiring at least one task node in a current task node cluster: a method and apparatus for accelerating distributed training of a deep neural network (C 1 L 59-61) for performing at least one task (C 2 L 10-15); the scheduling of tasks is mapped into a directed graph model (C 3 L 16-17) comprising task nodes (C 3 L 25-26);

synchronously training the deep learning framework of the target task: training progresses of parallel subnetworks are synchronized to accelerate the distributed training of the deep neural network (C 2 L 11-13);

by the at least one task node according to sample data: training of parallel subnetworks, with subsets of samples, is synchronized to accelerate the distributed training of the deep neural network, and data is localized to perform a task at a preset node to minimize data transmission time (C 4 L 35-46); and

acquiring a synchronously trained target deep learning framework when the target task meets a training completion condition: transforming the directed graph model into a residual graph (C 3 L 18) to complete the training of the deep neural network (C 13 L 27-30); determining whether the potential of the node object is positive (C 13 L 38-40, Fig 4 Step 402); updating the parameters of the residual graph if there are no more unscheduled tasks (C 15 L 3-4, Fig 4). The neural network, represented by a directed graph model and transformed into a residual graph, is asynchronously trained, a completion condition is determined, and upon completion, the residual graph is updated;

wherein when the target task meets the training start condition: a method for distributed training of a deep neural network (C 1 L 59-64) wherein a task is performed at a preset cloud resource node to minimize data transmission time (C 2 L 13-15) in a distributed cluster architecture that comprises multiple cloud resource nodes executing multiple applications, each application comprising multiple tasks, each task configured for training a subnetwork (C 2 L 16-22); a task may be initiated on a cloud resource computing node that is idle (C 10 L 48-49); tasks train the neural network (C 10 L 53-54). The training start condition is that a resource node to be trained is idle;

the method comprises: monitoring a priority level of a task to be executed in the current task node cluster: a positive potential indicates that the node object has assignable tasks, the number of the assignable tasks being equal to the positive potential; a negative potential indicates that the number of tasks that have been assigned to the node object exceeds the number of maximum assignable tasks of the node object, and the number of excessive tasks is the absolute value of the negative potential (C 3 L 37-44); determining that there are unscheduled tasks (C 3 L 61-63); and determining that a task has not been initiated and that a task has already been initiated (C 11 L 1-7). The examiner is interpreting the tasks that have yet to be scheduled as having a priority level different from tasks that had been previously scheduled and initiated; and

determining that the target task meets the training start condition when the priority level is less than a preset level: determining whether there are currently unscheduled tasks (C 3 L 61-62); assigning an unscheduled task to a node (C 4 L 13-24); determining that a task has already been initiated (C 11 L 1-7). The examiner is interpreting the unscheduled tasks as having a lower priority than the previously scheduled and initiated tasks, which are assigned to nodes through the process disclosed by Liao.

Liao does not disclose … that meets a preset opening condition when a target task meets a training start condition, as disclosed in the claims. However, Liao discloses that if a node is idle, a task may be immediately initiated (C 10 L 48-51). Liao discloses conditions for initiating execution of a task, but does not disclose initiating training. In the same field of invention, Hollister discloses that machine learning training is based on a dynamic event tree and updates (C 7 L 63 – C 8 L 5), and that the start event tree node listens for the one or more conditions to start the training process (C 8 L 15-21). Therefore, considering the teachings of Liao and Hollister, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine … that meets a preset opening condition when a target task meets a training start condition with the teachings of Liao, with the motivation to ensure that training is initiated at the most appropriate time so that training will begin when all the desired conditions are met.

Liao does not disclose judging whether a number of nodes of the at least one task node is greater than a preset number; synchronously training the deep learning framework of the target task … when the number of nodes is greater than the preset number, as disclosed in the claims. However, Liao discloses calculating a minimum number of tasks to perform to continue with the training (C 3 L 65 – C 4 L 24, Fig 4). Liao discloses determining a minimum number of tasks for continuation of training, but not a minimum number of nodes. In the same field of invention, Manamohan discloses that fault tolerance may require a specified number of nodes to be registered for an iteration of training, which translates to a minimum number of nodes that may be required to be actively present in the population of participant nodes, after which each node may obtain a local training dataset that is accessible locally but not accessible at other computing nodes, train a first local model based on the local training dataset during the first iteration, and obtain at least a first shared training parameter based on the first local model (P 0033). Therefore, considering the teachings of Liao, Hollister and Manamohan, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine judging whether a number of nodes of the at least one task node is greater than a preset number; synchronously training the deep learning framework of the target task … when the number of nodes is greater than the preset number with the teachings of Liao and Hollister, with the motivation to ensure that the minimum number of nodes necessary to complete a task is available so that the appropriate number of nodes may be trained for the task.
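For illustration only, the following minimal Python sketch models the claim 1 workflow as characterized above: acquiring task nodes that meet a preset opening condition, judging whether the number of nodes exceeds a preset number, monitoring the priority level of the task, and synchronously training on sample data. All names, thresholds, and data values are hypothetical and are not drawn from the application or the cited references.

```python
# Illustrative sketch only; all names, thresholds, and data are hypothetical.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TaskNode:
    name: str
    idle: bool            # node state used for the preset opening condition
    params: float = 0.0   # stand-in for the framework parameters held by this node

def acquire_open_nodes(cluster: List[TaskNode]) -> List[TaskNode]:
    """Acquire the task nodes in the current cluster that meet the preset opening condition (idle)."""
    return [node for node in cluster if node.idle]

def synchronously_train(nodes: List[TaskNode], sample_data: List[float], steps: int = 3) -> float:
    """Toy synchronous training: every node takes a step on its data shard, then parameters are synced."""
    for _ in range(steps):
        for i, node in enumerate(nodes):
            shard = sample_data[i::len(nodes)]               # each node trains on its own subset of samples
            node.params += sum(shard) / max(len(shard), 1)
        synced = sum(n.params for n in nodes) / len(nodes)   # synchronize after each step
        for node in nodes:
            node.params = synced
    return nodes[0].params  # the synchronously trained "framework"

def train_target_task(cluster: List[TaskNode], sample_data: List[float],
                      task_priority: int, preset_level: int = 5,
                      preset_node_count: int = 2) -> Optional[float]:
    # Training start condition: the monitored priority level is less than the preset level.
    if task_priority >= preset_level:
        return None
    nodes = acquire_open_nodes(cluster)
    # Judge whether the number of acquired task nodes is greater than the preset number.
    if len(nodes) <= preset_node_count:
        return None
    return synchronously_train(nodes, sample_data)

# Example: three idle nodes, one busy node, and a low-priority target task.
cluster = [TaskNode("n0", True), TaskNode("n1", True), TaskNode("n2", True), TaskNode("n3", False)]
print(train_target_task(cluster, sample_data=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6], task_priority=1))
```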
Claim 2.

Liao, Hollister and Manamohan disclose the method according to claim 1, and the combination of Liao in view of Hollister discloses wherein the acquiring the at least one task node in the current task node cluster, that meets the preset opening condition, comprises: determining a node state of each node in the current task node cluster; and determining a node, the node state of which is an idle state condition, as the at least one task node that meets the preset opening condition. Liao: if a node is idle, a task may be immediately initiated (C 10 L 48-51). Hollister: machine learning training is based on a dynamic event tree and updates (C 7 L 63 – C 8 L 5), and the start event tree node listens for the one or more conditions to start the training process (C 8 L 15-21), as in the rejection of Claim 1.

Claim 3.

Liao, Hollister and Manamohan disclose the method according to claim 1, and the combination of Liao in view of Hollister discloses wherein the acquiring the at least one task node in the current task node cluster, that meets the preset opening condition, comprises: determining an amount of idle resources of each node in the current task node cluster; and determining a node, the amount of idle resources of which is greater than a preset threshold condition, as the at least one task node that meets the preset opening condition. Liao: a set of resource nodes is determined along with the corresponding remaining run-time of an application, a number of tasks, the elapsed run-time of the application, the running time progress, the estimated minimum data transmission time of the task, the waiting time till the resource of the node becomes idle, and the amount of data stored by the task (C 2 L 59 - C 3 L 6); determining the node object includes determining a present minimum capacity (C 4 L 16-24); in order to minimize the training time of the deep neural network, it is necessary to find the minimum sum of the remaining run-time and data transmission time of all applications (C 9 L 60-67); if a node is idle, a task may be immediately initiated (C 10 L 48-51). Hollister: machine learning training is based on a dynamic event tree and updates (C 7 L 63 – C 8 L 5), and the start event tree node listens for the one or more conditions to start the training process (C 8 L 15-21), as in the rejection of Claim 1.
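As an illustrative aside, the two preset opening conditions discussed for claims 2 and 3 (an idle node state, and an amount of idle resources above a preset threshold) can be sketched as simple filters. The node fields, threshold, and values below are hypothetical and are not taken from the cited references.

```python
# Illustrative sketch only; node and resource values are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class TaskNode:
    name: str
    state: str             # e.g. "idle" or "busy"
    idle_resources: float  # e.g. fraction of free compute capacity

def nodes_meeting_opening_condition_by_state(cluster: List[TaskNode]) -> List[TaskNode]:
    """Claim 2 style: a node whose node state is idle meets the preset opening condition."""
    return [node for node in cluster if node.state == "idle"]

def nodes_meeting_opening_condition_by_resources(cluster: List[TaskNode],
                                                 preset_threshold: float = 0.5) -> List[TaskNode]:
    """Claim 3 style: a node whose amount of idle resources exceeds a preset threshold meets the condition."""
    return [node for node in cluster if node.idle_resources > preset_threshold]

cluster = [TaskNode("n0", "idle", 0.9), TaskNode("n1", "busy", 0.7), TaskNode("n2", "idle", 0.2)]
print([n.name for n in nodes_meeting_opening_condition_by_state(cluster)])      # ['n0', 'n2']
print([n.name for n in nodes_meeting_opening_condition_by_resources(cluster)])  # ['n0', 'n1']
```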
Claim 5.

Liao, Hollister and Manamohan disclose the method according to claim 1, and Liao discloses wherein the synchronously training the deep learning framework of the target task by the at least one task node according to the sample data comprises: monitoring whether the current task node cluster contains other task nodes that meet the preset opening condition: a set of resource nodes is determined along with the corresponding remaining run-time of an application, a number of tasks, the elapsed run-time of the application, the running time progress, the estimated minimum data transmission time of the task, the waiting time till the resource of the node becomes idle, and the amount of data stored by the task (C 2 L 59 - C 3 L 6); if a potential of a node is positive (C 3 L 67 – C 4 L 2), the node object is added into a predefined set, a total number of currently unscheduled tasks is calculated in the predefined set as a first number, and a number of tasks is calculated that can be assigned at a minimum cost as a second number (C 4 L 3-8); determining the node object includes determining a present minimum capacity (C 4 L 16-24); if a node is idle, a task may be immediately initiated (C 10 L 48-51); and synchronously training the deep learning framework of the target task by the other task nodes and the at least one task node according to the sample data when the other task nodes exist: training of parallel subnetworks, with subsets of samples, is synchronized to accelerate the distributed training of the deep neural network, and data is localized to perform a task at a preset node to minimize data transmission time (C 4 L 35-46).
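The claim 5 behavior discussed above (monitoring for other task nodes that come to meet the preset opening condition and having them join the synchronous training) can be sketched, for illustration only, as follows. The cluster contents and the step at which a node becomes idle are hypothetical.

```python
# Illustrative sketch only; cluster contents and timing are hypothetical.
from typing import Dict, List

def open_nodes(cluster: Dict[str, bool]) -> List[str]:
    """Return the nodes whose state currently meets the preset opening condition (idle)."""
    return [name for name, idle in cluster.items() if idle]

def synchronous_training_with_joining_nodes(cluster: Dict[str, bool], initial_nodes: List[str],
                                            becomes_idle_at: Dict[str, int],
                                            total_steps: int = 4) -> List[List[str]]:
    """At each step, check for other task nodes that now meet the opening condition and let them join."""
    participants = list(initial_nodes)
    history = []
    for step in range(total_steps):
        for name, at_step in becomes_idle_at.items():
            if step == at_step:
                cluster[name] = True              # a node in the cluster becomes idle mid-training
        for candidate in open_nodes(cluster):
            if candidate not in participants:
                participants.append(candidate)    # other task nodes join the synchronous training
        history.append(list(participants))        # all current participants train this step together
    return history

cluster = {"n0": True, "n1": True, "n2": False}
print(synchronous_training_with_joining_nodes(cluster, ["n0", "n1"], becomes_idle_at={"n2": 2}))
```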
Claim 7. Canceled.

Claim(s) 8, 9, 10, 12 is/are directed to electronic device claim(s) similar to the method claim(s) of Claim(s) 1, 2, 3, 5 and is/are rejected with the same rationale.

Claim 14. Canceled.

Claim(s) 15, 16, 17, 19 is/are directed to electronic device claim(s) similar to the method claim(s) of Claim(s) 1, 2, 3, 5 and is/are rejected with the same rationale.

Claim(s) 4, 6, 11, 13, 18, 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Liao et al. (US 11,514,309 B2) and Hollister (US 12,050,577 B1) and Manamohan et al. (US 2020/0031583 A1) and further in view of Bruckhaus et al. (US 8,417,715 B1).

Claim 4.

Liao, Hollister and Manamohan disclose the method according to claim 1, but Liao does not disclose wherein the synchronously training the deep learning framework of the target task by the at least one task node according to the sample data comprises: training the deep learning framework in each task node; reading framework parameters of the deep learning framework in each task node in each period according to a preset period; determining a first average value, wherein the first average value is an average value of the framework parameters of all task nodes; and synchronizing the deep learning framework in each task node according to the first average value, as disclosed in the claims. Liao does not disclose judging whether a number of nodes of the at least one task node is greater than a preset number; synchronously training the deep learning framework of the target task … when the number of nodes is greater than the preset number, as disclosed in the claims.

However, Liao discloses that a set of resource nodes is determined along with the corresponding remaining run-time of an application, a number of tasks, the elapsed run-time of the application, the running time progress, the estimated minimum data transmission time of the task, the waiting time till the resource of the node becomes idle, and the amount of data stored by the task (C 2 L 59 - C 3 L 6); determining the node object includes determining a present minimum capacity (C 4 L 16-24); and acquiring a set of training samples (C 4 L 32-33). In the same field of invention, Bruckhaus discloses that the objective function may also consider the amount of CPU cycles, memory, disk I/O, and other resources the algorithm has consumed to build the model, as well as the resources required to score data, such as the training data and scoring data, and the average resources required per training record and scoring record (C 47 L 8-13), and that a model manager reads models and their performance data from the data mining repository and selects the model with the best performance characteristics, including objective function and cost (C 57 L 42-46). Therefore, considering the teachings of Liao, Hollister, Manamohan and Bruckhaus, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the synchronously training the deep learning framework of the target task by the at least one task node according to the sample data comprises: training the deep learning framework in each task node; reading framework parameters of the deep learning framework in each task node in each period according to a preset period; determining a first average value, wherein the first average value is an average value of the framework parameters of all task nodes; and synchronizing the deep learning framework in each task node according to the first average value with the teachings of Liao, Hollister and Manamohan, with the motivation to provide a better method for automated data analytic solutions for standard and/or custom platforms with improved analytic models over time (Bruckhaus: C 3 L 45-54).
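The periodic parameter synchronization recited in claim 4 (reading the framework parameters of each task node every preset period, computing the first average value, and synchronizing every node to it) can be sketched, for illustration only, as below; a similar averaging pattern underlies the second average value discussed for claim 6 next. The parameter values, update rule, and period are hypothetical and are not drawn from the cited references.

```python
# Illustrative sketch only; parameter values and the preset period are hypothetical.
from typing import Dict, List

def local_training_step(params: List[float], grad_scale: float) -> List[float]:
    """Stand-in for training the framework in one task node (a tiny gradient-like update)."""
    return [p - 0.1 * grad_scale * p for p in params]

def synchronize_by_average(node_params: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Read the framework parameters of every task node, compute their average (the first
    average value), and set each node's parameters to that average."""
    n_nodes = len(node_params)
    dim = len(next(iter(node_params.values())))
    average = [sum(params[i] for params in node_params.values()) / n_nodes for i in range(dim)]
    return {name: list(average) for name in node_params}

def train_with_periodic_sync(node_params: Dict[str, List[float]],
                             steps: int = 6, preset_period: int = 2) -> Dict[str, List[float]]:
    for step in range(1, steps + 1):
        # Train the deep learning framework in each task node (each node drifts differently here).
        node_params = {name: local_training_step(params, grad_scale=float(i + 1))
                       for i, (name, params) in enumerate(node_params.items())}
        if step % preset_period == 0:            # in each period, according to the preset period
            node_params = synchronize_by_average(node_params)
    return node_params

print(train_with_periodic_sync({"n0": [1.0, 2.0], "n1": [1.0, 2.0], "n2": [1.0, 2.0]}))
```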
Claim 6.

Liao, Hollister and Manamohan disclose the method according to claim 5, but Liao does not disclose wherein the synchronously training the deep learning framework of the target task by the other task nodes and the at least one task node according to the sample data comprises: acquiring current framework parameters of the deep learning framework in each task node of the at least one task node; determining a second average value, wherein the second average value is an average value of all the current framework parameters; and updating the framework parameters of the deep learning framework by the other task nodes and the at least one task node according to the second average value, as disclosed in the claims.

However, Liao discloses that a set of resource nodes is determined along with the corresponding remaining run-time of an application, a number of tasks, the elapsed run-time of the application, the running time progress, the estimated minimum data transmission time of the task, the waiting time till the resource of the node becomes idle, and the amount of data stored by the task (C 2 L 59 - C 3 L 6); determining the node object includes determining a present minimum capacity (C 4 L 16-24); and acquiring a set of training samples (C 4 L 32-33). In the same field of invention, Bruckhaus discloses if a potential of a node is positive (C 3 L 67 – C 4 L 2), the node object is added into a predefined set, a total number of currently unscheduled tasks is calculated in the predefined set as a first number, and a number of tasks is calculated that can be assigned at a minimum cost as a second number (C 4 L 3-8); the objective function may also consider the amount of CPU cycles, memory, disk I/O, and other resources the algorithm has consumed to build the model, as well as the resources required to score data, such as the training data and scoring data, and the average resources required per training record and scoring record (C 47 L 8-13); and a model manager reads models and their performance data from the data mining repository and selects the model with the best performance characteristics, including objective function and cost (C 57 L 42-46). Therefore, considering the teachings of Liao, Hollister, Manamohan and Bruckhaus, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the synchronously training the deep learning framework of the target task by the other task nodes and the at least one task node according to the sample data comprises: acquiring current framework parameters of the deep learning framework in each task node of the at least one task node; determining a second average value, wherein the second average value is an average value of all the current framework parameters; and updating the framework parameters of the deep learning framework by the other task nodes and the at least one task node according to the second average value with the teachings of Liao, Hollister and Manamohan, with the motivation to provide a better method for automated data analytic solutions for standard and/or custom platforms with improved analytic models over time (Bruckhaus: C 3 L 45-54).

Claim(s) 11, 13 is/are directed to electronic device claim(s) similar to the method claim(s) of Claim(s) 4, 6 and is/are rejected with the same rationale.

Claim(s) 18, 20 is/are directed to electronic device claim(s) similar to the method claim(s) of Claim(s) 4, 6 and is/are rejected with the same rationale.

Response to Arguments

Applicant's arguments filed 20 March 2025 have been fully considered but they are not persuasive. The applicant argues:

It is thus clear that [Liao] does not involve technical features such as the priority level defined by amended claim 1. Therefore, Liao does not disclose or teach the underlined technical features, including monitoring a priority level of a task to be executed in the current task node cluster, and determining that the target task meets the training start condition when the priority level is less than a preset level. Based on the above, the cited reference also fails to teach the features required by amended claim 1. Therefore, amended claim 1 is believed to be patentable over the cited reference.
For at least these reasons above, all cited references, individually or in combination, fail to teach or suggest all features recited in amended claim 1. Independent claims 8 and 15 are patentable over the cited references for similar reasons.

The examiner respectfully disagrees. Liao discloses a method for distributed training of a deep neural network (C 1 L 59-64) wherein a task is performed at a preset cloud resource node to minimize data transmission time (C 2 L 13-15) in a distributed cluster architecture that comprises multiple cloud resource nodes executing multiple applications, each application comprising multiple tasks, each task configured for training a subnetwork (C 2 L 16-22); a task may be initiated on a cloud resource computing node that is idle (C 10 L 48-49); tasks train the neural network (C 10 L 53-54). The training start condition is that a resource node to be trained is idle. Furthermore, Liao discloses that a positive potential indicates that the node object has assignable tasks, the number of the assignable tasks being equal to the positive potential; a negative potential indicates that the number of tasks that have been assigned to the node object exceeds the number of maximum assignable tasks of the node object, and the number of excessive tasks is the absolute value of the negative potential (C 3 L 37-44); determining that there are unscheduled tasks (C 3 L 61-63); and determining that a task has not been initiated and that a task has already been initiated (C 11 L 1-7). The examiner is interpreting the tasks that have yet to be scheduled as having a priority level different from tasks that had been previously scheduled and initiated. Liao discloses determining whether there are currently unscheduled tasks (C 3 L 61-62); assigning an unscheduled task to a node (C 4 L 13-24); and determining that a task has already been initiated (C 11 L 1-7). The examiner is interpreting the unscheduled tasks as having a lower priority than the previously scheduled and initiated tasks, which are assigned to nodes through the process disclosed by Liao.

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication should be directed to JOHN M HEFFINGTON at telephone number (571)270-1696. Examiner interviews are available via a variety of formats. See MPEP § 713.01. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN M HEFFINGTON whose telephone number is (571)270-1696. The examiner can normally be reached on Monday through Friday from 9:30 am to 5:30 pm Eastern.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Cesar B Paula, can be reached at telephone number 571-272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from Patent Center. Status information for published applications may be obtained from Patent Center. Status information for unpublished applications is available through Patent Center to authorized users only. Should you have questions about access to the USPTO patent electronic filing system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

/J.M.H/
Examiner, Art Unit 2145
5/16/2025

/CESAR B PAULA/
Supervisory Patent Examiner, Art Unit 2145