Patent Application 18464536 - MACHINE LEARNING MODEL BASED RANKING OF GENERATED CODE - Rejection
Title: MACHINE LEARNING MODEL BASED RANKING OF GENERATED CODE
Application Information
- Invention Title: MACHINE LEARNING MODEL BASED RANKING OF GENERATED CODE
- Application Number: 18464536
- Submission Date: 2025-05-14
- Effective Filing Date: 2023-09-11
- Filing Date: 2023-09-11
- National Class: 717
- National Sub-Class: 104000
- Examiner Employee Number: 87882
- Art Unit: 2192
- Tech Center: 2100
Rejection Summary
- 102 Rejections: 1
- 103 Rejections: 2
Cited Patents
The following patents were cited in the rejection:
- Azad et al., US Pat. No. 11,645,188
- Chen et al., US Pub. No. 2024/0402999
Office Action Text
DETAILED ACTION

This action is responsive to the application filed on September 11, 2023. Claims 1-26 are pending and are presented for examination. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

Examiner Notes

Examiner cites particular columns, paragraphs, figures and line numbers in the references as applied to the claims below for the convenience of the applicant. Although the specified citations are representative of the teachings in the art and are applied to the specific limitations within the individual claim, other passages and figures may apply as well. It is respectfully requested that, in preparing responses, the applicant fully consider the references in their entirety as potentially teaching all or part of the claimed invention, as well as the context of the passage as taught by the prior art or disclosed by the examiner.

Drawings

The drawings filed on September 11, 2023 are acceptable for examination purposes.

Information Disclosure Statement

As required by M.P.E.P. 609, the applicant's submission of the Information Disclosure Statements dated September 11, 2023, February 08, 2024, December 09, 2024 and February 27, 2025 is acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending.

Claim Objections

Claims 1-22 are objected to because of the following informalities: Claim 1 (and similarly claim 12) recites the limitation "determining values of features for input to a machine learning model trained to predict values of generated code fragments, wherein the features are metrics of similarity among code fragments in the set of one or more prompts input and the generated code fragments and the metrics of similarity measure at least one of similarity of code fragments and similarity of changes to code fragments;" in lines 6-11. Claim 1 recites the limitation "ranking the generated code fragments based, at least partly, on the predicted values from the machine learning model." in lines 12-13. Claim 3 (and similarly claim 5) recites "The method of claim 1, wherein...". Claim 14 recites "The non-transitory, machine-readable medium of claim 12, wherein...". Please amend the claim language as indicated in bold. Appropriate correction is required. Dependent claims 2, 4, 6-11, 13 and 15-22 do not overcome the deficiency of the base claim and, therefore, are objected to for the same reasons as the base claim.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION. The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2-3 and 13-14 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Claims 2-3 (and similarly claims 13-14) recite "a first code fragment in the prompt...", however, there is no additional claim or limitation reciting "a second code fragment in the prompt". It is noted that claim 3 only recites "a second of the pair of the reference code fragments", which is not the same. Therefore, it is unclear why the applicant uses the term "a first code fragment".

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless - (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1, 5-8, 11-12, 16-19 and 22 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Matthew Jin et al. ("InferFix: End-to-End Program Repair with LLMs over Retrieval-Augmented Prompts" - hereinafter Jin - IDS 02/27/2025).

With respect to claim 1, Jin teaches a method comprising: obtaining a plurality of code fragments generated from a generative artificial intelligence (AI) model, wherein the plurality of code fragments corresponds to a set of one or more prompts input into the generative AI model (see "Introduction" section, "In this paper, we introduce InferFix - a program repair framework which combines a transformer encoder model pretrained via contrastive learning serving as a retriever over a database of historic bugs and fixes, and a large language model (12 billion parameter Codex Cushman model, code-cushman-001) instrumented with the facility to leverage retrieved information from the external database. Given the baseline Codex model has been shown to occasionally predict insecure or buggy code [29], we prioritized finetuning it on a bug-free supervised dataset of bugs and fixes with contexts enriched via relevant program repair patterns from an external non-parametric memory.
The contributions of the paper are as follows: (i) we propose a program repair framework that leverages static analyses for bug detection, localization, and categorization paired with a large language model finetuned for program repair task on a dataset of augmented prompts, (ii) we curate InferredBugs: a metadata-rich dataset of bugs and fixes in Java and C# programming languages extracted with the Infer static analyzer, (iii) we introduce a dedicated prompt augmentation technique for program repair task, which leverages dense retrieval from an external database of historic bugs and fixes, bug type annotations, and syntactic hierarchies across the entire source code file affected by a bug, (iv) we evaluate our model on the InferredBugs dataset, achieving an impressive 76% top-1 accuracy of patch generation in Java, and over 65% in C#, across null pointer dereference, resource leak, and thread safety violation bug types, and finally (v) we deploy InferFix as a GitHub action and as part of the Azure DevOps continuous integration pipeline internally at Microsoft, and document aspects of deployment." See "6.1 Basic Prompt" section, first paragraph, "The most basic prompt we can construct for the model is to provide the buggy method as input while expecting the model to generate the fix by outputting the fixed version of the given method. Thus, we perform task-oriented finetuning of our Generator model (Codex) using the buggy and fixed versions of the methods from the InferredBugs dataset". Furthermore, see figures 2-4 and sections "4.3 Instruction Prompting", "6 Prompt Augmentation").

Jin further teaches determining values of features for input to a machine learning model trained to predict values of generated code fragments, wherein the features are metrics of similarity among code fragments in the set of prompts and the generated code fragments and the metrics of similarity measure at least one of similarity of code fragments and similarity of changes to code fragments (see abstract, "Large language models have been adapted to the program repair task through few-shot demonstration learning and instruction prompting, treating this as an infilling task. However, these models have only focused on learning general bug-fixing patterns for uncategorized bugs mined from public repositories. In this paper, we propose InferFix: a transformer-based program repair framework paired with a state-of-the-art static analyzer to fix critical security and performance bugs. InferFix combines a Retriever - transformer encoder model pretrained via contrastive learning objective, which aims at searching for semantically equivalent bugs and corresponding fixes; and a Generator - a large language model (12 billion parameter Codex Cushman model) finetuned on supervised bug-fix data with prompts augmented via adding bug type annotations and semantically similar fixes retrieved from an external non-parametric memory". See "Introduction" section, "In our approach we combine the benefits of both paradigms, by augmenting the prompts and then finetuning our model on the dataset of augmented prompts and predictions to get the best performance". See "2 Motivating Example" section and figure 2, "The predicted patch is then validated by executing the Infer static analyzer and unit tests as part of the continuous integration pipeline to ensure the error is indeed fixed and no regressions are introduced in the code base".
See "4.2 Conditional Language Modeling" section, "Our next baseline is the zero-shot conditional language generation (code completion), which aims to utilize the next token prediction to repair programs. Specifically, given a bug-free prefix, we run Codex model inference to complete the buggy code snippet, aiming to rewrite a program without bugs. In our experiments, we apply nucleus sampling decoding algorithm with top_p = 1 and a temperature t = 0.7 generating top 10 samples up to the length of 1024 tokens with a total length for prefix and completion of 2048. Our conditional language modeling experiments are also based on the code-cushman-001.". See "6.6 Inference" section, "The Inference step for InferFix involves utilizing nucleus sampling decoding with a top_p parameter of 1.0 and a temperature of 0.7. During this step, the tool decodes the top-10 best predictions generated by the large language model, and ranks them according to their sequence log probabilities. This ranking helps to ensure that the most likely and relevant fixes are presented to the user. The use of nucleus sampling decoding, with its specific top_p and temperature parameters, helps to balance the trade-off between diversity and quality in the generated predictions, making it possible to obtain highly accurate and diverse patch candidates.". Furthermore, see "5 InferFix Framework", "5.2 Retrieval Module", "5.3 Generator Module", "6 Prompt Augmentation", "6.5 Enriching Context with Hints", "Results" sections and figure 5).

Jin also teaches ranking the generated code fragments based, at least partly, on the predicted values output from the machine learning model (see the rejection above and "6.6 Inference" section, "The Inference step for InferFix involves utilizing nucleus sampling decoding with a top_p parameter of 1.0 and a temperature of 0.7. During this step, the tool decodes the top-10 best predictions generated by the large language model, and ranks them according to their sequence log probabilities. This ranking helps to ensure that the most likely and relevant fixes are presented to the user. The use of nucleus sampling decoding, with its specific top_p and temperature parameters, helps to balance the trade-off between diversity and quality in the generated predictions, making it possible to obtain highly accurate and diverse patch candidates").
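For illustration only (not relied upon in the rejection): the ranking step described in Jin's section 6.6, in which the top-10 decoded candidates are ordered by their sequence log probabilities, can be sketched as follows. This is a minimal sketch under assumed inputs; it presumes each candidate arrives with per-token log probabilities from the decoder, and all names and values in the snippet are hypothetical rather than taken from Jin.

```python
# Illustrative sketch only; not Jin's implementation. Assumes each candidate
# code fragment arrives with the per-token log probabilities reported by the
# decoder (e.g., when sampling with top_p = 1.0 and temperature = 0.7).
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    code: str                     # generated code fragment (candidate patch)
    token_logprobs: List[float]   # log probability of each generated token

    @property
    def sequence_logprob(self) -> float:
        # Sequence log probability: the sum of per-token log probabilities.
        return sum(self.token_logprobs)


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most likely to least likely."""
    return sorted(candidates, key=lambda c: c.sequence_logprob, reverse=True)


# Hypothetical usage with three made-up candidate patches:
ranked = rank_candidates([
    Candidate("return x != null ? x.size() : 0;", [-0.2, -0.1, -0.4]),
    Candidate("return x.size();", [-0.9, -1.3]),
    Candidate("if (x == null) return 0; return x.size();", [-0.1, -0.2, -0.2]),
])
print([c.code for c in ranked])
```

Ranking by the raw sum favors shorter sequences; a length-normalized variant (dividing by the number of tokens) is a common alternative, though Jin's section 6.6 does not specify one.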
With respect to claim 5, Jin teaches further comprising, for each prompt, generating a code structure signature for each code fragment in the prompt and for the corresponding one of the generated code fragments, wherein determining the values of features is based, at least in part, on the code structure signatures (see "6.4 eWASH extended context" section and figure 5, "A source code file may have nested scopes and references to other external libraries or other files. To accurately suggest patches a model must leverage knowledge across different parts of the file. The length of source code files will often exceed the fixed-length window of transformer models (2048 tokens in our case), which could potentially lead to a loss of information relevant for learning to repair programs. To overcome this limitation, we utilize eWASH [8] to prioritize syntax hierarchies which are most relevant to the buggy snippet region. Extracting syntactic hierarchies from the entire source code files, as opposed to the tokens immediately preceding the bug location, we are able to retain most relevant code context, such as class-level fields and method arguments, and peer methods which are highly relevant to program repair. Starting with a concrete syntax tree of a source file, we organize and prioritize class-level and method-level syntactic elements such as global import statements and assigned values, class attributes, method signatures, class docstring, and global expressions in the input.". Examiner notes: abstract representation (i.e., signature)).

With respect to claim 6, Jin teaches wherein generating a code structure signature comprises generating a representation of a code fragment without variability of names (see figure 5).

With respect to claim 7, Jin teaches wherein generating a code structure signature comprises generating a representation of a code fragment that replaces each identifier name with a representative token for identifiers and each variable name with a representative token for variables (see "6.5 Enriching Context with Hints" section, "To focus on extracting structurally similar fixes and reduce the dependency on identifier naming we obfuscate code snippets serving as keys in the database and search queries. Namely, we parse and analyze the code identifier types and mask the names of classes, methods, and identifiers with placeholder symbols: CLASS_NN, METHOD_NN, and VAR_NN, where NN is a unique number. An example obfuscated representation is shown in Figure 6.").
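For illustration only (not relied upon in the rejection): the obfuscation described in Jin's section 6.5, in which class, method, and variable names are masked with CLASS_NN, METHOD_NN, and VAR_NN placeholders so that retrieval is less dependent on identifier naming, can be sketched roughly as follows. Jin's tooling parses the code and analyzes identifier types; this simplified sketch only pattern-matches obvious declarations, and the sample snippet and all names in it are hypothetical.

```python
# Simplified, illustrative obfuscation pass. It only pattern-matches obvious
# class/method/variable declarations in a Java-like snippet; InferFix's real
# pipeline parses the code and analyzes identifier types, which this does not.
import re

def obfuscate(snippet):
    mapping = {}                       # original name -> placeholder
    counters = {"CLASS": 0, "METHOD": 0, "VAR": 0}

    def placeholder(name, kind):
        if name not in mapping:
            counters[kind] += 1
            mapping[name] = "{}_{}".format(kind, counters[kind])

    # Class declarations: "class Foo"
    for name in re.findall(r"\bclass\s+(\w+)", snippet):
        placeholder(name, "CLASS")
    # Method declarations: "<type> name(" (rough heuristic)
    for _type, name in re.findall(r"\b(\w+)\s+(\w+)\s*\(", snippet):
        placeholder(name, "METHOD")
    # Local variable declarations: "<type> name =" (rough heuristic)
    for _type, name in re.findall(r"\b(\w+)\s+(\w+)\s*=", snippet):
        placeholder(name, "VAR")

    # Replace longer names first so shorter names never clobber them.
    for name in sorted(mapping, key=len, reverse=True):
        snippet = re.sub(r"\b" + re.escape(name) + r"\b", mapping[name], snippet)
    return snippet

code = "class Cache { int lookup(String key) { int hits = 0; return hits; } }"
print(obfuscate(code))
# -> class CLASS_1 { int METHOD_1(String key) { int VAR_1 = 0; return VAR_1; } }
```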
With respect to claim 8, Jin teaches wherein determining the values of features based, at least in part, on the code structure signatures comprises calculating values for a subset of the similarity metrics of code fragments as represented by the code structure signatures (see abstract, "Large language models have been adapted to the program repair task through few-shot demonstration learning and instruction prompting, treating this as an infilling task. However, these models have only focused on learning general bug-fixing patterns for uncategorized bugs mined from public repositories. In this paper, we propose InferFix: a transformer-based program repair framework paired with a state-of-the-art static analyzer to fix critical security and performance bugs. InferFix combines a Retriever - transformer encoder model pretrained via contrastive learning objective, which aims at searching for semantically equivalent bugs and corresponding fixes; and a Generator - a large language model (12 billion parameter Codex Cushman model) finetuned on supervised bug-fix data with prompts augmented via adding bug type annotations and semantically similar fixes retrieved from an external non-parametric memory". See "Introduction" section, "In our approach we combine the benefits of both paradigms, by augmenting the prompts and then finetuning our model on the dataset of augmented prompts and predictions to get the best performance". See "2 Motivating Example" section and figure 2, "The predicted patch is then validated by executing the Infer static analyzer and unit tests as part of the continuous integration pipeline to ensure the error is indeed fixed and no regressions are introduced in the code base". See "4.2 Conditional Language Modeling" section, "Our next baseline is the zero-shot conditional language generation (code completion), which aims to utilize the next token prediction to repair programs. Specifically, given a bug-free prefix, we run Codex model inference to complete the buggy code snippet, aiming to rewrite a program without bugs. In our experiments, we apply nucleus sampling decoding algorithm with top_p = 1 and a temperature t = 0.7 generating top 10 samples up to the length of 1024 tokens with a total length for prefix and completion of 2048. Our conditional language modeling experiments are also based on the code-cushman-001.". See "6.6 Inference" section, "The Inference step for InferFix involves utilizing nucleus sampling decoding with a top_p parameter of 1.0 and a temperature of 0.7. During this step, the tool decodes the top-10 best predictions generated by the large language model, and ranks them according to their sequence log probabilities. This ranking helps to ensure that the most likely and relevant fixes are presented to the user. The use of nucleus sampling decoding, with its specific top_p and temperature parameters, helps to balance the trade-off between diversity and quality in the generated predictions, making it possible to obtain highly accurate and diverse patch candidates.". Furthermore, see "5 InferFix Framework", "5.2 Retrieval Module", "5.3 Generator Module", "6 Prompt Augmentation", "6.5 Enriching Context with Hints", "Results" sections and figure 5).

With respect to claim 11, Jin teaches wherein the generative AI model is a language model with a transformer architecture (see "4.3 Instruction Prompting" section, "Instruction learning is a prompt augmentation technique that introduces a natural language description of the task. To approach program repair, we prepare prompts following a template: We utilize OpenAI GPT-3 Davinci model, a 175 billion parameter language model and a close sibling of ChatGPT, to complete the prompts").

With respect to claims 12, 16-19 and 22, the claims are directed to a non-transitory, machine-readable medium that corresponds to the method recited in claims 1, 5-8 and 11, respectively (see the rejection of claims 1, 5-8 and 11 above; wherein Jin uses a system to implement the use of the tool).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 9 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Matthew Jin et al. ("InferFix: End-to-End Program Repair with LLMs over Retrieval-Augmented Prompts") in view of Azad et al. (US Pat. No. 11,645,188 - hereinafter Azad).
With respect to claim 9, Jin is silent as to this limitation; however, in an analogous art, Azad teaches wherein the machine learning model is an ensemble of weak prediction models (see column 6 lines 6-18, "File risk prediction model 108 extracts file features from one or more files included in a commit that are labelled as risky and applies a statistical model to the extracted features to output a file risk assessment. File features may include, but are not limited to, a number of commits that include the file, a churn value, e.g., a number of lines added and/or subtracted, a number of changesets, and a togetherness score. In an embodiment, file risk prediction model 108 is a weak supervised machine learning model, i.e., where noisy, limited, or imprecise sources are used to provide supervision signal for labeling large amounts of training data in a supervised learning setting."). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Jin's teaching, which trains and evaluates source code to curate/fix bugs, by using a machine learning model that is an ensemble of weak prediction models as suggested by Azad, as Azad would provide a mechanism to predict the riskiness of merging a pull request by identifying risky changes early in a continuous integration/continuous deployment (CI/CD) environment (see column 2 lines 22-25).

With respect to claim 20, the claim is directed to a non-transitory, machine-readable medium that corresponds to the method recited in claim 9 (see the rejection of claim 9 above).

Claims 10 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Matthew Jin et al. ("InferFix: End-to-End Program Repair with LLMs over Retrieval-Augmented Prompts") in view of Chen et al. (US Pub. No. 2024/0402999 - hereinafter Chen).

With respect to claim 10, Jin is silent as to this limitation; however, in an analogous art, Chen teaches wherein the machine learning model is one or more regression models (see paragraph [0052], "As an example, a trained machine learning model may include a linear regression model, a decision tree model, a random forest model, a support vector machine model, a convolutional neural network, a recurrent neural network, or another artificial intelligence model, such as those discussed with respect to FIG. 8."). Therefore, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to modify Jin's teaching, which trains and evaluates source code to curate/fix bugs, by using one or more regression models as the machine learning model as suggested by Chen, as Chen would enhance the process of generating natural language based on computer code input and vice versa (see paragraph [0002]).

With respect to claim 21, the claim is directed to a non-transitory, machine-readable medium that corresponds to the method recited in claim 10 (see the rejection of claim 10 above).
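For illustration only (not relied upon in the rejections): the kind of model discussed for claims 9-10 and 20-21, an ensemble of weak prediction models used as a regression model that predicts a quality value from similarity features of generated code fragments, can be sketched as follows. The library choice (scikit-learn gradient-boosted trees), the feature names, and the data are assumptions made for the sketch and are not drawn from the application or the cited references.

```python
# Illustrative only: a gradient-boosted ensemble of shallow trees (weak
# learners) regressing a quality score from similarity features of generated
# code fragments. All data and feature names here are invented for the sketch.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Each row: [token_overlap_with_prompt, edit_distance_ratio, structure_signature_match]
X_train = np.array([
    [0.82, 0.10, 1.0],
    [0.41, 0.55, 0.0],
    [0.67, 0.30, 1.0],
    [0.25, 0.80, 0.0],
])
y_train = np.array([0.9, 0.4, 0.7, 0.2])  # hypothetical quality measurements

model = GradientBoostingRegressor(n_estimators=100, max_depth=2, learning_rate=0.1)
model.fit(X_train, y_train)

# Score two new generated fragments and rank them by predicted quality.
X_new = np.array([[0.75, 0.20, 1.0], [0.30, 0.70, 0.0]])
scores = model.predict(X_new)
ranking = np.argsort(-scores)  # indices of fragments, best first
print(scores, ranking)
```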
Allowable Subject Matter

Claims 2-4 and 13-15 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. After sufficient search and analysis, the Examiner concluded that the claimed invention has been recited in such a manner that independent claim 23 is not taught by any prior art reference found through search. The primary reason for allowance of the claims in this case is the inclusion of the limitations "preprocess the training dataset, wherein the instructions to preprocess the training dataset comprise instructions to, calculate a first set of values of similarity metrics between each of the expected code fragments and each of a corresponding subset of the generated code fragments, for each of the generated code fragments, calculate a quality measurement based, at least in part, on the first set of values of similarity metrics of the generated code fragment and the corresponding expected code fragment and for each of the prompts, calculate a second set of values of similarity metrics among parts of the prompt and a subset of the generated code fragments generated responsive to the prompt and train a machine learning model to predict quality measurements for generated code fragments based on the second set of values as inputs and the calculated quality measurements as targets.", which are not found in the prior art of record. Based on the prior art references and further search, the Examiner has concluded that these details are not found in the prior art of record and would not have been obvious; thus claim 23 is allowed. Claims 24-26 depend on claim 23 and are also allowable.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Chen et al. (US Pub. No. 2024/0020116) discloses a method for generating natural language based on computer code input. In an embodiment, a method may comprise one or more of: accessing a docstring generation model configured to generate docstrings from computer code; receiving one or more computer code samples; generating, using the docstring generation model and based on the received one or more computer code samples, one or more candidate docstrings representing natural language text, each of the one or more candidate docstrings being associated with at least a portion of the one or more computer code samples; identifying at least one of the one or more candidate docstrings that provides an intent of the at least a portion of the one or more computer code samples; and/or outputting, via a user interface, the at least one identified docstring with the at least a portion of the one or more computer code samples (see abstract).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ANIBAL RIVERA whose telephone number is (571) 270-1200. The examiner can normally be reached Monday-Friday, 9:30 AM-6:00 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Hyung S. Sough, can be reached at 571-272-6799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov.
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ANIBAL RIVERA/
Primary Examiner, Art Unit 2192