US Patent Application 17731593. TRANSLATING LARGE SOURCE CODE USING SPARSE SELF-ATTENTION simplified abstract
TRANSLATING LARGE SOURCE CODE USING SPARSE SELF-ATTENTION
Organization Name
Inventor(s)
Rishabh Singh of San Jose CA (US)
Manzil Zaheer of Mountain View CA (US)
TRANSLATING LARGE SOURCE CODE USING SPARSE SELF-ATTENTION - A simplified explanation of the abstract
This abstract first appeared for US patent application 17731593 titled 'TRANSLATING LARGE SOURCE CODE USING SPARSE SELF-ATTENTION'.
Simplified Explanation
- This patent application describes techniques for translating source code using sparse self-attention.
- A source code snippet in a first programming language is processed to obtain one or more graphs representing the snippet's tokens and the relationships between them.
- From these graphs, a subset of token pairs is selected out of all possible token pairs in the snippet. Each selected pair consists of tokens whose nodes are connected by one or more graph edges.
- The self-attention network of a translation machine learning model is then adapted to attend only across this subset of token pairs.
- The adapted model processes the source code snippet and generates a translation in a second programming language.
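The pair-selection step can be illustrated with a minimal Python sketch. It is not the applicants' implementation. The function name `sparse_token_pairs`, the toy edge lists, the choice to treat edges as undirected, the restriction to directly connected nodes, and the inclusion of self-pairs are all assumptions made for illustration.

```python
def sparse_token_pairs(num_tokens, graph_edges, include_self=True):
    """Collect the (i, j) token-index pairs that may attend to each other.

    graph_edges is a list of edge lists, one per graph; each edge is a
    pair of token indices. Edges are treated as undirected (assumption).
    """
    pairs = set()
    for edges in graph_edges:
        for i, j in edges:
            pairs.add((i, j))
            pairs.add((j, i))
    if include_self:
        # Let every token attend to itself (assumption, not from the abstract).
        pairs.update((i, i) for i in range(num_tokens))
    return pairs


# Hypothetical snippet "x = a + b" with tokens indexed 0..4.
tokens = ["x", "=", "a", "+", "b"]
syntax_edges = [(1, 0), (1, 3), (3, 2), (3, 4)]  # toy syntax-tree edges
data_flow_edges = [(2, 0), (4, 0)]               # toy data-flow edges into x

pairs = sparse_token_pairs(len(tokens), [syntax_edges, data_flow_edges])
total_pairs = len(tokens) ** 2
print(f"{len(pairs)} of {total_pairs} possible pairs kept")
```

In this toy case the pair `(2, 4)` (`a` and `b`) is dropped because no edge links the two tokens directly. For large source files, the kept subset would typically grow with the number of graph edges rather than with the square of the token count.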
Original Abstract Submitted
Techniques are described herein for translating source code using sparse-self attention. In various implementations, a source code snippet in a first programming language may be processed to obtain graph(s) representing snippet tokens, and relationships therebetween. Based on the graph(s), a subset of snippet token pairs may be identified from a superset of all possible token pairs in the source code snippet. Each token pair of the subset may include snippet tokens that are represented by nodes connected by one or more edges of the one or more graphs. A self-attention network of a translation machine learning model may be adapted to sparsely attend across the identified subset of token pairs. The source code snippet may then be processed based on the adapted translation machine learning model to generate a translation of the source code snippet in the second programming language.
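One way to picture a self-attention layer that attends sparsely across an identified subset of token pairs is the sketch below. It applies standard scaled dot-product attention, but scores only the query/key pairs present in a set of allowed pairs. The abstract does not specify how the network is adapted, so this is an assumption-laden illustration: the function `sparse_self_attention`, the toy embeddings, and the `allowed` set are all hypothetical.

```python
import math


def sparse_self_attention(queries, keys, values, allowed_pairs):
    """Scaled dot-product attention restricted to allowed (query, key) pairs."""
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        # Only keys paired with this query are scored; others are skipped.
        key_ids = [j for j in range(len(keys)) if (i, j) in allowed_pairs]
        if not key_ids:
            outputs.append([0.0] * out_dim)
            continue
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in key_ids
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out = [0.0] * out_dim
        for w, j in zip(weights, key_ids):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs


# Toy 2-dimensional token embeddings (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Tokens 0 and 1 may attend to each other; token 2 only to itself.
allowed = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
out = sparse_self_attention(x, x, x, allowed)
```

Because token 2 is paired only with itself, its output equals its own value vector. The work done per query is proportional to the number of allowed keys, which is where a sparse scheme saves computation over attending to every token pair.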