Oracle International Corporation (20240127004). MULTI-LINGUAL NATURAL LANGUAGE GENERATION simplified abstract
MULTI-LINGUAL NATURAL LANGUAGE GENERATION
Organization Name
Oracle International Corporation
Inventor(s)
Praneet Pabolu of Bangalore (IN)
Sriram Chaudhury of Bangalore (IN)
MULTI-LINGUAL NATURAL LANGUAGE GENERATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240127004, titled 'MULTI-LINGUAL NATURAL LANGUAGE GENERATION'.
Simplified Explanation
The patent application describes a method that uses a machine learning model to extract keywords from articles in a target language and then builds a dataset of keyword-text pairs from those keywords and the article text. The method involves:
- Obtaining article-summary pairs in a target language from a text corpus that contains article-summary pairs in multiple languages.
- Inputting articles into a machine learning model to generate embeddings for sentences.
- Extracting keywords from the articles with a probability that varies with the lengths of the sentences.
- Outputting the extracted keywords.
- Applying a maximal marginal relevance algorithm to select relevant keywords.
- Generating a dataset of keyword-text pairs with relevant keywords and corresponding text.
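The keyword selection step can be illustrated with a minimal sketch of maximal marginal relevance (MMR). The application does not publish its implementation, so everything here is an assumption for illustration: the function names `cosine` and `mmr_select`, the use of cosine similarity, the trade-off parameter `lambda_`, and the representation of each candidate keyword as a plain list of floats. MMR repeatedly picks the candidate that is most similar to the document while being least similar to the keywords already chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(doc_vec, candidates, top_k=3, lambda_=0.7):
    """Select up to top_k keywords by maximal marginal relevance.

    candidates maps each keyword to its embedding (hypothetical input format).
    lambda_ = 1.0 ranks purely by relevance; lower values penalise redundancy more.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < top_k:

        def score(keyword):
            relevance = cosine(remaining[keyword], doc_vec)
            # Similarity to the closest already-selected keyword (0.0 if none yet).
            redundancy = max(
                (cosine(remaining[keyword], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_ * relevance - (1 - lambda_) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With toy two-dimensional embeddings in which "translation" and "machine translation" point in nearly the same direction, a `lambda_` of 0.5 selects only one of the two and then prefers a less similar keyword, which is the diversity effect MMR is meant to provide.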
Potential Applications
This technology could be applied in various fields such as natural language processing, information retrieval, and content analysis.
Problems Solved
This technology automates keyword extraction from articles, a task that is time-consuming and labor-intensive when done manually.
Benefits
The benefits of this technology include improved efficiency in analyzing large amounts of text data, better organization of information, and enhanced search capabilities.
Potential Commercial Applications
One potential commercial application is tooling that helps content creators, marketers, and researchers optimize their content for search engines and improve information retrieval.
Possible Prior Art
Possible prior art includes existing keyword extraction algorithms used in natural language processing and information retrieval systems.
Original Abstract Submitted
A computer-implemented method includes obtaining, from text corpus including article-summary pairs in a plurality of languages, a plurality of article-summary pairs in a target language among the plurality of languages, to form an article-summary pairs dataset in which each article corresponds to a summary; inputting articles from the article-summary pairs to a machine learning model; generating, by the machine learning model, embeddings for sentences of the articles; extracting, by the machine learning model, keywords from the articles with a probability that varies based on lengths of the sentences, respectively; outputting, by the machine learning model, the keywords; applying a maximal marginal relevance algorithm to the extracted keywords, to select relevant keywords; and generating a keyword-text pairs dataset that includes the relevant keywords and text from the articles, the text corresponding to the relevant keywords in each of keyword-text pairs of the keyword-text pairs dataset.
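The abstract says keywords are extracted "with a probability that varies based on lengths of the sentences" but does not state the rule or its direction. The sketch below is therefore only one possible reading: it assumes a hypothetical linear rule in which longer sentences are more likely to be sampled, and it uses a caller-supplied `pick_keywords` function as a stand-in for the machine learning model and the relevance filtering. The names `extraction_probability`, `build_keyword_text_pairs`, `scale`, and `seed` are illustrative and do not come from the application.

```python
import random


def extraction_probability(num_tokens, scale=20):
    """Hypothetical rule: probability grows linearly with sentence length, capped at 1.0."""
    return min(1.0, num_tokens / scale)


def build_keyword_text_pairs(sentences, pick_keywords, scale=20, seed=0):
    """Build keyword-text pairs from sentences sampled by length.

    pick_keywords is a placeholder callable (sentence -> list of keywords)
    standing in for the model-based extraction described in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    pairs = []
    for sentence in sentences:
        probability = extraction_probability(len(sentence.split()), scale)
        # Sample the sentence with the length-dependent probability.
        if rng.random() < probability:
            keywords = pick_keywords(sentence)
            if keywords:
                pairs.append({"keywords": keywords, "text": sentence})
    return pairs
```

Each resulting entry keeps the keywords together with the text they came from, matching the "keyword-text pairs dataset" named in the abstract. A different probability rule, for example one that favors shorter sentences, can be swapped in by changing `extraction_probability` alone.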