Cisco Technology, Inc. (20240330365). SYSTEM AND METHOD USING A LARGE LANGUAGE MODEL (LLM) AND/OR REGULAR EXPRESSIONS FOR FEATURE EXTRACTIONS FROM UNSTRUCTURED OR SEMI-STRUCTURED DATA TO GENERATE ONTOLOGICAL GRAPH simplified abstract

From WikiPatents
Revision as of 11:51, 8 October 2024 by Wikipatents (talk | contribs) (Creating a new page)

SYSTEM AND METHOD USING A LARGE LANGUAGE MODEL (LLM) AND/OR REGULAR EXPRESSIONS FOR FEATURE EXTRACTIONS FROM UNSTRUCTURED OR SEMI-STRUCTURED DATA TO GENERATE ONTOLOGICAL GRAPH

Organization Name

Cisco Technology, Inc.

Inventor(s)

Andrew Zawadowskiy of Hollis NH (US)

Oleg Bessonov of San Jose CA (US)

Vincent Parla of North Hampton NH (US)

SYSTEM AND METHOD USING A LARGE LANGUAGE MODEL (LLM) AND/OR REGULAR EXPRESSIONS FOR FEATURE EXTRACTIONS FROM UNSTRUCTURED OR SEMI-STRUCTURED DATA TO GENERATE ONTOLOGICAL GRAPH - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240330365 titled 'SYSTEM AND METHOD USING A LARGE LANGUAGE MODEL (LLM) AND/OR REGULAR EXPRESSIONS FOR FEATURE EXTRACTIONS FROM UNSTRUCTURED OR SEMI-STRUCTURED DATA TO GENERATE ONTOLOGICAL GRAPH'.

The abstract describes a system and method for generating a cybersecurity behavioral graph from log files and telemetry data using machine learning models and a cybersecurity ontology.

  • Simplified Explanation:

This patent application describes a method for creating a cybersecurity behavioral graph from unstructured or semi-structured data by applying machine learning models to log files and other telemetry data.

  • Key Features and Innovation:

- Utilizes a machine learning model to extract entities and relationships from log files.
- Constrains entities and relationships using a cybersecurity ontology.
- Maps extracted entities to nodes and relationships to edges to generate a graph.
- Uses a large language model to generate regular expressions for parsing log files efficiently.
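The constraint and mapping steps listed above can be illustrated with a minimal Python sketch. It is not the applicants' implementation. The toy ontology, the `Entity` and `Graph` classes, the `build_graph` function, and the sample triples are all hypothetical names chosen for illustration, and the `extracted` list stands in for the output of a machine learning model, which is not called here.

```python
from dataclasses import dataclass, field

# Assumption: a toy cybersecurity ontology expressed as the set of allowed
# (source type, relationship, target type) patterns. A real ontology or
# schema would be far richer.
ONTOLOGY_RELATIONS = {
    ("User", "started", "Process"),
    ("Process", "wrote", "File"),
    ("Process", "connected_to", "IPAddress"),
}


@dataclass(frozen=True)
class Entity:
    """A typed entity extracted from a log file (hypothetical structure)."""
    type: str
    name: str


@dataclass
class Graph:
    """A minimal graph: entities become nodes, relationships become edges."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)


def build_graph(triples):
    """Keep only triples the ontology allows, then map them to nodes and edges."""
    graph = Graph()
    for source, relation, target in triples:
        if (source.type, relation, target.type) not in ONTOLOGY_RELATIONS:
            continue  # drop results that are not meaningful under the ontology
        graph.nodes.update((source, target))
        graph.edges.add((source, relation, target))
    return graph


# Assumption: stand-in for entities and relationships produced by a model.
extracted = [
    (Entity("User", "alice"), "started", Entity("Process", "powershell.exe")),
    (Entity("Process", "powershell.exe"), "connected_to", Entity("IPAddress", "203.0.113.7")),
    (Entity("Process", "powershell.exe"), "likes", Entity("File", "notes.txt")),  # not in ontology
]

graph = build_graph(extracted)
```

In this sketch the third triple is discarded because its relationship does not appear in the toy ontology, so the resulting graph holds only the entities and relationships that the ontology recognizes.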

  • Potential Applications:

- Enhancing cybersecurity threat detection and analysis.
- Improving incident response and forensic investigations.
- Streamlining data analysis processes in cybersecurity operations.

  • Problems Solved:

- Efficiently extracting meaningful information from unstructured data.
- Enhancing the understanding of cybersecurity events and relationships.
- Automating the generation of cybersecurity behavioral graphs.

  • Benefits:

- Improved cybersecurity threat intelligence.
- Enhanced visualization of cybersecurity data.
- Increased efficiency in cybersecurity analysis and response.

  • Commercial Applications:

- Cybersecurity companies for threat detection and analysis tools.
- Government agencies for national security and defense purposes.
- Enterprises for protecting sensitive data and networks.

  • Questions about Cybersecurity Behavioral Graphs:

1. How does the use of a cybersecurity ontology improve the relevance of the extracted entities and relationships?
2. What are the potential limitations of using machine learning models for generating cybersecurity behavioral graphs?

  • Frequently Updated Research:

- Stay updated on advancements in machine learning models for cybersecurity data analysis.
- Monitor developments in cybersecurity ontologies for improved data interpretation.


Original Abstract Submitted

A system and method are provided for generating a cybersecurity behavioral graph from log files and/or other telemetry data, which can be unstructured or semi-structured data. The log files are applied to a machine learning (ML) model (e.g., a large language model (LLM)) that generates/extracts from the log files entities and relationships between said entities. The entities and relationships can be constrained using a cybersecurity ontology or schema to ensure that the results are meaningful to a cybersecurity context. A graph is then generated by mapping the extracted entities to nodes in the graph and the relationships to edges connecting nodes. To more efficiently extract the entities and relationships from the data file, an LLM is used to generate regular expressions for the format of the log files. Once generated, the regular expressions can rapidly parse the log files to extract the entities and relationships.
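The regular-expression step in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions. The log format, the `LOG_PATTERN` expression, the relationship names, and the `parse_line` and `parse_log` functions are hypothetical and are not taken from the application. `LOG_PATTERN` stands in for an expression that an LLM would produce once for a given log format. No LLM is called here.

```python
import re

# Assumption: stand-in for a regular expression that an LLM might generate
# for this made-up log format. Named groups label the extracted entities.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) user=(?P<user>\S+) process=(?P<process>\S+) "
    r"action=connect dst=(?P<dst_ip>\d{1,3}(?:\.\d{1,3}){3})$"
)


def parse_line(line):
    """Return (source, relationship, target) triples for one log line."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return []  # lines the pattern does not cover are skipped
    fields = match.groupdict()
    user = ("User", fields["user"])
    process = ("Process", fields["process"])
    address = ("IPAddress", fields["dst_ip"])
    # Assumption: the relationship names below are illustrative only.
    return [(user, "ran", process), (process, "connected_to", address)]


def parse_log(lines):
    """Map extracted entities to nodes and relationships to edges."""
    nodes, edges = set(), set()
    for line in lines:
        for source, relation, target in parse_line(line):
            nodes.update((source, target))
            edges.add((source, relation, target))
    return nodes, edges


sample_log = [
    "2024-10-08T11:51:00Z user=alice process=curl action=connect dst=203.0.113.7",
    "2024-10-08T11:52:10Z user=alice process=curl action=connect dst=198.51.100.4",
    "malformed line that the pattern does not cover",
]

nodes, edges = parse_log(sample_log)
```

The point of the design is that the model is consulted once per log format to obtain the expression, after which every line is handled by ordinary regular-expression matching, which is much cheaper than applying the model to each line.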