17942174. HEURISTIC IDENTIFICATION OF SHARED SUBSTRINGS BETWEEN TEXT DOCUMENTS simplified abstract (MICROSOFT TECHNOLOGY LICENSING, LLC)

From WikiPatents

Organization Name

MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor(s)

Mary Elizabeth Wahl of Pasadena CA (US)

Amanda Leah Mercier of Oakton VA (US)

George Taylor Corbett of Rockville MD (US)

HEURISTIC IDENTIFICATION OF SHARED SUBSTRINGS BETWEEN TEXT DOCUMENTS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17942174, titled 'HEURISTIC IDENTIFICATION OF SHARED SUBSTRINGS BETWEEN TEXT DOCUMENTS'.

Simplified Explanation

The patent application describes technologies for document evaluation and identification of shared textual substrings between documents. Here is a simplified explanation of the abstract:

  • A suffix index is generated from a reference document.
  • The suffix index is used to identify common substrings of text within query documents.
  • The search is performed using variable evaluation windows within the query documents.
  • Indications of overlapping textual information between the reference document and query documents are generated as an output.
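
The steps above can be illustrated with a minimal sketch. It is not the method claimed in the application, since the abstract does not say how the suffix index is built or how the evaluation windows vary. This sketch assumes a plain sorted suffix array as the index and a window that starts at a minimum length and grows while the text still matches. The names `build_suffix_index`, `contains`, `shared_substrings`, and `min_len` are hypothetical and chosen for illustration only.

```python
def build_suffix_index(text: str) -> list[int]:
    """Return the start offsets of every suffix of `text`, sorted lexicographically.

    Naive construction (assumption: short documents); real systems would use
    a faster suffix-array or suffix-tree algorithm.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(reference: str, index: list[int], pattern: str) -> bool:
    """Binary-search the suffix index for a suffix that starts with `pattern`."""
    m = len(pattern)
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        # Compare only the first m characters of the candidate suffix.
        if reference[index[mid]:index[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(index) and reference[index[lo]:index[lo] + m] == pattern


def shared_substrings(reference: str, query: str, min_len: int = 5) -> list[tuple[int, str]]:
    """Report (query offset, substring) pairs that also occur in `reference`.

    Assumed interpretation of a "variable evaluation window": start with a
    window of `min_len` characters and widen it while it still matches.
    """
    index = build_suffix_index(reference)
    matches = []
    start = 0
    while start + min_len <= len(query):
        end = start + min_len
        if not contains(reference, index, query[start:end]):
            start += 1  # no match at this offset; slide the window
            continue
        # Widen the window one character at a time while it still matches.
        while end < len(query) and contains(reference, index, query[start:end + 1]):
            end += 1
        matches.append((start, query[start:end]))
        start = end  # resume after the reported overlap
    return matches


reference_doc = "the quick brown fox jumps over the lazy dog"
query_doc = "a quick brown cat and the lazy dog sleep"
overlaps = shared_substrings(reference_doc, query_doc)
for offset, text in overlaps:
    print(offset, repr(text))
```

In this sketch the reference document is indexed once, and each query document is then scanned against that index, which mirrors the reference/query split described in the abstract. The output list plays the role of the "indications of overlapping textual information".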

Potential Applications

This technology could be applied in plagiarism detection software, document comparison tools, and content management systems.

Problems Solved

This technology helps in efficiently identifying shared textual information between documents, aiding in content analysis and comparison.

Benefits

The technology streamlines the process of document evaluation and comparison, saving time and effort for users. It enhances accuracy in identifying similarities between texts.

Potential Commercial Applications

"Textual Substring Identification Technology: Enhancing Document Comparison and Analysis"

Possible Prior Art

Prior art in this field includes existing document comparison tools, plagiarism detection software, and text analysis algorithms.

Unanswered Questions

How does this technology handle different languages in documents?

The technology's ability to identify shared textual substrings across documents in different languages is not explicitly mentioned in the abstract. It would be interesting to know if the system has language detection capabilities and how it handles multilingual documents.

Can this technology be integrated with existing document management systems?

The abstract does not specify if this technology can be easily integrated with existing document management systems. Understanding the compatibility and integration process with other software solutions would be beneficial for potential users looking to implement this technology.


Original Abstract Submitted

Technologies for document evaluation and identification of shared textual substrings between documents are described herein. Documents are evaluated and organized according to textual elements within the documents. A suffix index is generated from a reference document. The suffix index is used to identify common substrings of text within query documents using variable evaluation windows within the query documents. Indications of overlapping textual information between the reference document and query documents is generated as an output.