18335478. ADAPTIVE TF-IDF INFERENCE ENGINE (INTERNATIONAL BUSINESS MACHINES CORPORATION)
ADAPTIVE TF-IDF INFERENCE ENGINE
Organization Name
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor(s)
Dominic Rossillo of Highland NY (US)
Michael Terrence Cohoon of Fishkill NY (US)
Steven Lafalce of Salt Point NY (US)
James A. O'Connor of Ulster Park NY (US)
This abstract first appeared for US patent application 18335478, titled 'ADAPTIVE TF-IDF INFERENCE ENGINE'.
Original Abstract Submitted
In an approach, a processor preprocesses a corpus of documents of a given subject matter by: scanning each document in the corpus to identify stop words, each of which is either a high-occurrence word that appears in at least a first pre-set threshold number of documents or a low-occurrence word that appears in fewer than a second pre-set threshold number of documents; adding the stop words to a list of stop words; performing a spellcheck function on the corpus of documents; scanning each document in the corpus to identify subject matter relevant (SMR) words based on the given subject matter; adding the identified SMR words to a list of SMR words; and assigning a weight to each identified SMR word based on a term frequency. A processor performs a similarity assessment on the corpus using the list of stop words and the list of SMR words with associated weights.
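The preprocessing and similarity steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the tokenizer, the hypothetical `smr_vocabulary` set, the way SMR weights boost term counts, and the use of cosine similarity are all choices made here for the example. The abstract does not specify any of them. The spellcheck step is left out because the abstract gives no detail about it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def find_stop_words(docs_tokens, high_threshold, low_threshold):
    """Stop words: in >= high_threshold documents or in < low_threshold documents."""
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each word once per document
    return {w for w, df in doc_freq.items()
            if df >= high_threshold or df < low_threshold}


def weight_smr_words(docs_tokens, smr_vocabulary):
    """Weight each SMR word found in the corpus by its relative term frequency."""
    counts = Counter(t for tokens in docs_tokens for t in tokens)
    total = sum(counts.values())
    return {w: counts[w] / total for w in smr_vocabulary if counts[w] > 0}


def vectorize(tokens, stop_words, smr_weights):
    """Drop stop words; boost SMR words by (1 + weight). The boost is an assumption."""
    vec = Counter()
    for t in tokens:
        if t in stop_words:
            continue
        vec[t] += 1.0 + smr_weights.get(t, 0.0)
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


# Toy corpus on a made-up subject (hardware fault reports).
docs = [
    "the disk reported read error",
    "the disk reported write error",
    "the fan speed dropped",
    "the memory reported parity error",
]
docs_tokens = [tokenize(d) for d in docs]

# First threshold: words in at least 4 documents; second: words in fewer than 2.
stop_words = find_stop_words(docs_tokens, high_threshold=4, low_threshold=2)

# Hypothetical subject-matter vocabulary; the source does not say how SMR words are found.
smr_vocabulary = {"disk", "error", "memory"}
smr_weights = weight_smr_words(docs_tokens, smr_vocabulary)

vectors = [vectorize(t, stop_words, smr_weights) for t in docs_tokens]
for i in range(1, len(docs)):
    print(f"similarity(doc0, doc{i}) = {cosine(vectors[0], vectors[i]):.3f}")
```

In this toy run, the first two reports differ only in words that occur in a single document ("read", "write"), so they are treated as stop words and the two documents come out as identical, while the fan report shares no retained words with the others.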