
Document information retrieval using global word co-occurrence patterns

  • US 5,675,819 A
  • Filed: 06/16/1994
  • Issued: 10/07/1997
  • Est. Priority Date: 06/16/1994
  • Status: Expired due to Term
First Claim

1. A method, using a processor and memory, for generating a thesaurus of word vectors based on lexical co-occurrence of words within documents of a corpus of documents, the corpus stored in the memory, the method comprising:

  • retrieving into the processor a retrieved word from the corpus;

    recording a number of times a co-occurring word co-occurs in a same document within a predetermined range of the retrieved word;

    repeating the recording step for every co-occurring word located within the predetermined range for each occurrence of the retrieved word in the corpus;

    generating a word vector for the word based on every recorded number;

    repeating the retrieving, recording, recording repeating and generating steps for each word in the corpus; and

    storing the generated word vectors in the memory as the thesaurus.
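
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `build_thesaurus`, the `window` parameter standing in for the "predetermined range", the pre-tokenized input, and the use of raw counts as vector components are all assumptions made for illustration; the claim itself does not fix any of them.

```python
from collections import Counter, defaultdict


def build_thesaurus(corpus, window=2):
    """Build co-occurrence word vectors (hypothetical helper).

    corpus: list of documents, each a list of word tokens.
    window: assumed stand-in for the claim's "predetermined range",
            counted in tokens on either side of the retrieved word.
    Returns (vocab, thesaurus), where thesaurus maps each word to a
    list of counts aligned with the sorted vocabulary.
    """
    counts = defaultdict(Counter)
    for doc in corpus:
        # Visit each occurrence of each word in the document.
        for i, word in enumerate(doc):
            row = counts[word]  # ensures every word gets an entry
            lo = max(0, i - window)
            hi = min(len(doc), i + window + 1)
            # Record every co-occurring word inside the window,
            # staying within the same document.
            for j in range(lo, hi):
                if j != i:
                    row[doc[j]] += 1

    # Generate one vector per word from the recorded counts.
    vocab = sorted(counts)
    index = {w: k for k, w in enumerate(vocab)}
    thesaurus = {}
    for word in vocab:
        vec = [0] * len(vocab)
        for other, n in counts[word].items():
            vec[index[other]] = n
        thesaurus[word] = vec
    return vocab, thesaurus


if __name__ == "__main__":
    docs = [["a", "b", "c"], ["a", "b"]]
    vocab, thesaurus = build_thesaurus(docs, window=1)
    for word in vocab:
        print(word, thesaurus[word])
```

Dense lists are used here only to keep the sketch short; for a realistic vocabulary a sparse representation would be the more practical choice.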
