Natural language processing system for semantic vector representation which accounts for lexical ambiguity

  • US 5,873,056 A
  • Filed: 10/12/1993
  • Issued: 02/16/1999
  • Est. Priority Date: 10/12/1993
  • Status: Expired due to Term
First Claim

1. A method of generating a subject field code vector representation of a document which comprises the steps of:

  • assigning subject codes to each of the words of the document which codes express the semantic content of the document, said codes corresponding to the meanings of each of said words in accordance with the various senses thereof;

  • disambiguating said document to select a specific subject code for each of said words heuristically in order first from the occurrence of like codes within each sentence of said documents which occur uniquely and at or with greater than a certain frequency within each sentence, then second correlating the codes for each word with the codes occurring uniquely (unique code) and with greater than or equal to the given frequency in the sentence to select for each word the code having the highest correlation, and then third in accordance with the frequency of usage of the meaning of the word represented by the code; and

  • arranging said codes into a weighted vector representing the content of said document.
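The three-stage heuristic in the claim can be illustrated with a small Python sketch. This is one possible reading of the claim language, not the patented implementation. The names `disambiguate_sentence`, `sfc_vector`, `lexicon`, `correlation`, and `min_freq` are hypothetical, as are the assumptions that the lexicon lists each word's codes in order of frequency of usage, that the correlation table is keyed by unordered code pairs, and that the vector weights are normalized code counts.

```python
from collections import Counter

def disambiguate_sentence(words, lexicon, correlation, min_freq=2):
    """Pick one subject code per known word in a sentence.

    lexicon: word -> list of codes, most frequently used sense first (assumed).
    correlation: frozenset({code_a, code_b}) -> score (assumed table).
    min_freq: threshold for a code to count as frequent (assumed value).
    """
    candidates = [lexicon.get(w, []) for w in words]

    # Anchor codes: codes of single-sense words ("unique" codes) plus
    # codes carried by at least min_freq words of the sentence.
    unique = {c[0] for c in candidates if len(c) == 1}
    counts = Counter(code for c in candidates for code in set(c))
    frequent = {code for code, n in counts.items() if n >= min_freq}
    anchors = unique | frequent

    def best_correlation(code):
        return max(
            (correlation.get(frozenset((code, a)), 0.0)
             for a in anchors if a != code),
            default=0.0,
        )

    selected = []
    for cands in candidates:
        if not cands:
            continue  # word has no entry in the lexicon
        # Stage 1: a candidate that matches an anchor code wins outright.
        local = [c for c in cands if c in anchors]
        if len(local) == 1:
            selected.append(local[0])
            continue
        # Stage 2: otherwise take the candidate most correlated with an anchor.
        pool = local or cands
        scores = [best_correlation(c) for c in pool]
        top = max(scores)
        if top > 0.0:
            selected.append(pool[scores.index(top)])
            continue
        # Stage 3: fall back to the most frequently used sense.
        selected.append(pool[0])
    return selected

def sfc_vector(sentences, lexicon, correlation, min_freq=2):
    """Arrange the selected codes into a normalized weight vector (a dict)."""
    totals = Counter()
    for words in sentences:
        totals.update(disambiguate_sentence(words, lexicon, correlation, min_freq))
    n = sum(totals.values())
    return {code: count / n for code, count in totals.items()} if n else {}
```

With a toy lexicon such as `{"bank": ["EC", "GO"], "loan": ["EC"]}`, the sentence `["bank", "loan"]` resolves "bank" to `EC` in the first stage, because `EC` is the unique code of "loan". Ties in the second stage go to the earlier-listed code, so the frequency-of-usage ordering also acts as the tie-breaker.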
