Word sense disambiguation

  • US 7,024,407 B2
  • Filed: 08/24/2001
  • Issued: 04/04/2006
  • Est. Priority Date: 08/24/2000
  • Status: Expired due to Term
First Claim

1. In a collection of n documents and a reference collection, each document containing terms, the reference collection containing at least one meaning associated with a term, the total number of terms occurring at least once in the document collection equal to at least m, a computer-implemented method for determining a meaning for a sense of a subject term, the subject term found in at least one document and associated with at least one meaning, the computer-implemented method comprising:

  • forming an m by n matrix, where each matrix element (i, j) corresponds to the number of occurrences of term i in document j;

  • performing singular value decomposition and dimensionality reduction on the matrix to form a latent semantic indexed vector space;

  • determining at least one cluster of documents within the vector space, each cluster corresponding to a subset of the document collection, each member of the subset having at least one occurrence of a subject term;

  • discerning an implicit position of a sense of the subject term, each implicit position corresponding to at least one determined cluster;

  • discerning at least one non-subject term within the vicinity of the implicit position of the sense; and

  • assigning to the sense having a discerned implicit position, the meaning, associated with the term in the reference collection, that correlates best with the discerned non-subject terms closest to the implicit position of the sense.
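
The steps of the claim can be illustrated with a small sketch. It is not the patented implementation: every function name, the toy corpus, and the reference dictionary are hypothetical. It also makes three assumptions the claim does not specify: a simple deterministic k-means stands in for the clustering step, each cluster centroid is taken as the "implicit position" of a sense, and "correlates best" is approximated by word overlap between the nearest terms and a list of cue words for each meaning.

```python
import numpy as np

def build_matrix(docs):
    """m-by-n count matrix: element (i, j) = occurrences of term i in document j."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d:
            A[index[t], j] += 1
    return A, index

def lsi(A, k):
    """SVD truncated to k dimensions; returns term and document coordinates."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k, :].T * s[:k]

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def cluster(vectors, n_clusters, iters=20):
    """Assumed stand-in clustering: k-means with farthest-point initialisation."""
    centers = [vectors[0]]
    while len(centers) < n_clusters:
        d = np.min([np.linalg.norm(vectors - c, axis=1) for c in centers], axis=0)
        centers.append(vectors[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = vectors[labels == c].mean(axis=0)
    return labels, centers

def disambiguate(docs, subject, reference, k=2, n_senses=2, n_near=3):
    """Return one (document ids, nearest terms, assigned meaning) tuple per sense."""
    A, index = build_matrix(docs)
    term_vecs, doc_vecs = lsi(A, k)
    # Only documents containing the subject term are clustered.
    has_subject = [j for j, d in enumerate(docs) if subject in d]
    labels, centers = cluster(doc_vecs[has_subject], n_senses)
    results = []
    for c, center in enumerate(centers):
        # Assumption: the cluster centroid is the implicit position of the sense.
        sims = [(cosine(term_vecs[i], center), t)
                for t, i in index.items() if t != subject]
        near = [t for _, t in sorted(sims, reverse=True)[:n_near]]
        # Assumption: "correlates best" is approximated by cue-word overlap.
        meanings = reference[subject]
        best = max(meanings, key=lambda m: len(set(meanings[m]) & set(near)))
        members = [has_subject[i] for i in np.flatnonzero(labels == c)]
        results.append((members, near, best))
    return results

# Hypothetical toy data: six tokenised documents and a two-meaning reference entry.
docs = [
    ["bank", "money", "loan"], ["bank", "loan", "interest"],
    ["bank", "money", "interest"], ["bank", "river", "water"],
    ["bank", "water", "fishing"], ["bank", "river", "fishing"],
]
reference = {"bank": {
    "financial institution": ["money", "loan", "deposit"],
    "river edge": ["river", "water", "shore"],
}}

for members, near, meaning in disambiguate(docs, "bank", reference):
    print(members, near, meaning)
```

In this sketch the subject term itself is excluded from the neighbour search, matching the claim's reference to non-subject terms. A real system would likely need term weighting, a much larger corpus, and a more careful correlation measure than plain overlap.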

  • 11 Assignments