Document-based synonym generation

  • US 7,890,521 B1
  • Filed: 02/07/2008
  • Issued: 02/15/2011
  • Est. Priority Date: 02/07/2007
  • Status: Active Grant
First Claim

1. A method for automatically determining whether pairs of words are synonym pairs, comprising:

    determining co-occurrence frequencies for pairs of words in documents;

    determining closeness scores for pairs of words in the documents, wherein a closeness score indicates whether words in a pair of words are located so close to each other in the documents that the words are likely to occur in the same sentence or phrase;

    generating correlations between (i) words in a title or an anchor of each document in the documents and (ii) words in the document; and

    determining whether each of the pairs of words is a synonym pair based on the determined co-occurrence frequencies, the determined closeness scores, and the correlations, wherein a high closeness score is a negative indicator that a pair of words is a synonym pair, a high co-occurrence frequency is a positive indicator that the pair of words is a synonym pair, and a high correlation is a positive indicator that the pair of words is a synonym pair;

    wherein the determining co-occurrence frequencies, determining closeness scores, generating correlations, and determining whether each of the pairs of words is a synonym pair is performed by a computer system.
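
The claim names three signals and the direction in which each one points, but it does not give formulas. The following Python sketch is one possible reading, not the patented implementation. Everything concrete in it is an assumption made for illustration: the document layout (`title` and `body` token lists), the per-document definitions of each signal, the `window` parameter, the linear combination in `synonym_score`, and the sample documents.

```python
from collections import Counter
from itertools import combinations


def pair(a, b):
    """Canonical (alphabetically sorted) key for an unordered word pair."""
    return tuple(sorted((a, b)))


def cooccurrence(docs):
    """Assumed definition: fraction of documents whose body contains both words."""
    counts = Counter()
    for doc in docs:
        counts.update(combinations(sorted(set(doc["body"])), 2))
    return {p: c / len(docs) for p, c in counts.items()}


def closeness(docs, window=3):
    """Assumed definition: among documents containing both words, the fraction
    in which they occur within `window` tokens of each other."""
    near, both = Counter(), Counter()
    for doc in docs:
        positions = {}
        for i, word in enumerate(doc["body"]):
            positions.setdefault(word, []).append(i)
        for a, b in combinations(sorted(positions), 2):
            both[(a, b)] += 1
            if any(abs(i - j) <= window
                   for i in positions[a] for j in positions[b]):
                near[(a, b)] += 1
    return {p: near[p] / both[p] for p in both}


def title_correlation(docs):
    """Assumed definition: fraction of documents in which one word is in the
    title (or anchor text) and the other is in the body."""
    counts = Counter()
    for doc in docs:
        title, body = set(doc["title"]), set(doc["body"])
        counts.update({pair(t, w) for t in title for w in body if t != w})
    return {p: c / len(docs) for p, c in counts.items()}


def synonym_score(p, co, close, corr):
    """Hypothetical linear combination: co-occurrence and title correlation
    count for the pair, closeness counts against it."""
    return co.get(p, 0.0) - close.get(p, 0.0) + corr.get(p, 0.0)


# Made-up sample documents: "car" and "automobile" share documents but sit far
# apart, while "new" and "york" are always adjacent (a phrase, not synonyms).
DOCS = [
    {"title": ["car"],
     "body": "car dealers in new york sell every kind of used automobile".split()},
    {"title": ["automobile"],
     "body": "a new york automobile show displays more than one vintage car".split()},
]

co, close, corr = cooccurrence(DOCS), closeness(DOCS), title_correlation(DOCS)
car_auto = synonym_score(pair("car", "automobile"), co, close, corr)
new_york = synonym_score(pair("new", "york"), co, close, corr)
print(car_auto > new_york)
```

Under these assumptions the pair ("automobile", "car") outscores ("new", "york"): both pairs co-occur in every sample document, but the second pair is penalised for always appearing side by side and gets no support from titles. A real system would still need a decision threshold and tuned weights, neither of which the claim specifies.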
