Methods and systems for identifying paraphrases from an index of information items and associated sentence fragments

  • US 8,280,893 B1
  • Filed: 05/02/2011
  • Issued: 10/02/2012
  • Est. Priority Date: 03/23/2005
  • Status: Expired due to Fees
First Claim

1. A system comprising:

  • one or more computers programmed to implement a paraphrase engine to create an index of paraphrases, paraphrases being groups of one or more words in a same language, the groups having a same or a similar meaning but not being identical, the paraphrase engine to perform operations including:

    identifying a first sentence fragment and a second sentence fragment that, in text of one or more electronic documents, are both associated with a same date or entity name, the first sentence fragment and the second sentence fragment each comprising two or more like-words found in both the first sentence fragment and the second sentence fragment and one or more dissimilar-words found in only one of the first sentence fragment or the second sentence fragment;

    aligning the like-words in the first sentence fragment with the like-words in the second sentence fragment;

    determining that the alignment satisfies a threshold frequency value; and

    in response to determining that the alignment satisfies the threshold frequency value, extracting the dissimilar-words from the first sentence fragment and the dissimilar-words from the second sentence fragment; and

    outputting the index of paraphrases after the extracting, the index identifying the dissimilar-words from the first sentence fragment as paraphrasing the dissimilar-words from the second sentence fragment.
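
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way the steps could fit together. Several things are assumptions made for illustration: the function names `align` and `build_paraphrase_index`, the input shape (pairs of an anchor string, standing in for a date or entity name, and a sentence fragment), whitespace tokenization, the use of the standard library's `difflib.SequenceMatcher` as the alignment method, and the reading of the "threshold frequency value" as the number of anchor groups in which the same substitution is observed (`min_count`).

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def align(a, b):
    """Align two token lists.

    Returns the number of like-words (tokens in matching runs) and the
    list of (span_from_a, span_from_b) pairs of dissimilar-words that
    sit between those matching runs.
    """
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    like, swaps = 0, []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            like += i2 - i1
        elif tag == "replace":
            swaps.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return like, swaps


def build_paraphrase_index(tagged_fragments, min_count=2):
    """Build a paraphrase index from (anchor, fragment) pairs.

    `anchor` stands in for the shared date or entity name. `min_count`
    is an assumed interpretation of the threshold frequency value.
    """
    # Group fragments that share the same date or entity name.
    by_anchor = defaultdict(list)
    for anchor, fragment in tagged_fragments:
        by_anchor[anchor].append(fragment.lower().split())

    # Count how often each substitution appears in a valid alignment.
    counts = Counter()
    for fragments in by_anchor.values():
        for a, b in combinations(fragments, 2):
            like, swaps = align(a, b)
            if like >= 2:  # the claim requires two or more like-words
                for left, right in swaps:
                    counts[tuple(sorted((left, right)))] += 1

    # Keep only substitutions that meet the threshold.
    index = defaultdict(set)
    for (left, right), n in counts.items():
        if n >= min_count:
            index[left].add(right)
            index[right].add(left)
    return dict(index)


# Hypothetical sample data, invented for this sketch.
sample = [
    ("2004-05-01", "the company acquired the startup"),
    ("2004-05-01", "the company bought the startup"),
    ("Acme Corp", "Acme Corp acquired a rival firm"),
    ("Acme Corp", "Acme Corp bought a rival firm"),
    ("Acme Corp", "shares fell sharply"),
]
index = build_paraphrase_index(sample, min_count=2)
```

In this sample, "acquired" and "bought" are the only dissimilar-words surrounded by aligned like-words, and the substitution occurs under two different anchors, so it meets the assumed threshold of 2. The fragment "shares fell sharply" shares no words with the others and contributes nothing to the index.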
