Method of text similarity measurement

  • US 7,346,491 B2
  • Filed: 01/04/2001
  • Issued: 03/18/2008
  • Est. Priority Date: 01/04/2001
  • Status: Expired due to Fees
First Claim

1. A method for estimating the similarity between at least two portions of text, said method comprising the steps of:

    receiving said at least two portions of text;

    forming a set of syntactic tuples from said portions of text, each tuple comprising two terms and a relation between the two terms;

    classifying the relation between the terms in the tuples according to a predefined set of relations;

    predefining classes of agreement between tuples under comparison, comprising a class of full agreement wherein tuples under comparison are identical, a class of partial agreement wherein only two of corresponding elements in tuples under comparison are identical, and a class of term agreement wherein only one of corresponding terms in tuples under comparison are identical;

    determining a respective class of relative agreement between each pair of syntactic tuples from the portions of text under comparison according to the predefined classes of agreement;

    calculating a value representative of the similarity between the portions of text for each of the classes of agreement, based on the plurality of tuples determined to belong to the respective class of agreement; and

    determining and outputting a measure of the similarity between the portions of text by calculating a weighted sum of the values representative of the similarity between the portions of text for each of the classes of agreement.
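The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the syntactic tuples have already been extracted by some parser as `(term1, relation, term2)` triples. It reads "term agreement" as exactly one matching element that is a term rather than the relation. It takes the per-class value to be the fraction of tuple pairs falling in that class. The weights and the names `WEIGHTS`, `agreement_class`, and `similarity` are hypothetical; the claim specifies none of them.

```python
from itertools import product

# Hypothetical weights; the claim does not give concrete values.
WEIGHTS = {"full": 1.0, "partial": 0.5, "term": 0.25}


def agreement_class(a, b):
    """Classify a pair of (term1, relation, term2) tuples.

    Returns "full", "partial", "term", or None when no class applies.
    """
    term1_same = a[0] == b[0]
    relation_same = a[1] == b[1]
    term2_same = a[2] == b[2]
    matches = term1_same + relation_same + term2_same

    if matches == 3:
        return "full"        # tuples are identical
    if matches == 2:
        return "partial"     # exactly two corresponding elements match
    if matches == 1 and (term1_same or term2_same):
        return "term"        # only one corresponding term matches (assumed reading)
    return None


def similarity(tuples_a, tuples_b, weights=WEIGHTS):
    """Weighted sum of per-class values over all cross-text tuple pairs."""
    total_pairs = len(tuples_a) * len(tuples_b)
    if total_pairs == 0:
        return 0.0

    counts = {name: 0 for name in weights}
    for a, b in product(tuples_a, tuples_b):
        cls = agreement_class(a, b)
        if cls is not None:
            counts[cls] += 1

    # Assumed per-class value: share of all pairs that fall in the class.
    values = {name: counts[name] / total_pairs for name in counts}
    return sum(weights[name] * values[name] for name in weights)


if __name__ == "__main__":
    text_a = [("dog", "subj", "bark"), ("dog", "mod", "big")]
    text_b = [("dog", "subj", "bark"), ("cat", "subj", "bark"), ("dog", "obj", "chase")]
    print(similarity(text_a, text_b))
```

Comparing every tuple of one text against every tuple of the other is the simplest reading of "each pair of syntactic tuples". Other normalizations, such as dividing by the size of the smaller tuple set, would fit the claim language equally well.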
