Application-specific method and apparatus for assessing similarity between two data objects

  • US 6,917,952 B1
  • Filed: 11/20/2001
  • Issued: 07/12/2005
  • Est. Priority Date: 05/26/2000
  • Status: Active Grant
First Claim

1. A computer-based method for assessing similarity between two data objects, comprising the steps of:

    a. training a first predictive model with a first set of data objects of type X and matched data objects of type Y;

    b. using said first predictive model to assess compatibility between each of a plurality of X,Y pairs, wherein for each X,Y pair, each X is a member of a second set of data objects of type X and each Y is a member of a second set of data objects of type Y;

    c. assigning an X,Y compatibility score to each X,Y pair;

    d. comparing the X,Y compatibility scores of each member of the second set of data objects of type X with each other member of the second set of data objects of type X;

    e. pairing each member of the second set of data objects of type X with selected other members of the second set of data objects of type X having similar X,Y compatibility scores to identify a first plurality of X,X pairs, said first plurality of X,X pairs being matched pairs for training a second predictive model;

    f. selecting other ones of the second set of data objects of type X that do not have as similar compatibility scores as the matched pairs to identify a second plurality of X,X pairs, said second plurality of X,X pairs being distracters for training said second predictive model;

    g. deriving a respective set of variables from each member of the second set of data objects of type X;

    h. comparing the respective set of variables derived from each X,X matched pair and from each X,X distracter pair to determine a set of X,X comparisons;

    i. training a second predictive model with said set of X,X comparisons;

    j. receiving two data objects of type X that are not in either the first training dataset or second training dataset;

    k. deriving respective variables from each of said two data objects of type X;

    l. comparing the respective variables derived from each of said two data objects of type X to determine a production X,X comparison; and

    m. running said production X,X comparison through said second predictive model to calculate a similarity score for said two data objects of type X.
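
The steps of claim 1 can be sketched in code as a two-stage pipeline. The following is only an illustrative toy under stated assumptions, not the patented implementation: the claim does not specify any model type, so both "predictive models" here are hypothetical stand-ins (a learned mean offset for the first model, a nearest-centroid scorer for the second), data objects are assumed to be small numeric tuples, and every function name, the `threshold` parameter, and the sample data are invented for illustration.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def train_first_model(matched_xy):
    """Step a (stand-in model): learn the mean offset y - x from matched pairs."""
    n = len(matched_xy)
    dims = len(matched_xy[0][0])
    offset = tuple(sum(y[d] - x[d] for x, y in matched_xy) / n for d in range(dims))

    def compatibility(x, y):
        # Scores are <= 0; closer to 0 means x + offset lands nearer to y.
        predicted = tuple(xi + oi for xi, oi in zip(x, offset))
        return -dist(predicted, y)

    return compatibility


def score_profiles(compat, xs, ys):
    """Steps b-c: one compatibility score per X,Y pair, grouped by X."""
    return [tuple(compat(x, y) for y in ys) for x in xs]


def split_pairs(profiles, threshold):
    """Steps d-f: X,X pairs with similar score profiles are matched pairs;
    all remaining pairs serve as distracters. `threshold` is an assumption."""
    matched, distracters = [], []
    for i, j in combinations(range(len(profiles)), 2):
        close = dist(profiles[i], profiles[j]) <= threshold
        (matched if close else distracters).append((i, j))
    return matched, distracters


def derive_variables(x):
    """Steps g and k: placeholder feature derivation (identity)."""
    return tuple(x)


def compare(vars_a, vars_b):
    """Steps h and l: element-wise absolute differences."""
    return tuple(abs(a - b) for a, b in zip(vars_a, vars_b))


def centroid(vectors):
    n = len(vectors)
    return tuple(sum(v[d] for v in vectors) / n for d in range(len(vectors[0])))


def train_second_model(xs, matched, distracters):
    """Step i (stand-in model): nearest-centroid scorer over X,X comparisons.
    Assumes both pair lists are non-empty."""
    feats = [derive_variables(x) for x in xs]
    c_match = centroid([compare(feats[i], feats[j]) for i, j in matched])
    c_distr = centroid([compare(feats[i], feats[j]) for i, j in distracters])

    def similarity(xa, xb):
        # Steps k-m: derive variables, compare, score in [0, 1].
        comp = compare(derive_variables(xa), derive_variables(xb))
        d_m, d_d = dist(comp, c_match), dist(comp, c_distr)
        return 0.5 if d_m + d_d == 0 else d_d / (d_m + d_d)

    return similarity


# Invented sample data: first training set (matched X,Y pairs).
matched_xy = [((0.0, 0.0), (1.0, 1.0)), ((2.0, 1.0), (3.0, 2.0)), ((5.0, 5.0), (6.0, 6.0))]
# Second sets of type X and type Y objects.
xs2 = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
ys2 = [(1.0, 1.0), (6.0, 6.0), (3.0, 0.0)]

compat = train_first_model(matched_xy)
profiles = score_profiles(compat, xs2, ys2)
matched, distracters = split_pairs(profiles, threshold=1.0)
similarity = train_second_model(xs2, matched, distracters)

# Step j: two pairs of type X objects that appear in neither training set.
close_score = similarity((1.0, 1.0), (1.1, 1.2))
far_score = similarity((1.0, 1.0), (6.0, 5.5))
print(matched, round(close_score, 3), round(far_score, 3))
```

In this sketch the second model never sees any type Y data at scoring time: the X,Y compatibility scores are used only to label X,X pairs as matched or distracter, which mirrors the way steps d through f feed step i in the claim.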
