Method and system for analyzing similarity of concept sets

  • US 20080195587A1
  • Filed: 02/13/2007
  • Published: 08/14/2008
  • Est. Priority Date: 02/13/2007
  • Status: Active Grant
First Claim

1. A system comprising:

  • a concept analysis engine including:

    a taxonomy manager configured to obtain a set of one or more taxonomies wherein each of the taxonomies includes one root node and one or more hierarchically ordered paths, wherein each hierarchically ordered path includes the root node and a hierarchically ordered sequence of concept nodes;

    a concept set engine configured to receive a first set of first set concepts and a second set of second set concepts;

    a concept pair engine configured to determine a plurality of concept pairs, wherein each concept pair includes one of the first set concepts and one of the second set concepts;

    a hierarchical path engine configured to determine, for each one of the concept pairs, an associated length of a nondiverging intersection of a first subpath of one of the hierarchically ordered paths from the root node of one of the taxonomies to a first concept node representing the first set concept and a second subpath of one of the hierarchically ordered paths from the root node of the one of the taxonomies to a second concept node representing the second set concept, and an associated length of a first portion of the first subpath from a last concept node included in the nondiverging intersection to the first concept node, and an associated length of a second portion of the second subpath from the last concept node included in the nondiverging intersection to the second concept node;

    a concept similarity engine configured to determine pairwise similarity values associated with each of the concept pairs based on ratios based on associated lengths of nondiverging intersections determined by the hierarchical path engine and the associated lengths of the first and second portions, wherein a pairwise similarity value indicating a high similarity is determined for association with concept pairs associated with nonempty nondiverging intersections including the root node and hierarchically immediate successor nodes of the root node that are included in the first subpath and the second subpath; and

    a concept set similarity engine configured to determine a concept set similarity value based on a weighted sum of the pairwise similarity values associated with optimal selected ones of the concept pairs.
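
For illustration only, the path, pairwise, and set-level steps recited in claim 1 can be sketched in Python. The claim does not fix a formula, so several choices below are assumptions rather than the patented method: lengths are counted in nodes, the pairwise ratio is shared / (shared + rest_a + rest_b), the "optimal selected" pairs are taken to be the best match in the second set for each first-set concept, and the weights default to uniform. The function names and the small animal taxonomy are hypothetical.

```python
def path_lengths(path_a, path_b):
    """Return (shared, rest_a, rest_b) for two root-to-concept paths.

    shared is the length of the nondiverging intersection (the common
    prefix starting at the root); rest_a and rest_b are the lengths of
    the remaining portions down to each concept node.
    Assumption: lengths are counted in nodes.
    """
    shared = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        shared += 1
    return shared, len(path_a) - shared, len(path_b) - shared


def pair_similarity(path_a, path_b):
    """Pairwise similarity as a ratio of the three lengths.

    Assumption: shared / (shared + rest_a + rest_b). The claim only
    says the value is "based on ratios" of these lengths.
    """
    shared, rest_a, rest_b = path_lengths(path_a, path_b)
    total = shared + rest_a + rest_b
    return shared / total if total else 0.0


def set_similarity(first, second, paths, weights=None):
    """Weighted sum of pairwise similarities over selected pairs.

    Assumptions: each first-set concept is paired with its most
    similar second-set concept, and weights are uniform unless given.
    """
    if not first or not second:
        return 0.0
    if weights is None:
        weights = {concept: 1.0 / len(first) for concept in first}
    return sum(
        weights[a] * max(pair_similarity(paths[a], paths[b]) for b in second)
        for a in first
    )


# Hypothetical taxonomy: each concept maps to its path from the root.
paths = {
    "dog": ["animal", "mammal", "dog"],
    "cat": ["animal", "mammal", "cat"],
    "trout": ["animal", "fish", "trout"],
}

print(path_lengths(paths["dog"], paths["cat"]))      # (2, 1, 1)
print(pair_similarity(paths["dog"], paths["cat"]))   # 0.5
print(set_similarity(["dog", "trout"], ["cat"], paths))
```

In this sketch, "dog" and "cat" share the root and its immediate successor "mammal", so they score higher than "dog" and "trout", which share only the root. A global one-to-one assignment of pairs would be another reasonable reading of "optimal selected" and would change only set_similarity.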
