×

System, method, and recording medium for efficient cohesive subgraph identification in entity collections for inlier and outlier detection

  • US 10,282,636 B2
  • Filed: 10/24/2017
  • Issued: 05/07/2019
  • Est. Priority Date: 09/14/2015
  • Status: Active Grant
First Claim
Patent Images

1. A similarity detection system receiving a plurality of input entities, the system comprising:

  • a face detection and verification device configured to extract a face region from the plurality of input entities and to extract shift features from the face region as images, the face detection and verification device excludes items in the input entities that have facial features but are not a real face of a person;

    a cohesive subgraph identification device configured to calculate, based on attributes of the plurality of input entities, a first parameter and a second parameter based on the first parameter, and further configured to identify a plurality of subgraphs from the second parameter; and

    a subgraph correlation tracking and clustering device configured to determine a relationship between different subgraphs based on a similarity factor between the second parameter and the plurality of subgraphs,wherein the subgraph correlation tracking and clustering device is further configured to output outliers that include the subgraphs not combined by the subgraph correlation tracking and clustering device,wherein the subgraph correlation tracking and clustering device uses source nodes and target nodes of nodes in a post network such that on each iteration, each source node propagates edge similarities to a neighborhood of the source node along paths, and, each target node aggregates the similarities that are propagated in from their neighbors, andwherein the subgraph correlation tracking and clustering device measures the correlations between subgraphs as an amount of average path similarities that are propagated and aggregated to detect outliers within the images and based on the correlation between the subgraphs and similarities, the subgraph correlation tracking and clustering device combines subgraphs to output inliers,wherein the first parameter includes k-cores and the second parameter includes (k, d)-cores,wherein the (k, d)-cores include a cohesive internal structure adjusted by at least d common neighbors as witnesses of commonality between the nodes in the post network connected by each edge,wherein d is an integer greater than zero,wherein given the post network as G(V,E) defined as k-cores from all the nodes represented as G(V,E), the cohesive subgraph identification device recursively removes the nodes with a degree less than k from G(V,E), until all the remaining nodes have a degree of at least k with a result being a set of k-cores that form the basis for the (k, d)-cores generation,wherein the cohesive subgraph identification device uses k from the (k, d)-cores to adjust an edge density, and uses d from the (k, d) cores to control a strength of the similarities between the subgraphs generated and the k-cores, thereby the cohesive subgraph identification device generates a maximal (k, d)-core which is a subgraph of a maximal k-core,wherein the cohesive subgraph identification device generates the (k, d)-core by removing n nodes with a degree less than k recursively by only verifying the degree of n nodes,wherein given a maximal path length, the average path similarities aggregated by each of the target nodes are computed by multiplying all the edge similarities along the path, dampened by a dampening factor on each hop.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×