×

Social identity clustering

  • US 8,626,835 B1
  • Filed: 10/21/2010
  • Issued: 01/07/2014
  • Est. Priority Date: 10/21/2010
  • Status: Active Grant
First Claim
Patent Images

1. A computer-implemented method executed using one or more processors, the method comprising:

  • receiving, by the one or more processors, a connection graph representing public social data comprising identity data and social link data, wherein the connection graph comprises nodes and directed edges, each edge connecting a pair of nodes, wherein each node represents an identity extracted from the public social data and each edge represents a social link extracted from the public social data from one identity to another different identity, wherein each node represents a distinct identity, and wherein each edge is a “

    me”

    edge if the edge represents a “

    me”

    social link between identities claimed by a link author as belonging to the same person or real-life entity, and a “

    friend”

    edge represents a “

    friend”

    social link between identities claimed by a link author as belonging to different persons or entities;

    converting, at the one or more processors, the connection graph to a cluster graph, the cluster graph comprising clusters and cluster edges, each edge connecting a pair of clusters, each cluster corresponding to one or more nodes of the connection graph, and each edge of the cluster graph corresponding to one or more edges of the connection graph, wherein each cluster of the cluster graph initially corresponds to a single node of the connection graph and each edge of the cluster graph initially corresponds to a single edge of the connection graph; and

    clustering, at the one or more processors, identities by iteratively performing the following actions (i) and (ii) until no pair of clusters exists in the cluster graph that satisfies the threshold fraction condition of action (ii);

    (i) identifying all outgoing “

    me”

    edges from a first cluster and all “

    me”

    edges to a second cluster, wherein each “

    me”

    edge has a respective weight; and

    (ii) determining that the weights of all “

    me”

    edges from the first cluster to the second cluster are more than a threshold fraction of the weights of all outgoing “

    me”

    edges from the first cluster and as a consequence merging the first cluster and the second cluster to form a new cluster replacing the first and second clusters, wherein the edges of the new cluster node are aggregations of the edges of the former first cluster and the edges of the former second cluster, and wherein the new cluster represents the identities of the former first cluster and the identities of the former second cluster.

View all claims
  • 2 Assignments
Timeline View
Assignment View
    ×
    ×