Method for identifying network similarity by matching neighborhood topology
First Claim
1. A method of identifying a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the method comprising:
- for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where U is a neighbor of i and v is a neighbor of j, computing a similarity score Rij equal to a support value provided to the node pair (i, j) by |N(i)∥
N(j)| possible matches between neighbors of i and j, where each neighboring node pair (u,v) distributes back its score Ruv among |N(u)∥
N(v)| possible matches between neighbors of u and v;
using the similarity scores to construct the common subgraph.
0 Assignments
0 Petitions
Accused Products
Abstract
A method of computing a measure of similarity between nodes of first and second networks is described. In particular, sets of pairwise scores are computed to find nodes in the individual networks that are good matches to one another. Thus, a pairwise score, referred to as Rij, is computed for a node i in the first network and a node j in the second network. Similar pairwise scores are computed for each of the nodes in each network. The goal of this process is to identify node pairs that exhibit high Rij values. According to the technique described herein, the intuition is that nodes i and j are a good match if their neighbors are a good match. This technique produces a measure of “network similarity.” If node feature data also is available, the intuition may be expanded such that nodes i and j are considered a good match if their neighbors are a good match (network similarity) and their node features are a good match (node similarity). Node feature data typically is domain-specific. Using the similarity scores, a common subgraph between the first and second networks then can be computed.
-
Citations
19 Claims
-
1. A method of identifying a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the method comprising:
-
for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where U is a neighbor of i and v is a neighbor of j, computing a similarity score Rij equal to a support value provided to the node pair (i, j) by |N(i)∥
N(j)| possible matches between neighbors of i and j, where each neighboring node pair (u,v) distributes back its score Ruv among |N(u)∥
N(v)| possible matches between neighbors of u and v;using the similarity scores to construct the common subgraph. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A method of identifying a graph that is substantially isomorphic to subgraphs of first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1 the second network G2 of nodes V2 and a set of edges E2 where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the method comprising:
-
requiring a set of constraints Ri,j to hold for all possible pairs of (i, j), where i is a node from the first network and j is a node from the second network, and where U is a neighbor of i and v is a neighbor of j, where;
Rij=Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/(Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v))]Ruvwhere iε
V1 and jε
V2;computing a vector R by solving the constraints; and extracting node mappings from vector R to identify the graph. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A method of identifying a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), comprising:
-
for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where U is a neighbor of i and v is a neighbor of j, computing a similarity score Ri,j according to the following equation;
Rij=Σ
uε
N(i)Σ
vε
N(j)[1/(|N(u)∥
N(v)|)]Ruv where iε
V1 and jε
V2; andusing the similarity scores to identify the common subgraph. - View Dependent Claims (12, 13, 14, 15, 16, 17)
-
-
18. A computer-implemented method of matching first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 of nodes V2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e having an edge weight w(e), comprising:
-
establishing a set of constraints as a convex combination of a set of network similarity scores and a set of node similarity scores, wherein the set of constraints conform to the following equation;
Rij=α
(Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v)]Ruv)+(1−
α
)Bijwhere B is a set of node similarity scores between the nodes of the first and second networks, with score scaled by a uniform multiple such that Σ
Bij=1, where iε
V1 and jε
V2 and where 0<
α
≦
1;computing a vector R by identifying a principal eigenvector of a matrix A[i, j][u,v], where A[i, j][u,v]=(α
Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v)]+(1−
α
) Bij), if (i,u)ε
E1 and (j,v)/Σ
ε
E2, and 0 otherwise, where A is a |V1∥
V2|×
|V∥
V2| matrix and A[i, j] [u,v] is an entry at row (i,j) and column (u,v); andextracting node mappings from vector R to match the first and second networks.
-
-
19. A computer-implemented method of identifying a measure of similarity between nodes of first and second networks, comprising:
-
identifying a node pair (i, j), where i is a node from the first network and j is a node from the second network; and computing a network similarity score for the node pair to be equal to a total support provided by all supporting node pairs adjacent to the node pair, each supporting node pair providing its support in proportion to a number of supporting node pairs it has to support.
-
Specification