Method for identifying network similarity by matching neighborhood topology
First Claim
1. A non-transitory computer-readable storage medium storing a computer readable program of computer instructions, wherein the computer readable program when executed on a computer causes the computer to carry out operations to identify a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the operations comprising:
- for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where u is a neighbor of i and v is a neighbor of j, computing a similarity score Ri,j equal to a support value provided to the node pair (i, j) by |N(i)∥
N(j)| possible matches between neighbors of i and j, where each neighboring node pair (u,v) distributes back its score Ruv among |N(u)∥
N(v)| possible matches between neighbors of u and v;
using the similarity scores to construct the common subgraph.
0 Assignments
0 Petitions
Accused Products
Abstract
A method of computing a measure of similarity between nodes of first and second networks is described. In particular, sets of pairwise scores are computed to find nodes in the individual networks that are good matches to one another. Thus, a pairwise score, referred to as Rij, is computed for a node i in the first network and a node j in the second network. Similar pairwise scores are computed for each of the nodes in each network. The goal of this process is to identify node pairs that exhibit high Rij values. According to the technique described herein, the intuition is that nodes i and j are a good match if their neighbors are a good match. This technique produces a measure of “network similarity.” If node feature data also is available, the intuition may be expanded such that nodes i and j are considered a good match if their neighbors are a good match (network similarity) and their node features are a good match (node similarity). Node feature data typically is domain-specific. Using the similarity scores, a common subgraph between the first and second networks then can be computed.
-
Citations
21 Claims
-
1. A non-transitory computer-readable storage medium storing a computer readable program of computer instructions, wherein the computer readable program when executed on a computer causes the computer to carry out operations to identify a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the operations comprising:
-
for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where u is a neighbor of i and v is a neighbor of j, computing a similarity score Ri,j equal to a support value provided to the node pair (i, j) by |N(i)∥
N(j)| possible matches between neighbors of i and j, where each neighboring node pair (u,v) distributes back its score Ruv among |N(u)∥
N(v)| possible matches between neighbors of u and v;using the similarity scores to construct the common subgraph. - View Dependent Claims (2, 3, 4, 5)
-
-
6. An article comprising a non-transitory tangible machine-readable medium that stores a program, the program being executable by a machine to perform a method of identifying a graph that is substantially isomorphic to subgraphs of first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the method comprising:
-
requiring a set of constraints Ri,j to hold for all possible pairs of (i, j), where i is a node from the first network and j is a node from the second network, and where u is a neighbor of i and v is a neighbor of j, where;
Rij=Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/(Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v))]Ruvwhere iε
V1 and jε
V2;computing a vector R by solving the constraints; and extracting node mappings from vector R to identify the graph. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A computer program product comprising a non-transitory tangible machine-readable medium that stores a program, the program being executable by a machine to perform a method of identifying a common subgraph between first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 having a set of nodes V2 and a set of edges E2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e of a network having an edge weight w(e), the method comprising:
-
for each of a set of node pairs (i, j), where i is a node from the first network and j is a node from the second network, and where u is a neighbor of i and v is a neighbor of j, computing a similarity score Ri,j according to the following equation;
Rij=Σ
uε
N(i)Σ
vε
N(j)[1/(|N(u)∥
N(v)|)]Ruvwhere iε
V1 and jε
V2; andusing the similarity scores to identify the common subgraph. - View Dependent Claims (12, 13, 14, 15, 16, 17)
-
-
18. A non-transitory computer-readable storage medium storing a computer readable program of computer instructions, wherein the computer readable program when executed on a computer causes the computer to carry out operations to match first and second networks, the first network G1 having a set of nodes V1 and a set of edges E1, the second network G2 of nodes V2, where N(a) is a set of neighbors of a given node a, the set being of size |N(a)|, with each edge e having an edge weight w(e), the operations comprising:
-
establishing a set of constraints as a convex combination of a set of network similarity scores and a set of node similarity scores, wherein the set of constraints conform to the following equation;
Rij=α
(Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v)]Ruv)+(1−
α
)Bijwhere B is a set of node similarity scores between the nodes of the first and second networks, with score scaled by a uniform multiple such that Σ
Bij=1, where iε
V1 and jε
V2 and where 0<
α
≦
1;computing a vector R by identifying a principal eigenvector of a matrix A[i,j][u,v], where A[i,j][u,v]=(α
Σ
uε
N(i)Σ
vε
N(j)[w(i,u)w(j,v)/Σ
rε
N(i)w(r,u)Σ
qε
N(j)w(q,v)]+(1−
α
)Bij), if (i,u)ε
E1 and (j,v)ε
E2, and 0 otherwise, where A is a |V1∥
V2|×
|V1∥
V2| matrix and A[i,j][u,v] is an entry at row (i,j) and column (u,v); andextracting node mappings from vector R to match the first and second networks.
-
-
19. A computer program product comprising a non-transitory tangible machine-readable medium that stores a program, the program being executable by a machine to perform a method of identifying a measure of similarity between nodes of first and second networks, the method comprising:
-
identifying a node pair (i, j), where i is a node from the first network and j is a node from the second network, and where u is a neighbor of i and v is a neighbor of j; and computing a network similarity score Ri,j for the node pair to be equal to a total support provided by all supporting node pairs adjacent to the node pair, each supporting node pair providing its support in proportion to a number of supporting node pairs it has to support; wherein Rij=Σ
uε
N(i)Σ
vε
N(i)[1/(|N(u)∥
N(v)|)] Ruv where iε
V1 and jε
V2.
-
-
20. A computer program product comprising a non-transitory tangible machine-readable medium that stores a program, the program being executable by a machine to perform a method of identifying a measure of similarity among nodes of multiple networks, the method comprising:
-
for each pair of networks, generating pairwise alignment data by; identifying a node pair (i, j), where i is a node from a first network and j is a node from a second network, and where u is a neighbor of i and v is a neighbor of j; and computing a network similarity score Ri,j for the node pair to be equal to a total support provided by all supporting node pairs adjacent to the node pair, each supporting node pair providing its support in proportion to a number of supporting node pairs it has to support; wherein Rij=Σ
uε
N(i)Σ
vε
N(i)[1/(|N(u)∥
N(v)|)] Ruv where iε
V1 and jε
V2; andapplying an algorithm to the pairwise alignment data to find alignment data for the multiple networks. - View Dependent Claims (21)
-
Specification