Methods and apparatus for distributed community finding
First Claim
1. A method comprising:
- maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and
analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community,wherein the percolation community finding algorithm comprises;
initiating a percolation message from a source node of the distributed user community, the distributed user community comprising a plurality of nodes and a plurality of edges, each edge connecting at least two of the plurality of nodes, wherein a node is a neighbor if the node is connected to another node in the plurality of nodes by an edge, wherein the percolation message comprises a percolation probability and an identifier of the source node, and wherein initiating a percolation message from the source node comprises transmitting the percolation message to each neighbor of the source node with the percolation probability;
propagating the percolation message through the distributed user community, wherein propagating the percolation message through the distributed user community comprises;
transmitting the percolation message from each node that receives the percolation message to each neighbor of each node that receives the percolation message; and
transmitting a response to the source node from each node that receives the percolation message;
collecting each response to the percolation message at the source node; and
storing a list of nodes that transmitted the response at the source node.
6 Assignments
0 Petitions
Accused Products
Abstract
Methods and apparatus for a new approach to the problem of finding communities in complex networks relating to a social definition of communities and percolation are disclosed. Instead of partitioning the graph into separate subgraphs from top to bottom a local algorithm (communities of each vertex) allows overlapping of communities. The performance of an algorithm on synthetic, randomly-generated graphs and real-world networks is used to benchmark this method against others. An heuristic is provided to generate a list of communities for networks using a local community finding algorithm. Unlike diffusion based algorithms, The provided algorithm finds overlapping communities and provides a means to measure confidence in community structure. It features locality and low complexity for exploring the communities for a subset of network nodes, without the need for exploring the whole graph.
155 Citations
36 Claims
-
1. A method comprising:
-
maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a percolation message from a source node of the distributed user community, the distributed user community comprising a plurality of nodes and a plurality of edges, each edge connecting at least two of the plurality of nodes, wherein a node is a neighbor if the node is connected to another node in the plurality of nodes by an edge, wherein the percolation message comprises a percolation probability and an identifier of the source node, and wherein initiating a percolation message from the source node comprises transmitting the percolation message to each neighbor of the source node with the percolation probability; propagating the percolation message through the distributed user community, wherein propagating the percolation message through the distributed user community comprises; transmitting the percolation message from each node that receives the percolation message to each neighbor of each node that receives the percolation message; and transmitting a response to the source node from each node that receives the percolation message; collecting each response to the percolation message at the source node; and storing a list of nodes that transmitted the response at the source node. - View Dependent Claims (2, 3, 4, 6, 7, 8, 9)
-
-
5. A method comprising:
-
maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a plurality of percolation messages from each of a plurality of source nodes at a plurality of percolation probabilities, wherein the plurality of percolation probabilities are selected from a set of values between 0 and 1; determining a list of nodes that received the plurality of percolation messages for each of the plurality of source nodes; aggregating the list of nodes for each of the plurality of source nodes to determine a local neighborhood for each of the plurality of source nodes, wherein each node is weighted based on a number of times each node received the plurality of percolation messages; plotting the local neighborhood size versus percolation probabilities for each of the local neighborhoods in a plot; locating phase transition points in the plot; and determining nested local communities for the plurality of source nodes by combining the local neighborhoods at the phase transition points.
-
-
10. A non-transitory program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
-
maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a percolation message from a source node of the distributed user community, the distributed user community comprising a plurality of nodes and a plurality of edges, each edge connecting at least two of the plurality of nodes, wherein a node is a neighbor if the node is connected to another node in the plurality of nodes by an edge, wherein the percolation message comprises a percolation probability and an identifier of the source node, and wherein initiating a percolation message from the source node comprises transmitting the percolation message to each neighbor of the source node with the percolation probability; propagating the percolation message through the distributed user community, wherein propagating the percolation message through the distributed user community comprises; transmitting the percolation message from each node that receives the percolation message to each neighbor of each node that receives the percolation message; and transmitting a response to the source node from each node that receives the percolation message; collecting each response to the percolation message at the source node; and storing a list of nodes that transmitted the response at the source node. - View Dependent Claims (11, 12, 13, 15, 16, 17, 18)
-
-
14. A non-transitory program storage device readable by a machine, embodying a program of instructions executable by the machine to perform a method, the method comprising:
-
maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a plurality of percolation messages from each of a plurality of source nodes at a plurality of percolation probabilities, wherein the plurality of percolation probabilities are selected from a set of values between 0 and 1; determining a list of nodes that received the plurality of percolation messages for each of the plurality of source nodes; aggregating the list of nodes for each of the plurality of source nodes to determine a local neighborhood for each of the plurality of source nodes, wherein each node is weighted based on a number of times each node received the plurality of percolation messages; plotting the local neighborhood size versus percolation probabilities for each of the local neighborhoods in a plot; locating phase transition points in the plot; and determining nested local communities for the plurality of source nodes by combining the local neighborhoods at the phase transition points.
-
-
19. An apparatus comprising:
-
memory; one or more computers configured to; maintain a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyze an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a percolation message from a source node of the distributed user community, the distributed user community comprising a plurality of nodes and a plurality of edges, each edge connecting at least two of the plurality of nodes, wherein a node is a neighbor if the node is connected to another node in the plurality of nodes by an edge, wherein the percolation message comprises a percolation probability and an identifier of the source node, and wherein initiating a percolation message from the source node comprises transmitting the percolation message to each neighbor of the source node with the percolation probability; propagating the percolation message through the distributed user community, wherein propagating the percolation message through the distributed user community comprises; transmitting the percolation message from each node that receives the percolation message to each neighbor of each node that receives the percolation message; and transmitting a response to the source node from each node that receives the percolation message; collecting each response to the percolation message at the source node; and storing a list of nodes that transmitted the response at the source node. - View Dependent Claims (20, 21, 22, 24, 25, 26, 27)
-
-
23. An apparatus comprising:
-
memory; one or more computers configured to; maintain a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and analyze an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a plurality of percolation messages from each of a plurality of source nodes at a plurality of percolation probabilities, wherein the plurality of percolation probabilities are selected from a set of values between 0 and 1; determining a list of nodes that received the plurality of percolation messages for each of the plurality of source nodes; aggregating the list of nodes for each of the plurality of source nodes to determine a local neighborhood for each of the plurality of source nodes, wherein each node is weighted based on a number of times each node received the plurality of percolation messages; plotting the local neighborhood size versus percolation probabilities for each of the local neighborhoods in a plot; locating phase transition points in the plot; and determining nested local communities for the plurality of source nodes by combining the local neighborhoods at the phase transition points.
-
-
28. An apparatus comprising:
-
means for maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and means for analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a percolation message from a source node of the distributed user community, the distributed user community comprising a plurality of nodes and a plurality of edges, each edge connecting at least two of the plurality of nodes, wherein a node is a neighbor if the node is connected to another node in the plurality of nodes by an edge, wherein the percolation message comprises a percolation probability and an identifier of the source node, and wherein initiating a percolation message from the source node comprises transmitting the percolation message to each neighbor of the source node with the percolation probability; propagating the percolation message through the distributed user community, wherein the means for propagating the percolation message through the distributed user community comprises; transmitting the percolation message from each node that receives the percolation message to each neighbor of each node that receives the percolation message; and transmitting a response to the source node from each node that receives the percolation message; collecting each response to the percolation message at the source node; and storing a list of nodes that transmitted the response at the source node. - View Dependent Claims (29, 30, 31, 33, 34, 35, 36)
-
-
32. An apparatus comprising:
-
means for maintaining a representation of a distributed user community stored in a database, the distributed user community comprising a plurality of overlapping user communities, wherein each user community comprises a contact list and contact addresses corresponding to the contact list; and means for analyzing an email addressed to a user from a source by applying a percolation community finding algorithm to the distributed user community, wherein the percolation community finding algorithm comprises; initiating a plurality of percolation messages from each of a plurality of source nodes at a plurality of percolation probabilities, wherein the plurality of percolation probabilities are selected from a set of values between 0 and 1; determining a list of nodes that received the plurality of percolation messages for each of the plurality of source nodes; aggregating the list of nodes for each of the plurality of source nodes to determine a local neighborhood for each of the plurality of source nodes, wherein each node is weighted based on a number of times each node received the plurality of percolation messages; plotting the local neighborhood size versus percolation probabilities for each of the local neighborhoods in a plot; locating phase transition points in the plot; and determining nested local communities for the plurality of source nodes by combining the local neighborhoods at the phase transition points.
-
Specification