System and method for supervised network clustering
First Claim
Patent Images
1. An apparatus, comprising:
- a processor; and
a memory device, said memory device storing therein a set of machine-readable instructions permitting said processor to execute a method of supervised network clustering, said method comprising;
receiving and reading node labels from a plurality of nodes on a network, as executed by a processor on a computer having access to said network, said node labels comprising a set of keywords describing each node;
using said node labels to calculate a density for each node by using a random walk process as a measure of probability of traversing from said node to adjacent nodes in said network, to define densities associated with said nodes; and
iteratively extracting node cluster components from the network as network clusters, based on using thresholds on node densities, wherein a threshold for an iteration comprises an average density over all remaining nodes,wherein said threshold is user-defined and smaller-cluster components having a size less than a user-defined threshold value are merged with larger cluster components, wherein each smaller cluster component is merged to a cluster component with which it has a largest number of connections, the merging thereby reducing a noise in the network clustering and adapting the network clustering to local densities in said network.
1 Assignment
0 Petitions
Accused Products
Abstract
A method (and system) for supervised network clustering includes receiving and reading node labels from a plurality of nodes on a network, as executed by a processor on a computer having access to the network, the network defined as a group of entities interconnected by links. The node labels are used to define densities associated with the nodes. Node components are extracted from the network, based on using thresholds on densities. Smaller components having a size below a user-defined threshold are merged.
28 Citations
13 Claims
-
1. An apparatus, comprising:
-
a processor; and a memory device, said memory device storing therein a set of machine-readable instructions permitting said processor to execute a method of supervised network clustering, said method comprising; receiving and reading node labels from a plurality of nodes on a network, as executed by a processor on a computer having access to said network, said node labels comprising a set of keywords describing each node; using said node labels to calculate a density for each node by using a random walk process as a measure of probability of traversing from said node to adjacent nodes in said network, to define densities associated with said nodes; and iteratively extracting node cluster components from the network as network clusters, based on using thresholds on node densities, wherein a threshold for an iteration comprises an average density over all remaining nodes, wherein said threshold is user-defined and smaller-cluster components having a size less than a user-defined threshold value are merged with larger cluster components, wherein each smaller cluster component is merged to a cluster component with which it has a largest number of connections, the merging thereby reducing a noise in the network clustering and adapting the network clustering to local densities in said network. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
-
-
9. A server comprising:
-
an input port to receive information concerning nodes on a network; and a processor, wherein said processor; receives, via said input port, and reads node labels from a plurality of nodes on the network, said network defined by a group of entities connected by links, said node labels comprising a set of keywords describing each node; calculates a random-walk-based probability for each said node on said network, to define densities associated with said nodes; and defines clusters of nodes in said network based on said densities, wherein said clusters are extracted based on using a threshold value of said densities, wherein said threshold value initially comprises an average density of said network, wherein each node in said network is placed into one of said clusters, and wherein said cluster extraction comprises an iterative process, wherein said threshold value for each said iteration comprises an average density of remaining nodes across said network during that iteration; merging smaller clusters with sizes below a user-defined threshold value, the merging thereby reducing a noise in the network clustering and adapting the network clustering to local densities in said network. - View Dependent Claims (10)
-
-
11. A computer, comprising:
-
A processor having access to a network, said network defined by a group of entities interconnected by links; and a memory device, said memory device storing a set of computer-readable instructions for said processor to execute an iterative method of clustering of said entities, said method comprising; calculating a density associated with each said entity on said network, as executed by said processor; and defining clusters of entities in said network based on said densities, wherein further defining comprise a cluster extraction based on using a threshold value of said densities, wherein said threshold value initially comprises an average density of said network, wherein said densities are calculated as a random-walk-based probability for each said node on said network, corresponding to a PageRank computation on node labels of said nodes, said node labels comprising a set of keywords describing each node, and wherein said defining of clusters comprises an iterative process until all nodes of said plurality of nodes are included in a cluster of a predefined minimum cluster size, wherein said threshold value for each said iteration comprises an average density of remaining nodes across said network during that iteration; merging smaller clusters with sizes below a user-defined threshold value, the merging thereby reducing a noise in the network clustering and adapting the network clustering to local densities in said network. - View Dependent Claims (12, 13)
-
Specification