Node clustering

US 8,572,239 B2
Filed: 09/20/2010
Issued: 10/29/2013
Est. Priority Date: 09/20/2010
Status: Active Grant

Abstract

Large sets of unorganized data may provide little value in identifying useful observations from such data. For example, an online merchant may maintain a database of millions of user IDs (e.g., a cookie ID, a login ID, a device ID, a network ID, etc.) along with content viewed and/or actions taken with the user IDs, where minimal associations are known between user IDs. It may be advantageous to link together user IDs of respective users to capture a comprehensive view of respective users' activities. Accordingly, one or more systems and/or techniques for identifying a cluster of nodes based upon transforming a set of node pairings (e.g., pairings of related nodes) one or more times are disclosed herein. Iterative transformations may be performed until respective nodes are paired with merely their smallest neighboring node and are paired with no other node. In this way, node clusters may be identifiable.
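The iterative transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `cluster_nodes`, the sequential sweep order, and the use of the node identifiers themselves as the compared values are assumptions made for illustration. The guard that skips a node whose smallest neighbor is not smaller than the node itself follows the condition recited in dependent claim 4.

```python
from collections import defaultdict


def cluster_nodes(pairings):
    """Rewire node pairings until every cluster is a star around its smallest node.

    `pairings` is an iterable of (node, node) tuples; nodes must be mutually
    comparable (hypothetical simplification: the node ID is the compared value).
    Returns a dict mapping each base node to the set of nodes in its cluster.
    """
    adj = defaultdict(set)
    for a, b in pairings:
        if a != b:  # a self-pairing carries no linkage information, so it is dropped
            adj[a].add(b)
            adj[b].add(a)

    changed = True
    while changed:
        changed = False
        for current in sorted(adj):
            neighbors = adj[current]
            if len(neighbors) < 2:
                continue
            reference = min(neighbors)  # the smallest neighbor is the reference node
            if reference >= current:
                continue  # refrain from rewiring, as in claim 4
            for other in sorted(neighbors - {reference}):
                # Disconnect `other` from the current node ...
                adj[current].discard(other)
                adj[other].discard(current)
                # ... and connect it to the reference node instead.
                adj[reference].add(other)
                adj[other].add(reference)
                changed = True

    clusters = defaultdict(set)
    for node, neighbors in adj.items():
        base = min(neighbors | {node})
        clusters[base].add(node)
    return dict(clusters)


pairs = [(4, 1), (4, 2), (2, 3), (7, 8), (9, 8)]
print(cluster_nodes(pairs))  # two clusters, with base nodes 1 and 7
```

Each rewiring replaces one endpoint of an edge with a strictly smaller node, so the loop terminates, and the edge between the current node and its reference is kept, so no cluster is ever split.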

20 Claims

1. A method, implemented at least in part via a processing unit, for identifying one or more node clusters, comprising:
- receiving a set of node pairings corresponding to a plurality of nodes, a node pairing comprising a first node connected to a second node; and
  
  transforming the set of node pairings one or more times until one or more node clusters are identified from the plurality of nodes, where nodes within a node cluster are paired with a base node within the plurality of nodes by a single connection and are connected to no other nodes, the transforming comprising:
  
  determining a current node, within the plurality of nodes, that is paired with two or more neighboring nodes, the two or more neighboring nodes comprising a first neighboring node and a second neighboring node;
  
  comparing a first value comprised in the first neighboring node to a second value comprised in the second neighboring node;
  
  determining that the first value is smaller than the second value based upon the comparing;
  
  determining that the first neighboring node is a reference node based upon the determination that the first value is smaller than the second value; and
  
  based upon the determination that the first value is smaller than the second value:
  
  disconnecting the second neighboring node, but not the first neighboring node, which is the reference node, from the current node; and
  
  connecting the second neighboring node to the first neighboring node, the current node, the first neighboring node and the second neighboring node representing a common type, the common type comprising at least one of:
  
  a user ID;
  
  a login ID;
  
  a cookie ID;
  
  a mobile phone ID;
  
  or an IP address.
- - 2. The method of claim 1, comprising:
    - assigning hash values to respective nodes within the plurality of nodes.
  - 3. The method of claim 2, the first neighboring node comprising a hash value smaller than a hash value of the current node.
  - 4. The method of claim 2, the disconnecting comprising:
    - if a hash value of the first neighboring node is larger than or equal to a hash value of the current node, then refraining from disconnecting the second neighboring node from the current node and refraining from connecting the second neighboring node to the first neighboring node.
  - 5. The method of claim 2, the base node of the node cluster comprising a smaller hash value than respective hash values of other nodes within the node cluster.
  - 6. The method of claim 1, the node pairing comprising:
    - a third node corresponding to a user ID and a fourth node corresponding to an IP address, or a fifth node corresponding to an IP address and a sixth node corresponding to a user ID.
  - 7. The method of claim 1, the transforming comprising:
    - concurrently operating on a plurality of current nodes in parallel.
  - 8. The method of claim 1, nodes within the plurality of nodes representing individuals, and connections between nodes representing relationships.
  - 9. The method of claim 1, comprising:
    - identifying a second node cluster from the plurality of nodes, where nodes within the second node cluster are paired with a second base node within the plurality of nodes by a single connection and are connected to no other nodes.
  - 10. The method of claim 1, a node within the node cluster comprising a symmetric and transitive relationship with respective nodes within the node cluster.
  - 11. The method of claim 1, comprising:
    - referencing the node cluster based upon the base node.
  - 12. The method of claim 1, the node cluster corresponding to an individual and nodes within the node cluster representing descriptive data corresponding to the individual.
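Dependent claims 2 through 5 compare assigned hash values rather than raw identifiers. The sketch below shows one way this could look, under stated assumptions: the helper names `hash_value`, `transform_step`, and `cluster_by_hash` are hypothetical, SHA-256 is an arbitrary choice of hash function (the claims do not name one), and the sample IDs are invented for illustration.

```python
import hashlib
from collections import defaultdict


def hash_value(node_id):
    """Hypothetical hash assignment: SHA-256 of the ID text, read as an integer."""
    digest = hashlib.sha256(str(node_id).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def transform_step(adj, current, value):
    """Apply one rewiring step at `current`; return True if any edge moved."""
    neighbors = adj[current]
    if len(neighbors) < 2:
        return False
    reference = min(neighbors, key=value)
    if value(reference) >= value(current):
        return False  # claim 4: refrain from disconnecting and reconnecting
    for other in list(neighbors - {reference}):
        adj[current].discard(other)
        adj[other].discard(current)
        adj[reference].add(other)
        adj[other].add(reference)
    return True


def cluster_by_hash(pairings):
    """Repeat transformation sweeps until no step moves an edge."""
    adj = defaultdict(set)
    for a, b in pairings:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    # The list comprehension runs a full sweep before `any` inspects the results.
    while any([transform_step(adj, node, hash_value) for node in list(adj)]):
        pass
    clusters = defaultdict(set)
    for node, neighbors in adj.items():
        base = min(neighbors | {node}, key=hash_value)
        clusters[base].add(node)
    return dict(clusters)


pairs = [
    ("user:alice", "ip:192.0.2.1"),
    ("ip:192.0.2.1", "cookie:abc"),
    ("cookie:abc", "login:al"),
    ("user:bob", "cookie:xyz"),
]
for base, members in cluster_by_hash(pairs).items():
    print(base, sorted(members))
```

In each resulting cluster the base node is the member with the smallest hash value, which matches claim 5, and the cluster can be referenced by that base node as in claim 11. The sweep here is sequential; the concurrent operation of claim 7 is not modeled.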

13. A system for identifying one or more node clusters, comprising:
- a transformation module configured to:
  
  receive a set of node pairings corresponding to a plurality of nodes, a node pairing comprising a first node connected to a second node;
  
  transform the set of node pairings one or more times until one or more node clusters are identified from the plurality of nodes, where nodes within a node cluster are paired with a base node within the plurality of nodes by a single connection and are connected to no other nodes, a transformation comprising:
  
  determining a current node, within the plurality of nodes, that is paired with two or more neighboring nodes, the two or more neighboring nodes comprising a first neighboring node and a second neighboring node;
  
  comparing a first value comprised in the first neighboring node to a second value comprised in the second neighboring node;
  
  determining that the first value is smaller than the second value based upon the comparing;
  
  determining that the first neighboring node is a reference node based upon the determination that the first value is smaller than the second value; and
  
  based upon the determination that the first value is smaller than the second value:
  
  disconnecting the second neighboring node, but not the first neighboring node, which is the reference node, from the current node; and
  
  connecting the second neighboring node to the first neighboring node, the current node, the first neighboring node and the second neighboring node representing a common type, the common type comprising at least one of:
  
  a user ID;
  
  a login ID;
  
  a cookie ID;
  
  a mobile phone ID;
  
  or an IP address.
- - 14. The system of claim 13, comprising:
    - a hashing module configured to:
      
      assign hash values to respective nodes within the plurality of nodes.
  - 15. The system of claim 14, the first neighboring node comprising a hash value smaller than a hash value of the current node.
  - 16. The system of claim 14, the transformation module configured to:
    - refrain from disconnecting the second neighboring node from the current node and refrain from connecting the second neighboring node to the first neighboring node if a hash value of the first neighboring node is larger than or equal to a hash value of the current node.
  - 17. The system of claim 14, the base node comprising a hash value smaller than respective hash values of other nodes within the node cluster.
  - 18. The system of claim 13, the transformation module configured to:
    - concurrently operate on a plurality of current nodes in parallel.
  - 19. The system of claim 13, the transformation module configured to:
    - identify a second node cluster from the plurality of nodes, where nodes within the second node cluster are paired with a second base node within the plurality of nodes by a single connection and are connected to no other nodes.

20. A computer readable storage device comprising instructions that when executed, perform a method for identifying a cluster of nodes, comprising:
- receiving a set of node pairings corresponding to a plurality of nodes, a node pairing comprising a first node connected to a second node; and
  
  transforming the set of node pairings one or more times until one or more node clusters are identified from the plurality of nodes, where nodes within a node cluster are paired with a base node within the plurality of nodes by a single connection and are connected to no other nodes, the transforming comprising:
  
  determining a current node, within the plurality of nodes, that is paired with two or more neighboring nodes, the two or more neighboring nodes comprising a first neighboring node and a second neighboring node;
  
  comparing a first value comprised in the first neighboring node to a second value comprised in the second neighboring node;
  
  determining that the first value is smaller than the second value based upon the comparing;
  
  determining that the first neighboring node is a reference node based upon the determination that the first value is smaller than the second value; and
  
  based upon the determination that the first value is smaller than the second value:
  
  disconnecting the second neighboring node, but not the first neighboring node, which is the reference node, from the current node; and
  
  connecting the second neighboring node to the first neighboring node, the current node, the first neighboring node and the second neighboring node representing a common type, the common type comprising at least one of:
  
  a user ID;
  
  a login ID;
  
  a cookie ID;
  
  a mobile phone ID;
  
  or an IP address.

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Cao, Fei; Zhou, Shaoyu; Zhang, Sijian; Roy, Siddhartha; Elizarov, Michael A.; Wu, Zhuoqing Jr.
Primary Examiner(s)
Phillips, Hassan
Assistant Examiner(s)
RAHGOZAR, OMEED DANIEL

Application Number
US12/885,897
Publication Number
US 20120072554A1
Time in Patent Office
1,135 Days
Field of Search
None
US Class Current
709/224
CPC Class Codes
G06Q 30/02 Marketing; Price estimation...
G06Q 30/0201 Market modelling; Market an...
