×

Web graph compression through scalable pattern mining

  • US 7,912,818 B2
  • Filed: 09/13/2010
  • Issued: 03/22/2011
  • Est. Priority Date: 01/29/2008
  • Status: Expired due to Fees
First Claim
Patent Images

1. A machine-implemented method for compressing a graph including a plurality of nodes and a plurality of links between ones of the plurality of nodes, the machine-implemented method comprising:

  • clustering similar nodes of the plurality nodes;

    scanning a node list, corresponding to the clustered similar nodes of the plurality of nodes to determine frequencies of occurrence of links of the clustered similar nodes;

    sorting a list of the links of the clustered similar nodes based on the determined frequencies of occurrence of the links of the clustered similar nodes to produce a sorted link list;

    adding each of the links, having a frequency of occurrence greater than a predetermined number, from the sorted link list as an item of a prefix tree, along with an associated node list;

    walking the prefix tree to produce a list of candidate virtual nodes;

    determining a respective compression performance for each pattern corresponding to a respective candidate virtual node according to a compression performance formula;

    sorting the list of candidate virtual nodes based on results of applying the compression performance formula;

    selecting one of the list of candidate virtual nodes as a virtual node based on one of the patterns having a highest determined compression performance;

    creating the virtual node; and

    substituting links to the virtual node for links corresponding to the one of the patterns, whereinthe machine-implemented method is implemented either on one processing device or on a networked plurality of processing devices.

View all claims
  • 2 Assignments
Timeline View
Assignment View
    ×
    ×