
Efficient storage and retrieval of posting lists

  • US 8,229,970 B2
  • Filed: 08/29/2008
  • Issued: 07/24/2012
  • Est. Priority Date: 08/31/2007
  • Status: Active Grant
First Claim

1. A computer-implemented method for storing and retrieving one or more posting lists from a physical storage medium, the method comprising performing computer-implemented operations for:

  • storing a hierarchy of semantic roles comprising a role tree having nodes related to one another in a strict dominance hierarchy, each of the nodes in the role tree corresponding to a semantic role and being associated with one or more terms;

    generating a posting list for each association of a term and a semantic role in the hierarchy, each posting list comprising data identifying one or more documents that include the usage of the term in the associated semantic role;

    storing the posting lists contiguously on a physical storage medium by performing a pre-order depth-first traversal of the nodes of the role tree, and at each of the nodes, writing the posting list for a term associated with the semantic role corresponding to the node to the physical storage medium;

    storing data in a lexicon indicating a starting position for the posting list for each association of a term and a semantic role in the hierarchy and data indicating a total size of the posting lists under each node of the role tree;

    retrieving the data from the lexicon identifying the starting position on the physical storage medium for a term at a top of a desired subtree of the hierarchy;

    retrieving the data from the lexicon identifying the total length of posting lists of the desired subtree of the hierarchy; and

    loading a single contiguous block from the starting position on the physical storage medium through the length, the single contiguous block including the posting lists for the desired subtree of the hierarchy.
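
The storage and retrieval steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the role names, the `RoleNode` class, the helper functions, the 4-byte document-ID encoding, and the use of an in-memory `io.BytesIO` buffer in place of a physical storage medium are all assumptions made for illustration. The sketch handles the posting lists of a single term; a full lexicon would presumably be keyed by term as well as by role.

```python
import io
import struct
from dataclasses import dataclass, field


@dataclass
class RoleNode:
    """One node of the role tree (hypothetical structure)."""
    role: str
    children: list = field(default_factory=list)


def encode(doc_ids):
    # Assumed encoding: each document ID is a 4-byte big-endian unsigned int.
    return struct.pack(f">{len(doc_ids)}I", *doc_ids)


def decode(block):
    return list(struct.unpack(f">{len(block) // 4}I", block))


def write_postings(root, postings, out):
    """Write posting lists in pre-order depth-first order.

    Returns a lexicon mapping role -> (start offset, total subtree length).
    """
    lexicon = {}

    def visit(node):
        start = out.tell()
        # Pre-order: the node's own posting list comes first...
        out.write(encode(postings.get(node.role, [])))
        # ...followed by the posting lists of all of its descendants.
        for child in node.children:
            visit(child)
        lexicon[node.role] = (start, out.tell() - start)

    visit(root)
    return lexicon


def load_subtree(role, lexicon, src):
    """Load the whole subtree rooted at `role` with one seek and one read."""
    start, length = lexicon[role]
    src.seek(start)
    return src.read(length)


# Hypothetical role hierarchy and posting lists for one term.
tree = RoleNode("any", [
    RoleNode("subject"),
    RoleNode("object", [
        RoleNode("direct_object"),
        RoleNode("indirect_object"),
    ]),
])
postings = {
    "any": [1],
    "subject": [2, 3],
    "object": [4],
    "direct_object": [5, 6],
    "indirect_object": [7],
}

medium = io.BytesIO()
lexicon = write_postings(tree, postings, medium)
print(decode(load_subtree("object", lexicon, medium)))  # [4, 5, 6, 7]
```

Because a pre-order traversal writes every node immediately before its descendants, each subtree occupies one unbroken byte range. In this sketch, a query on the "object" role therefore needs only the starting offset and the total subtree length from the lexicon to fetch the posting lists for "object", "direct_object", and "indirect_object" in a single read.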
