Efficient storage and retrieval of posting lists
First Claim
1. A computer-implemented method for storing and retrieving one or more posting lists from a physical storage medium, the method comprising performing computer-implemented operations for:
- storing a hierarchy of semantic roles comprising a role tree having nodes related to one another in a strict dominance hierarchy, each of the nodes in the role tree corresponding to a semantic role and being associated with one or more terms;
generating a posting list for each association of a term and a semantic role in the hierarchy, each posting list comprising data identifying one or more documents that include the usage of the term in the associated semantic role;
storing the posting lists contiguously on a physical storage medium by performing a pre-order depth-first traversal of the nodes of the role tree, and at each of the nodes, writing the posting list for a term associated with the semantic role corresponding to the node to the physical storage medium;
storing data in a lexicon indicating a starting position for the posting list for each association of a term and a semantic role in the hierarchy and data indicating a total size of the posting lists under each node of the role tree;
retrieving the data from the lexicon identifying the starting position on the physical storage medium for a term at a top of a desired subtree of the hierarchy;
retrieving the data from the lexicon identifying the total length of posting lists of the desired subtree of the hierarchy; and
loading a single contiguous block from the starting position on the physical storage medium through the length, the single contiguous block including the posting lists for the desired subtree of the hierarchy.
4 Assignments
0 Petitions
Accused Products
Abstract
A role tree having nodes corresponding to semantic roles in a hierarchy is defined. A posting list is generated for each association of a term and a semantic role in the hierarchy. The posting lists are stored contiguously on a physical storage medium such that a subtree of the hierarchy of semantic roles can be loaded from the storage medium as a single contiguous block. The posting lists for a subtree of the hierarchy are retrieved by obtaining data identifying the beginning location on the physical storage medium of the posting lists for the term at the top of a desired subtree of the hierarchy and data identifying the length of the posting lists of the desired subtree of the hierarchy. A single contiguous block that includes the posting lists for the desired subtree of the hierarchy is then retrieved from the beginning location through the specified length.
-
Citations
4 Claims
-
1. A computer-implemented method for storing and retrieving one or more posting lists from a physical storage medium, the method comprising performing computer-implemented operations for:
-
storing a hierarchy of semantic roles comprising a role tree having nodes related to one another in a strict dominance hierarchy, each of the nodes in the role tree corresponding to a semantic role and being associated with one or more terms; generating a posting list for each association of a term and a semantic role in the hierarchy, each posting list comprising data identifying one or more documents that include the usage of the term in the associated semantic role; storing the posting lists contiguously on a physical storage medium by performing a pre-order depth-first traversal of the nodes of the role tree, and at each of the nodes, writing the posting list for a term associated with the semantic role corresponding to the node to the physical storage medium; storing data in a lexicon indicating a starting position for the posting list for each association of a term and a semantic role in the hierarchy and data indicating a total size of the posting lists under each node of the role tree; retrieving the data from the lexicon identifying the starting position on the physical storage medium for a term at a top of a desired subtree of the hierarchy; retrieving the data from the lexicon identifying the total length of posting lists of the desired subtree of the hierarchy; and loading a single contiguous block from the starting position on the physical storage medium through the length, the single contiguous block including the posting lists for the desired subtree of the hierarchy. - View Dependent Claims (2)
-
-
3. A computer system for storing and retrieving one or more posting lists stored on a physical storage medium, the computer system comprising:
-
a processor; and a physical storage medium having computer executable instructions stored thereon which, when executed by the computer, cause the computer to associate one or more semantic roles defined by hierarchy with a term, the hierarchy comprising a role tree having nodes corresponding to the semantic roles that are related to one another in a strict dominance relation, generate a posting list for each association of a term and a semantic role, the posting list comprising data identifying one or more documents that include usage of the term in the associated semantic role, store the posting lists contiguously on the physical storage medium by performing a pre-order depth-first traversal of the nodes of the role tree, and at each of the nodes, writing the posting list for a term associated with the semantic role corresponding the node to the physical storage medium, store data in a lexicon indicating a starting position on the physical storage medium of the posting list for each association of a term and a semantic role in the hierarchy and data indicating a total length of the posting lists under each node of the role tree, retrieve the data from the lexicon that identifies the starting position on the physical storage medium of the posting lists for a term at a top of a desired subtree of a hierarchy, retrieve the data from the lexicon that identifies the total length of the posting lists of the desired subtree of the hierarchy, and to load a single contiguous block from the beginning location through the length from the physical storage medium, the single contiguous block including the posting lists for the desired subtree of the hierarchy.
-
-
4. A computer storage medium having computer executable instructions stored thereon which, when executed by a computer, cause the computer to:
-
associate one or more semantic roles defined by a hierarchy with a term, the hierarchy comprising a role tree having nodes corresponding to the semantic roles that are related to one another in a strict dominance relation; generate a posting list for each association of a term and a semantic role, the posting list comprising data identifying one or more documents that include usage of the term in the associated semantic role; store the posting lists contiguously on the physical storage medium by performing a pre-order depth-first traversal of the nodes of the role tree, and at each of the nodes, writing the posting list for a term associated with the semantic role corresponding the node to the physical storage medium, whereby the posting lists in a subtree of the hierarchy can be loaded from the physical storage medium as a single contiguous block; store data in a lexicon indicating a starting position on the physical storage medium of the posting list for each association of a term and a semantic role in the hierarchy and data indicating a total length of the posting lists under each node of the role tree; retrieve the data from the lexicon identifying the starting position on the physical storage medium for a term at a top of a desired subtree of the hierarchy; retrieve the data from the lexicon identifying the total length of posting lists of the desired subtree of the hierarchy; and
toload a single contiguous block from the starting position on the physical storage medium through the length, the single contiguous block including the posting lists for the desired subtree of the hierarchy.
-
Specification