Cache sensitive search (CSS) tree indexing system and method
First Claim
1. A search tree index system for locating a particular key value stored in a sorted array of key values, comprising:
- a computer memory for storing a search tree having a plurality of leaf nodes, each said leaf node containing a plurality of key values, said leaf nodes being capable of referencing said key values stored in said sorted array of key values according to a first offset value; and
a plurality of internal nodes, each said internal node containing a plurality of key values, each said internal node having a plurality of children nodes associated therewith, said children nodes being configured to be referenced by said internal node associated therewith according to a second offset value, said children nodes associated with each said internal node comprising either a plurality of said internal nodes or a plurality of said leaf nodes; and
a computer processor having associated therewith a cache memory having characteristics including a cache size, a cache line size and an associativity level, said computer processor being coupled to said computer memory for computational access to said sorted array of key values, to said plurality of leaf nodes and to said plurality of internal nodes, for determining for said particular key value said second offset value necessary to reference said children nodes from said internal nodes, for determining for said particular key value said first offset value necessary to reference said particular key value in said sorted array of key values from said leaf nodes, and for locating said particular key value in said sorted array of key values, wherein the quantity of internal nodes comprising said plurality of internal nodes and the quantity of leaf nodes comprising said plurality of leaf nodes correspond to said characteristics of said cache memory.
2 Assignments
0 Petitions
Accused Products
Abstract
Cache sensitive search tree (CSS-tree) index structures for providing improved search and lookup performance compared with conventional searching schemes. The CSS-tree index structures include a directory tree structure which is stored in an array (216) and serves as an index for a sorted array of elements. The nodes (215) in the directory tree structure may be of sizes selected to correspond to the cache line size in the computer system utilizing the CSS-tree index structures. Child nodes (213) within the directory tree structure are located by performing arithmetic operations on array offsets. Thus, it is not necessary to store internal child node pointers, thereby reducing memory storage requirements. In addition, the CSS-tree index structures are organized so that traversing each level in the tree yields good data reference locality, and therefore relatively few cache misses. Thus, the CSS-tree index structures consider cache-related parameters such as reference locality and cache behavior, without requiring substantial additional amounts of memory.
109 Citations
44 Claims
-
1. A search tree index system for locating a particular key value stored in a sorted array of key values, comprising:
-
a computer memory for storing a search tree having a plurality of leaf nodes, each said leaf node containing a plurality of key values, said leaf nodes being capable of referencing said key values stored in said sorted array of key values according to a first offset value; and
a plurality of internal nodes, each said internal node containing a plurality of key values, each said internal node having a plurality of children nodes associated therewith, said children nodes being configured to be referenced by said internal node associated therewith according to a second offset value, said children nodes associated with each said internal node comprising either a plurality of said internal nodes or a plurality of said leaf nodes; and
a computer processor having associated therewith a cache memory having characteristics including a cache size, a cache line size and an associativity level, said computer processor being coupled to said computer memory for computational access to said sorted array of key values, to said plurality of leaf nodes and to said plurality of internal nodes, for determining for said particular key value said second offset value necessary to reference said children nodes from said internal nodes, for determining for said particular key value said first offset value necessary to reference said particular key value in said sorted array of key values from said leaf nodes, and for locating said particular key value in said sorted array of key values, wherein the quantity of internal nodes comprising said plurality of internal nodes and the quantity of leaf nodes comprising said plurality of leaf nodes correspond to said characteristics of said cache memory. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. A method for locating a particular key value stored in a sorted array of key values, comprising the steps of:
-
generating a plurality of leaf nodes, each said leaf node containing a plurality of key values, said leaf nodes being capable of referencing said key values stored in said sorted array of key values according to a first offset value;
generating a plurality of internal nodes, each said internal node containing a plurality of key values, each said internal node having a plurality of children nodes associated therewith, said children nodes being configured to be referenced by said internal node associated therewith according to a second offset value, said children nodes associated with each said internal node comprising either a plurality of said internal nodes or a plurality of said leaf nodes;
determining for said particular key value said second offset value necessary to reference said children nodes from said internal nodes;
determining for said particular key value said first offset value necessary to reference said particular key value in said sorted array of key values from said leaf nodes; and
locating said particular key value in said sorted array of key values, wherein the quantity of internal nodes comprising said plurality of internal nodes and the quantity of leaf nodes comprising said plurality of leaf nodes are selected according to characteristics of a cache memory associated with a computer processor coupled to said sorted array of key values, said plurality of leaf nodes and said plurality of internal nodes, said characteristics of said cache memory including a cache size, a cache line size and an associativity level. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31)
selecting the quantity of key values comprising said plurality of key values contained in each said leaf node to be equal to the quantity of key values comprising said plurality of key values contained in each said internal node; and
selecting the quantity of children nodes comprising said plurality of children nodes associated with each said internal node to be the same for each said internal node, and also to be equal to one greater than said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node.
-
-
21. The method of claim 20, further comprising the step of storing said plurality of internal nodes and said plurality of leaf nodes in a computer memory in the form of an array.
-
22. The method of claim 21, further comprising the step of selecting said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node to correspond to said cache line size of said cache memory.
-
23. The method of claim 22, further comprising the step of selecting said size of said cache line to be at least as large as said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node.
-
24. The method of claim 21, further comprising the step of said plurality of leaf nodes referencing said key values stored in said sorted array of key values according to an array mapping scheme.
-
25. The method of claim 22, further comprising the step of determining said first offset value and said second offset value according to the quantity of internal nodes comprising said plurality of internal nodes, the quantity of leaf nodes comprising said plurality of leaf nodes, said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node, the quantity of children nodes comprising said plurality of children nodes associated with and configured to be referenced by each said internal node, and the quantity of key values stored in said sorted array of key values.
-
26. The method of claim 19, further comprising the steps of:
-
selecting the quantity of key values comprising said plurality of key values contained in each said leaf node to be equal to the quantity of key values comprising said plurality of key values contained in each said internal node; and
selecting the quantity of children nodes comprising said plurality of children nodes associated with each said internal node to be the same for each said internal node, and also to be equal to said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node.
-
-
27. The method of claim 26, further comprising the step of storing said plurality of internal nodes and said plurality of leaf nodes in a computer memory in the form of an array.
-
28. The method of claim 27, further comprising the step of selecting said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node to correspond to said cache line size of said cache memory.
-
29. The method of claim 28, further comprising the step of selecting said size of said cache line to be at least as large as said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node.
-
30. The method of claim 27, further comprising the step of said plurality of leaf nodes referencing said key values stored in said sorted array of key values according to an array mapping scheme.
-
31. The method of claim 28, further comprising the step of determining said first offset value and said second offset value according to the quantity of internal nodes comprising said plurality of internal nodes, the quantity of leaf nodes comprising said plurality of leaf nodes, said quantity of key values comprising said plurality of key values contained in each said leaf node and in each said internal node, the quantity of children nodes comprising said plurality of children nodes associated with and configured to be referenced by each said internal node;
- and the quantity of key values stored in said sorted array of key values.
-
32. A computer readable media containing a program for use in locating a particular key value stored in a sorted array of key values, said program comprising:
-
a first program routine for forming a search tree having a plurality of leaf nodes, each said leaf node containing a plurality of key values, said leaf nodes being capable of referencing said key values stored in said sorted array of key values according to a first offset value; and
having a plurality of internal nodes, each said internal node containing a plurality of key values, each said internal node having a plurality of children nodes associated therewith, said children nodes being configured to be referenced by said internal node associated therewith according to a second offset value, said children nodes associated with each said internal node comprising either a plurality of said internal nodes or a plurality of said leaf nodes, wherein the quantity of internal nodes comprising said plurality of internal nodes and the quantity of leaf nodes comprising said plurality of leaf nodes correspond to characteristics of a cache memory associated with a computer processor configured to operate on said sorted array of key values, said plurality of leaf nodes and said plurality of internal nodes, said characteristics of said cache memory including a cache size, a cache line size and an associativity level; and
a second program routine for accessing said plurality of internal nodes, said plurality of children nodes associated therewith, said plurality of leaf nodes, and said sorted array of key values;
for determining for said particular key value said second offset value necessary to reference said children nodes from said internal nodes;
for determining for said particular key value said first offset value necessary to reference said particular key value in said sorted array of key values from said leaf nodes; and
for locating said particular key value in said sorted array of key values.- View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44)
-
Specification