Method for prefetching recursive data structure traversals
First Claim
1. A method for creating a linked list data structure to which prefetching can be applied in order to minimize the number cache misses endured during traversal, said method comprising the steps of:
- creating a parallel data structure consisting of a plurality of partitions (N) consisting of a plurality of sublists (P), associating a state vector (S) with the data structure to maintain the state of the traversal of each sublist, and maintaining the state of the last sublist to which an element is added in a variable (H), whereby additions are made to the the head of the list by decreasing the list head index to H−
1 modulo P and adding new nodes to the head of the list indexed by the thus updated value of the head index. pipelining the traversal across the N partitions of the data structure;
determining the prefetch distance required in order to traverse said data structure using the aforementioned pipelined traversal, said prefetch distance being determined experimentally by the programmer, computed using prior art, or by the compiler;
inserting prefetch instructions into the traversal loop body.
4 Assignments
0 Petitions
Accused Products
Abstract
Computer systems are typically designed with multiple levels of memory hierarchy. Prefetching has been employed to overcome the latency of fetching data or instructions from or to memory. Prefetching works well for data structures with regular memory access patterns, but less so for data structures such as trees, hash tables, and other structures in which the datum that will be used is not known a priori. In modern transaction processing systems, database servers, operating systems, and other commercial and engineering applications, information is frequently organized in trees, graphs, and linked lists. Lack of spatial locality results in a high probability that a miss will be incurred at each cache in the memory hierarchy. Each cache miss causes the processor to stall while the referenced value is fetched from lower levels of the memory hierarchy. Because this is likely to be the case for a significant fraction of the nodes traversed in the data structure, processor utilization suffers. The inability to compute the address of the next address to be referenced makes prefetching difficult in such applications. The invention allows compilers and/or programmers to restructure data structures and traversals so that pointers are dereferenced in a pipelined manner, thereby making it possible to schedule prefetch operations in a consistent fashion. The present invention significantly increases the cache hit rates of many important data structure traversals, and thereby the potential throughput of the computer system and application in which it is employed. For data structure traversals in which the traversal path may be predetermined, a transformation is performed on the data structure that permits references to nodes that will be traversed in the future be computed sufficiently far in advance to prefetch the data into cache.
86 Citations
16 Claims
-
1. A method for creating a linked list data structure to which prefetching can be applied in order to minimize the number cache misses endured during traversal, said method comprising the steps of:
-
creating a parallel data structure consisting of a plurality of partitions (N) consisting of a plurality of sublists (P), associating a state vector (S) with the data structure to maintain the state of the traversal of each sublist, and maintaining the state of the last sublist to which an element is added in a variable (H), whereby additions are made to the the head of the list by decreasing the list head index to H−
1 modulo P and adding new nodes to the head of the list indexed by the thus updated value of the head index.pipelining the traversal across the N partitions of the data structure;
determining the prefetch distance required in order to traverse said data structure using the aforementioned pipelined traversal, said prefetch distance being determined experimentally by the programmer, computed using prior art, or by the compiler;
inserting prefetch instructions into the traversal loop body. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
-
-
13. The method of traversing a single tree by creating a forest of subtrees by the method comprising the steps of:
-
initiating a level-order traversal starting at the root, maintaining an array of pointers to nodes in the tree in the course of the level-order traversal, discontinuing the level-order traversal when a number of subtrees sufficient for effective software pipelined traversal has been achieved, the aforementioned array of pointers thereby containing the pointers to the roots of the trees of a forest to which software pipelined traversal can be applied, then proceeding with a traversal wherein a tree is constructed as a forest of trees, wherein subtrees pointed to by the aforementioned array of subtrees constitute the forest across which software pipelined traversals are performed. optionally updating the array of pointers to subtrees when additions and deletions to the tree that cause the tree to be modified beyond the addition of a new leaf node (such as rebalancing inherent to a red-black tree). - View Dependent Claims (14, 15, 16)
-
Specification