Method and system for determining optimal data layout using blind justice
Abstract
Disclosed are a method and system for finding an optimal data layout. The approach of the present invention is to try one of several data layouts in the memory, measure the impact of that data layout on the performance of a program, and decide which of the several data layouts to try next. The trying and measuring steps are repeated, and one of the several data layouts is selected as best or optimal based on the measurements. The preferred embodiment of the invention provides layout auditing, a framework for picking the best data layout online without requiring any user input. Layout auditing optimizes data layouts with a try-measure-decide feedback loop: use a data reorganizer to try one of several data layouts, use a profiler to measure the impact of the data layout on performance, and use a controller to decide which data layout to try next.
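The try-measure-decide loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented controller: the function name `audit_layouts`, the callbacks `try_layout` and `measure`, the layout names, and the epsilon-greedy decision rule are all assumptions introduced here, since the abstract does not say how the controller decides.

```python
import random

def audit_layouts(layouts, try_layout, measure, rounds=50, explore=0.1, seed=0):
    """Return the layout with the lowest mean measured cost (needs rounds >= 1).

    try_layout(name) stands in for the data reorganizer and measure(name)
    for the profiler; both are hypothetical callbacks supplied by the caller.
    """
    rng = random.Random(seed)
    total_cost = {name: 0.0 for name in layouts}
    trials = {name: 0 for name in layouts}

    def mean_cost(name):
        return total_cost[name] / trials[name]

    for _ in range(rounds):
        # Decide: try every layout once, then mostly pick the cheapest so far,
        # with occasional random exploration (an assumed epsilon-greedy policy).
        untried = [name for name in layouts if trials[name] == 0]
        if untried:
            choice = untried[0]
        elif rng.random() < explore:
            choice = rng.choice(layouts)
        else:
            choice = min(layouts, key=mean_cost)
        try_layout(choice)                      # Try: reorganize data
        total_cost[choice] += measure(choice)   # Measure: profile the program
        trials[choice] += 1

    tried = [name for name in layouts if trials[name] > 0]
    return min(tried, key=mean_cost)

# Usage with made-up, noise-free costs for three hypothetical layouts.
fixed_costs = {"layout_a": 5.0, "layout_b": 3.0, "layout_c": 4.0}
best = audit_layouts(list(fixed_costs), lambda name: None, fixed_costs.get)
print(best)
```

In a real system the measured cost would be noisy and would vary across program phases, which is why the loop keeps re-trying layouts instead of committing after a single measurement.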
5 Claims
1. A system for improving data locality in a memory, comprising:
a data reorganizer to copy data objects in different data layouts in the memory;
a profiler for evaluating the performances of the different data layouts; and
a controller to choose one of the data layouts as optimal based on said evaluating, wherein:
said profiler outputs information describing said different performances;
the controller uses said output information to choose the optimal data layout;
the profiler outputs counts of simulated data accesses, cache misses and TLB misses for each of the data layouts; and
the controller is an instrumentation-based controller, calculates costs separately for each of the different data layouts using said output counts from the profiler, and calculates probabilities for each of the data layouts from said costs.
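As a rough illustration of how a controller might turn the profiler's per-layout counts into costs and then into probabilities, consider the sketch below. The claim does not give formulas, so everything here is an assumption: the names `LayoutCounts`, `cost_per_access` and `layout_probabilities`, the penalty weights, and the softmax-style normalization are hypothetical choices, not the claimed calculation.

```python
import math
from dataclasses import dataclass

@dataclass
class LayoutCounts:
    """Simulated counts the profiler reports for one data layout."""
    accesses: int
    cache_misses: int
    tlb_misses: int

def cost_per_access(c, cache_penalty=10.0, tlb_penalty=30.0):
    """Assumed cost model: 1 unit per access plus weighted miss penalties."""
    if c.accesses <= 0:
        raise ValueError("layout has no simulated accesses to evaluate")
    penalties = cache_penalty * c.cache_misses + tlb_penalty * c.tlb_misses
    return 1.0 + penalties / c.accesses

def layout_probabilities(counts_by_layout, temperature=1.0):
    """Map each layout's cost to a probability; cheaper layouts score higher."""
    costs = {name: cost_per_access(c) for name, c in counts_by_layout.items()}
    best = min(costs.values())
    # Subtracting the best cost keeps the exponentials numerically stable.
    weights = {name: math.exp(-(cost - best) / temperature)
               for name, cost in costs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Usage with made-up counts for two hypothetical layouts.
counts = {
    "layout_a": LayoutCounts(accesses=1000, cache_misses=100, tlb_misses=10),
    "layout_b": LayoutCounts(accesses=1000, cache_misses=50, tlb_misses=5),
}
print(layout_probabilities(counts))
```

Costs are computed separately per layout, as the claim requires; the probabilities then give the controller a way to favor the cheaper layout while still occasionally trying the others.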
2. A system for improving data locality in a memory accessed by a given program, comprising:
a data reorganizer to copy data objects in a plurality of different data layouts in the memory, wherein the given program accesses each of the data layouts while the given program is running;
a profiler for evaluating the performance of the program while the program is running and accessing each of the different data layouts; and
a controller to choose one of the plurality of data layouts as optimal based on said evaluating, wherein:
said profiler outputs information describing said different performances;
the controller uses said output information to choose the optimal data layout;
the profiler outputs counts of simulated data accesses, cache misses and TLB misses for each of the data layouts;
the system further comprises a hardware performance counter that outputs total counts of simulated data accesses, cache misses and TLB misses for all of the data layouts; and
the controller is a hardware-based controller, calculates costs separately for each of the different data layouts using said output counts from the profiler and said output total counts from the hardware performance counter, and calculates probabilities for each of the data layouts from said costs.
3. A method of operating a profiler to measure the performance of different layouts in memory, comprising the steps of:
dividing memory into areas that can have a plurality of different data layouts, wherein a running program accesses each of the data layouts; and
measuring the locality for each memory area by using the profiler, while the program is running, for evaluating the performance of the program when accessing each of the data layouts, and wherein:
one of the plurality of data layouts is chosen as optimal based on said evaluating, wherein the measuring step includes the steps of:
collecting data reference traces;
using the collected traces to drive a cache and TLB simulation;
mapping simulated accesses and misses to the data layouts; and
keeping a count of each simulated miss mapped to each of the data layouts.
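The measuring steps of claim 3 (collect a trace, drive a cache and TLB simulation, map accesses and misses to layouts, keep counts) might look something like the following minimal sketch. The names `LruSet` and `simulate`, the fully associative LRU model, the address ranges, and the tiny cache and TLB sizes are all assumptions chosen for illustration; the claim does not specify a cache organization.

```python
from collections import OrderedDict, defaultdict

class LruSet:
    """Fully associative LRU store of tags, reused for the cache and the TLB."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.tags = OrderedDict()

    def access(self, tag):
        """Record an access; return True on a miss, False on a hit."""
        if tag in self.tags:
            self.tags.move_to_end(tag)      # mark as most recently used
            return False
        self.tags[tag] = True
        if len(self.tags) > self.capacity:
            self.tags.popitem(last=False)   # evict least recently used
        return True

def simulate(trace, areas, line_size=64, page_size=4096,
             cache_lines=4, tlb_entries=2):
    """Replay an address trace and count accesses and misses per layout.

    areas maps a layout name to its (start, end) address range, end exclusive.
    """
    cache, tlb = LruSet(cache_lines), LruSet(tlb_entries)
    counts = defaultdict(lambda: {"accesses": 0, "cache_misses": 0, "tlb_misses": 0})
    for addr in trace:
        # Every access updates the simulated cache and TLB state.
        cache_miss = cache.access(addr // line_size)
        tlb_miss = tlb.access(addr // page_size)
        # Map the access to the memory area (and so the layout) it falls in.
        layout = next((name for name, (lo, hi) in areas.items() if lo <= addr < hi), None)
        if layout is None:
            continue  # address outside every audited area
        counts[layout]["accesses"] += 1
        counts[layout]["cache_misses"] += int(cache_miss)
        counts[layout]["tlb_misses"] += int(tlb_miss)
    return dict(counts)

# Usage: two hypothetical memory areas, one page each, and a short made-up trace.
areas = {"layout_a": (0, 4096), "layout_b": (4096, 8192)}
print(simulate([0, 8, 64, 0, 4096, 4100], areas))
```

The per-layout counts returned here have the same shape as the counts the claims say the profiler outputs, so they could feed a cost calculation directly.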
Specification