Interrupt-based hardware support for profiling memory system performance
Abstract
Fueled by higher clock rates and superscalar technologies, growth in processor speed continues to outpace improvement in memory system performance. Reflecting this trend, architects are developing increasingly complex memory hierarchies to mask the speed gap, compiler writers are adding locality-enhancing transformations to better utilize complex memory hierarchies, and applications programmers are re-coding their algorithms to exploit memory systems. All of these groups need empirical data on memory behavior to guide their optimizations. This paper describes how to combine simple hardware support and sampling techniques to obtain such data without appreciably perturbing system performance. By augmenting a cache miss counter with a compare register and an interrupt line, so that the processor is interrupted when the counter matches the compare value, we can sample system state and develop cache miss profiles that associate cache misses with specific processes, procedures, call stacks, addresses, or user-defined aspects of system state. This idea is implemented in the Mprof prototype, which profiles data stall cycles, first-level cache misses, and second-level cache misses on the Sun Sparc 10/41. Simple case studies are provided to illustrate Mprof's features.
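The sampling idea in the abstract can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not Mprof's implementation: the trace format, the procedure names, and the helpers `sample_misses` and `estimate_misses` are hypothetical, and a real system would take the interrupt in hardware rather than in a loop.

```python
from collections import Counter


def sample_misses(trace, period):
    """Simulate a miss counter with a compare value of `period`.

    `trace` is an iterable of (procedure_name, is_miss) pairs, a hypothetical
    stand-in for the executing program. Each time the miss counter reaches
    `period`, a simulated interrupt records the current procedure and the
    counter is reset.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    samples = Counter()
    miss_count = 0
    for procedure, is_miss in trace:
        if not is_miss:
            continue
        miss_count += 1
        if miss_count == period:      # counter matches the compare value
            samples[procedure] += 1   # "interrupt handler" records state
            miss_count = 0
    return samples


def estimate_misses(samples, period):
    """Scale sample counts back up: each sample stands for `period` misses."""
    return {proc: n * period for proc, n in samples.items()}


# Synthetic trace: 'copy' misses 300 times, 'sum' misses 100 times,
# and 'idle' executes 50 times without missing.
trace = [("copy", True)] * 300 + [("sum", True)] * 100 + [("idle", False)] * 50
samples = sample_misses(trace, period=10)
estimates = estimate_misses(samples, period=10)
```

A larger `period` means fewer interrupts and less perturbation, at the cost of a coarser profile; the estimate for each procedure is simply its sample count multiplied by the period.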
15 Claims
1. A system for profiling the behavior of memory systems in a processor, comprising:
an event detector configured to generate a detection signal upon an occurrence of a predetermined data stall time event;
a counter, coupled to said event detector, configured to receive and count occurrences of said detection signal received from said event detector, said counter further configured to generate a pulse signal when a predetermined count value of said detection signal occurrences is reached;
an interrupt generator, coupled to said counter, configured to generate a profiling interrupt signal upon receipt of said pulse signal; and
an interrupt handler residing in the processor, configured to record a state of the processor relating to the memory access data stall time event upon receipt of said profiling interrupt signal generated by said interrupt generator.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
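The chain of elements recited in claim 1 (event detector, counter, interrupt generator, interrupt handler) can be modeled in software to make the signal flow concrete. The sketch below is an illustration only: the class names, the `stall_cycles` and `pc` fields, and the choice to treat any nonzero stall as the "predetermined data stall time event" are all assumptions, not details taken from the patent.

```python
class EventCounter:
    """Counts detection signals and pulses when the preset count is reached."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0

    def on_detection(self):
        """Return True (a 'pulse') when the predetermined count is reached."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0  # assumed: the counter restarts after each pulse
            return True
        return False


class Profiler:
    """Wires detector -> counter -> interrupt generator -> handler."""

    def __init__(self, threshold):
        self.counter = EventCounter(threshold)
        self.records = []

    def detect(self, cpu_state):
        # Event detector: assumed here to fire on any cycle with a data stall.
        if cpu_state["stall_cycles"] > 0 and self.counter.on_detection():
            self.raise_interrupt(cpu_state)

    def raise_interrupt(self, cpu_state):
        # Interrupt generator: forwards the pulse as a profiling interrupt.
        self.handle_interrupt(dict(cpu_state))

    def handle_interrupt(self, saved_state):
        # Interrupt handler: records the processor state at the interrupt.
        self.records.append(saved_state)


# Hypothetical (pc, stall_cycles) stream; every third stall is sampled.
stream = [(0x100, 2), (0x104, 0), (0x108, 1), (0x10C, 5),
          (0x110, 0), (0x114, 3), (0x118, 1), (0x11C, 4)]
profiler = Profiler(threshold=3)
for pc, stall in stream:
    profiler.detect({"pc": pc, "stall_cycles": stall})
sampled_pcs = [r["pc"] for r in profiler.records]
```

With a threshold of 3, the third and sixth stall events trigger the handler, so only two states are recorded for six stalls.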
12. A performance monitor system for simultaneous profiling of multiple conditions in a processor memory system, comprising:
a plurality of condition detectors, each of said plurality of condition detectors configured to monitor a predetermined data stall time condition and to generate a detection signal upon an occurrence of said predetermined condition;
a plurality of counters, each of said plurality of counters coupled to a respective one of said plurality of condition detectors, for receiving said detection signal and for generating a count signal indicating the number of said detection signals received;
a plurality of loadable compare registers, each of said plurality of loadable compare registers coupled to a respective one of said plurality of counters, configured to receive said count signal from said respective counter and for comparing said count signal to a predetermined value loaded in said each of said plurality of compare registers;
an interrupt generator, coupled to each of said plurality of comparators, configured to generate a profiling interrupt signal based upon a predetermined relationship between said pulse signals received from said plurality of comparators; and
an interrupt handler, residing in the processor, configured to record a state of the processor relating to the memory access data stall time event upon receipt of said profiling interrupt signal generated by said interrupt generator.
Dependent claims: 13, 14
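Claim 12 extends the single-counter arrangement to several detector, counter, and compare-register triples whose outputs are combined by a "predetermined relationship". The following sketch models that arrangement in software. It is hypothetical throughout: the claim does not say what the relationship is, so Python's built-in `any` (logical OR) and `all` (logical AND) are used as two possible choices, and resetting every counter after an interrupt is likewise an assumption.

```python
class MonitoredCondition:
    """One condition detector, its counter, and its loadable compare register."""

    def __init__(self, name, predicate, compare_value):
        self.name = name
        self.predicate = predicate          # condition detector
        self.compare_value = compare_value  # value loaded in compare register
        self.count = 0                      # counter

    def observe(self, event):
        if self.predicate(event):
            self.count += 1

    def matched(self):
        return self.count >= self.compare_value


class PerformanceMonitor:
    def __init__(self, conditions, relationship=any):
        self.conditions = conditions
        self.relationship = relationship  # assumed: any (OR) or all (AND)
        self.records = []

    def step(self, event):
        for cond in self.conditions:
            cond.observe(event)
        if self.relationship(c.matched() for c in self.conditions):
            # Interrupt handler: record the state plus the counts seen so far.
            counts = {c.name: c.count for c in self.conditions}
            self.records.append({"pc": event["pc"], "counts": counts})
            for cond in self.conditions:
                cond.count = 0  # assumed: all counters restart together


def run(relationship):
    """Replay a fixed, made-up event stream under the given relationship."""
    l1 = [True, True, False, True, True, True, True, True]
    l2 = [False, True, False, True, False, False, False, False]
    monitor = PerformanceMonitor(
        [MonitoredCondition("l1_miss", lambda e: e["l1_miss"], 4),
         MonitoredCondition("l2_miss", lambda e: e["l2_miss"], 2)],
        relationship)
    for i in range(8):
        monitor.step({"pc": 4 * i, "l1_miss": l1[i], "l2_miss": l2[i]})
    return monitor.records


or_records = run(any)
and_records = run(all)
```

Under the OR relationship an interrupt fires as soon as either compare value is reached; under AND it fires only once both have been reached, so the same event stream yields fewer samples.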
15. A method for profiling the behavior of memory systems in a processor, comprising:
generating a detection signal upon an occurrence of a predetermined data stall time event;
counting occurrences of said detection signal received from said event detector, and generating a pulse signal when a predetermined count value of said detection signal occurrences is reached;
generating a profiling interrupt signal upon receipt of said pulse signal; and
recording a state of the processor relating to the memory access data stall time event upon receipt of said profiling interrupt signal.
Specification