Parallelism performance analysis based on execution trace information
First Claim
1. A computer-implemented method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the first computer having multiple processors that each have multiple protection domains that are each able to execute at least one of the multiple threads, each processor having a counter indicating a number of instruction holes during which an instruction is not executed by the processor, each protection domain having a counter indicating a number of instructions issued in the protection domain by all executing threads, the method comprising:
- receiving an indication of trace information reflecting a series of events that occurred during the execution, each event associated with execution of one of the multiple threads by one of the protection domains of one of the processors and each event having associated values in the trace information of variables maintained by the executing software program, by the one protection domain, and/or by the one processor;
for each of a plurality of periods of time during which the execution was occurring, determining from the trace information a number of instructions executed for the software program during the period of time by identifying multiple protection domains that each executed at least one of the multiple threads during at least a portion of the period of time;
for each of the identified protection domains, determining a change in the value of the issued instructions counter of the protection domain during the period of time;
determining if all of the instructions issued in the protection domain during the period of time were for one of the multiple threads;
when it is determined that all of the instructions issued in the protection domain during the period of time were for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be the determined change; and
when it is determined that all of the instructions issued in the protection domain during the period of time were not for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be a portion of the determined change that corresponds to a portion of the period of time during which at least one thread for the software program was executing in the protection domain; and
determining the number of instructions executed for the software program during the period of time to be a sum of the calculated values for each of the identified protection domains; and
determining from the trace information a number of instruction slots available for execution of the instructions of software program during the period of time by identifying processors that each executed at least one of the multiple threads during the period of time;
for each of the identified processors, determining a change in the value of the instruction holes counter of the processor during the period of time; and
if all of the instruction holes that occurred during the period of time were attributable to the software program, calculating a value for the number of instruction holes for the processor that are attributable to the software program during the period of time to be the determined change in the value of the instruction holes counter;
calculating a value for the number of instruction holes that are attributable to the software program during the period of time by all of the identified processors to be a sum of the calculated values for each of the identified processors; and
determining the number of instruction slots available for execution of the instructions of software program during the period of time to be a sum of the determined number of instructions executed for the software program during the period of time and of the calculated value for the number of instruction holes that are attributable to the software program during the period of time; and
presenting to a user an indication of the determined number of executed instructions for each of the periods of time and an indication of the determined number of available instruction slots for each of the periods of time.
2 Assignments
0 Petitions
Accused Products
Abstract
A system for conducting performance analysis for executing tasks. The analysis involves generating a variety of trace information related to performance measures, including parallelism-related information, during execution of the task. In order to generate the trace information, target source code of interest is compiled in such a manner that executing the resulting executable code will generate execution trace information composed of a series of events. Each event stores trace information related to a variety of performance measures for the one or more processors and protection domains used. After the execution trace information has been generated, the system can use that trace information and a trace information description file to produce useful performance measure information. The trace information description file contains information that describes the types of execution events as well as the structure of the stored information. The system uses the trace information description file to organize the information in the trace information file, extracts a variety of types of performance measure information from the organized trace information, and formats the extracted information for display. The system can use default or user-defined functions to extract and format trace information for display. After the system displays one or more types of performance measure information, a user of the system can then interact with the system in a variety of ways to obtain other useful performance analysis information.
162 Citations
21 Claims
-
1. A computer-implemented method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the first computer having multiple processors that each have multiple protection domains that are each able to execute at least one of the multiple threads, each processor having a counter indicating a number of instruction holes during which an instruction is not executed by the processor, each protection domain having a counter indicating a number of instructions issued in the protection domain by all executing threads, the method comprising:
-
receiving an indication of trace information reflecting a series of events that occurred during the execution, each event associated with execution of one of the multiple threads by one of the protection domains of one of the processors and each event having associated values in the trace information of variables maintained by the executing software program, by the one protection domain, and/or by the one processor;
for each of a plurality of periods of time during which the execution was occurring, determining from the trace information a number of instructions executed for the software program during the period of time by identifying multiple protection domains that each executed at least one of the multiple threads during at least a portion of the period of time;
for each of the identified protection domains, determining a change in the value of the issued instructions counter of the protection domain during the period of time;
determining if all of the instructions issued in the protection domain during the period of time were for one of the multiple threads;
when it is determined that all of the instructions issued in the protection domain during the period of time were for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be the determined change; and
when it is determined that all of the instructions issued in the protection domain during the period of time were not for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be a portion of the determined change that corresponds to a portion of the period of time during which at least one thread for the software program was executing in the protection domain; and
determining the number of instructions executed for the software program during the period of time to be a sum of the calculated values for each of the identified protection domains; and
determining from the trace information a number of instruction slots available for execution of the instructions of software program during the period of time by identifying processors that each executed at least one of the multiple threads during the period of time;
for each of the identified processors, determining a change in the value of the instruction holes counter of the processor during the period of time; and
if all of the instruction holes that occurred during the period of time were attributable to the software program, calculating a value for the number of instruction holes for the processor that are attributable to the software program during the period of time to be the determined change in the value of the instruction holes counter;
calculating a value for the number of instruction holes that are attributable to the software program during the period of time by all of the identified processors to be a sum of the calculated values for each of the identified processors; and
determining the number of instruction slots available for execution of the instructions of software program during the period of time to be a sum of the determined number of instructions executed for the software program during the period of time and of the calculated value for the number of instruction holes that are attributable to the software program during the period of time; and
presenting to a user an indication of the determined number of executed instructions for each of the periods of time and an indication of the determined number of available instruction slots for each of the periods of time. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
-
-
21-129. -129. (canceled)
Specification