Parallelism performance analysis based on execution trace information
Abstract
A system for conducting performance analysis for executing tasks. The analysis involves generating a variety of trace information related to performance measures, including parallelism-related information, during execution of the task. In order to generate the trace information, target source code of interest is compiled in such a manner that executing the resulting executable code will generate execution trace information composed of a series of events. Each event stores trace information related to a variety of performance measures for the one or more processors and protection domains used. After the execution trace information has been generated, the system can use that trace information and a trace information description file to produce useful performance measure information. The trace information description file contains information that describes the types of execution events as well as the structure of the stored information. The system uses the trace information description file to organize the information in the trace information file, extracts a variety of types of performance measure information from the organized trace information, and formats the extracted information for display. The system can use default or user-defined functions to extract and format trace information for display. After the system displays one or more types of performance measure information, a user of the system can then interact with the system in a variety of ways to obtain other useful performance analysis information.
186 Citations
69 Claims
1. A computer-implemented method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the first computer having multiple processors that each have multiple protection domains that are each able to execute at least one of the multiple threads, each processor having a counter indicating a number of instruction holes during which an instruction is not executed by the processor, each protection domain having a counter indicating a number of instructions issued in the protection domain by all executing threads, the method comprising:
receiving an indication of trace information reflecting a series of events that occurred during the execution, each event associated with execution of one of the multiple threads by one of the protection domains of one of the processors and each event having associated values in the trace information of variables maintained by the executing software program, by the one protection domain, and/or by the one processor;
for each of a plurality of periods of time during which the execution was occurring, determining from the trace information a number of instructions executed for the software program during the period of time by identifying multiple protection domains that each executed at least one of the multiple threads during at least a portion of the period of time;
for each of the identified protection domains, determining a change in the value of the issued instructions counter of the protection domain during the period of time;
determining if all of the instructions issued in the protection domain during the period of time were for one of the multiple threads;
when it is determined that all of the instructions issued in the protection domain during the period of time were for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be the determined change; and
when it is determined that all of the instructions issued in the protection domain during the period of time were not for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be a portion of the determined change that corresponds to a portion of the period of time during which at least one thread for the software program was executing in the protection domain; and
determining the number of instructions executed for the software program during the period of time to be a sum of the calculated values for each of the identified protection domains; and
determining from the trace information a number of instruction slots available for execution of the instructions of software program during the period of time by identifying processors that each executed at least one of the multiple threads during the period of time;
for each of the identified processors, determining a change in the value of the instruction holes counter of the processor during the period of time; and
if all of the instruction holes that occurred during the period of time were attributable to the software program, calculating a value for the number of instruction holes for the processor that are attributable to the software program during the period of time to be the determined change in the value of the instruction holes counter;
calculating a value for the number of instruction holes that are attributable to the software program during the period of time by all of the identified processors to be a sum of the calculated values for each of the identified processors; and
determining the number of instruction slots available for execution of the instructions of software program during the period of time to be a sum of the determined number of instructions executed for the software program during the period of time and of the calculated value for the number of instruction holes that are attributable to the software program during the period of time; and
presenting to a user an indication of the determined number of executed instructions for each of the periods of time and an indication of the determined number of available instruction slots for each of the periods of time, wherein only one software program can execute in a protection domain at any point in time, and wherein when it is determined that all of the instructions issued in a protection domain during a period of time were not for one of the multiple threads, the calculating of a value for the number of instructions executed for the software program during the period of time by the protection domain includes:
determining from the trace information at least one swap event that occurred in the protection domain during the period of time such that the software program is swapped into the protection domain so as to commence execution of the software program or such that the software program is swapped out of the protection domain so as to suspend execution of the software program;
retrieving for each of the determined swap events an associated value in the trace information of the issued instructions counter of the protection domain; and
using the retrieved associated values to calculate the value for the number of instructions executed for the software program during the period of time by the protection domain to include only increments to the issued instructions counter that occurred while the software program is swapped into the protection domain.
Dependent claims: 2-51
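The swap-event counting recited at the end of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `SwapEvent` record, its field names, and the `resident_at_start` flag are hypothetical, and the sketch assumes each swap event carries a timestamp and the value of the domain's issued-instructions counter at that moment.

```python
from dataclasses import dataclass

@dataclass
class SwapEvent:
    """Hypothetical trace record for one swap in a protection domain."""
    time: float          # when the swap occurred
    issued_counter: int  # domain's issued-instructions counter at the swap
    swapped_in: bool     # True: program swapped in; False: swapped out

def program_instructions(start_counter, end_counter, swaps, resident_at_start):
    """Sum only the counter increments made while the program was swapped in."""
    total = 0
    resident = resident_at_start
    last = start_counter
    for ev in sorted(swaps, key=lambda e: e.time):
        if resident:
            # The program owned the domain since `last`, so this span counts.
            total += ev.issued_counter - last
        last = ev.issued_counter
        resident = ev.swapped_in
    if resident:
        # The program was still swapped in when the period ended.
        total += end_counter - last
    return total

def executed_for_period(per_domain_values):
    """Claim 1 sums the per-domain values over the identified domains."""
    return sum(per_domain_values)
```

With no swap events and the program resident for the whole period, the function returns the full counter change, which matches the simpler branch of the claim.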
52. A computer-implemented method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the first computer having multiple processors that each have multiple protection domains that are each able to execute at least one of the multiple threads, each processor having a counter indicating a number of instruction holes during which an instruction is not executed by the processor, each protection domain having a counter indicating a number of instructions issued in the protection domain by all executing threads, the method comprising:
receiving an indication of trace information reflecting a series of events that occurred during the execution, each event associated with execution of one of the multiple threads by one of the protection domains of one of the processors and each event having associated values in the trace information of variables maintained by the executing software program, by the one protection domain, and/or by the one processor;
for each of a plurality of periods of time during which the execution was occurring, determining from the trace information a number of instructions executed for the software program during the period of time by identifying multiple protection domains that each executed at least one of the multiple threads during at least a portion of the period of time;
for each of the identified protection domains, determining a change in the value of the issued instructions counter of the protection domain during the period of time;
determining if all of the instructions issued in the protection domain during the period of time were for one of the multiple threads;
when it is determined that all of the instructions issued in the protection domain during the period of time were for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be the determined change; and
when it is determined that all of the instructions issued in the protection domain during the period of time were not for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be a portion of the determined change that corresponds to a portion of the period of time during which at least one thread for the software program was executing in the protection domain; and
determining the number of instructions executed for the software program during the period of time to be a sum of the calculated values for each of the identified protection domains; and
determining from the trace information a number of instruction slots available for execution of the instructions of software program during the period of time by identifying processors that each executed at least one of the multiple threads during the period of time;
for each of the identified processors, determining a change in the value of the instruction holes counter of the processor during the period of time; and
if all of the instruction holes that occurred during the period of time were attributable to the software program, calculating a value for the number of instruction holes for the processor that are attributable to the software program during the period of time to be the determined change in the value of the instruction holes counter;
calculating a value for the number of instruction holes that are attributable to the software program during the period of time by all of the identified processors to be a sum of the calculated values for each of the identified processors; and
determining the number of instruction slots available for execution of the instructions of software program during the period of time to be a sum of the determined number of instructions executed for the software program during the period of time and of the calculated value for the number of instruction holes that are attributable to the software program during the period of time; and
presenting to a user an indication of the determined number of executed instructions for each of the periods of time and an indication of the determined number of available instruction slots for each of the periods of time, wherein the software program is an executable version of a source code program such that execution of the executable version will generate the indicated trace information and including generating the executable version by compiling the source code program such that a group of instructions is added to the source code program at a location specified by the user, the added instructions when executed to generate trace information for a type of event specified by the user, wherein, for at least some of the groups of instructions to be added to the executable program to generate trace information of a specified type, an additional group of instructions is added to the source code program that when executed create a descriptor object for that group of instructions.
Dependent claims: 53
54. A computer-implemented method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the first computer having multiple processors that each have multiple protection domains that are each able to execute at least one of the multiple threads, each processor having a counter indicating a number of instruction holes during which an instruction is not executed by the processor, each protection domain having a counter indicating a number of instructions issued in the protection domain by all executing threads, the method comprising:
receiving an indication of trace information reflecting a series of events that occurred during the execution, each event associated with execution of one of the multiple threads by one of the protection domains of one of the processors and each event having associated values in the trace information of variables maintained by the executing software program, by the one protection domain, and/or by the one processor;
for each of a plurality of periods of time during which the execution was occurring, determining from the trace information a number of instructions executed for the software program during the period of time by identifying multiple protection domains that each executed at least one of the multiple threads during at least a portion of the period of time;
for each of the identified protection domains, determining a change in the value of the issued instructions counter of the protection domain during the period of time;
determining if all of the instructions issued in the protection domain during the period of time were for one of the multiple threads;
when it is determined that all of the instructions issued in the protection domain during the period of time were for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be the determined change; and
when it is determined that all of the instructions issued in the protection domain during the period of time were not for one of the multiple threads, calculating a value for the number of instructions executed for the software program during the period of time by the protection domain to be a portion of the determined change that corresponds to a portion of the period of time during which at least one thread for the software program was executing in the protection domain; and
determining the number of instructions executed for the software program during the period of time to be a sum of the calculated values for each of the identified protection domains; and
determining from the trace information a number of instruction slots available for execution of the instructions of software program during the period of time by identifying processors that each executed at least one of the multiple threads during the period of time;
for each of the identified processors, determining a change in the value of the instruction holes counter of the processor during the period of time; and
if all of the instruction holes that occurred during the period of time were attributable to the software program, calculating a value for the number of instruction holes for the processor that are attributable to the software program during the period of time to be the determined change in the value of the instruction holes counter;
calculating a value for the number of instruction holes that are attributable to the software program during the period of time by all of the identified processors to be a sum of the calculated values for each of the identified processors; and
determining the number of instruction slots available for execution of the instructions of software program during the period of time to be a sum of the determined number of instructions executed for the software program during the period of time and of the calculated value for the number of instruction holes that are attributable to the software program during the period of time; and
presenting to a user an indication of the determined number of executed instructions for each of the periods of time and an indication of the determined number of available instruction slots for each of the periods of time, wherein during the executing of the executable version, execution of each of the added groups of instructions adds current values of one or more of the variables to the generated trace information and, during the executing of the executable version and before the execution of an added group of instructions, an additional group of added instructions is executed and creates a descriptor object for that added group of instructions.
55. A method for generating trace information reflecting a series of events that occurred during execution of multiple software threads on a first computer, the method comprising:
receiving an indication of a software program for which trace information is to be generated;
receiving indications from a user of one or more locations within the software program at which trace information is to be generated and of one or more types of event each indicating distinct types of trace information; and
automatically producing an executable version corresponding to the software program such that when executed the produced executable version will generate trace information corresponding to multiple events of the types indicated by the user, the producing of the executable version including adding multiple groups of instructions to the software program at the locations specified by the user, each of the added groups of instructions corresponding to one of the types of events indicated by the user such that when executed that added group of instructions will generate the distinct types of trace information for that type of event, the produced executable version having one or more portions which can be executed in parallel with multiple software threads, so that execution of the produced executable version will generate trace information reflecting a series of events that occurred during execution, wherein, for at least some of the groups of instructions added to the executable program, additional instructions are added to the executable program that when executed create a descriptor object for that group of instructions.
Dependent claims: 56-59
60. A method for generating trace information reflecting a series of events that occurred during execution of multiple software threads on a first computer, the method comprising:
receiving an indication of an executable software program that when executed will generate trace information corresponding to multiple events of specified types, the executable software program including multiple groups of instructions added at specified locations and each corresponding to one of the specified types of events; and
generating the trace information by executing the executable software program using multiple software threads on the first computer in such a manner that each of the added groups of instructions is executed at least once, each execution of an added group of instructions corresponding to a specified type of event generating trace information of the type for that type of event, wherein, during the executing of the executable software program and before the execution of an added group of instructions, an additional group of added instructions is executed and creates a descriptor object for that added group of instructions.
Dependent claims: 61-63
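The ordering required by claim 60, in which a descriptor object is created before its group of trace instructions runs, might look roughly like the following. This is only a sketch under stated assumptions: the `Tracer` and `Descriptor` classes, the `describe` and `emit` methods, and the integer `site_id` are all hypothetical names, and a real instrumenting compiler would insert the equivalent calls automatically.

```python
import threading
from dataclasses import dataclass

@dataclass(frozen=True)
class Descriptor:
    """Hypothetical descriptor: what an instrumented site records."""
    site_id: int
    event_type: str
    fields: tuple

class Tracer:
    def __init__(self):
        self._lock = threading.Lock()
        self.descriptors = {}
        self.events = []

    def describe(self, site_id, event_type, fields):
        # Runs before the site's trace instructions; first caller wins.
        with self._lock:
            self.descriptors.setdefault(
                site_id, Descriptor(site_id, event_type, tuple(fields)))

    def emit(self, site_id, **values):
        # The added group of instructions: record current variable values.
        with self._lock:
            desc = self.descriptors[site_id]  # KeyError if never described
            row = tuple(values[name] for name in desc.fields)
            self.events.append((site_id, threading.get_ident(), row))

def worker(tracer, n):
    tracer.describe(1, "loop_iter", ["i"])  # descriptor creation first
    for i in range(n):
        tracer.emit(1, i=i)                 # then the traced events

tracer = Tracer()
threads = [threading.Thread(target=worker, args=(tracer, 3)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Keeping the field layout in the descriptor means each event only needs to store raw values, which is one plausible reason for creating the descriptor ahead of the events.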
64. A method for analyzing trace information generated during execution of multiple threads of a software program on a first computer, the generated trace information reflecting a series of events that occurred during the execution, the method comprising:
receiving an indication of trace information generated during execution of an executable software program using multiple software threads on the first computer, the executable software program including multiple groups of instructions each corresponding to a specified type of event, the execution such that each of the multiple groups of instructions is executed at least once and such that each execution of an added group of instructions generates trace information of a type corresponding to the specified type of event for that added group of instructions; and
analyzing the generated trace information to extract execution information corresponding to the events that occurred during the execution, wherein the analyzing of the generated trace information includes using a description of the specified types of events to identify groups of trace information each reflecting execution of a group of added instructions corresponding to those specified types of events and the analyzing of the trace information includes, when information for an event in the generated trace information is separated into multiple non-contiguous portions, creating a mapping to assist in retrieving the information for the event.
Dependent claims: 65-68
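The mapping for events stored in non-contiguous portions, as recited in claim 64, could be sketched as follows. The record layout `(event_id, part_index, offset, length)` is purely an assumption made for illustration; the claim does not specify how the portions are identified in the trace.

```python
from collections import defaultdict

def build_event_map(records):
    """Map each event id to its (offset, length) portions in part order.

    `records` is a hypothetical index of (event_id, part_index, offset,
    length) tuples describing where each portion sits in the trace buffer.
    """
    parts = defaultdict(list)
    for event_id, part_index, offset, length in records:
        parts[event_id].append((part_index, offset, length))
    return {
        event_id: [(off, ln) for _, off, ln in sorted(portions)]
        for event_id, portions in parts.items()
    }

def read_event(buffer, event_map, event_id):
    """Reassemble one event's bytes by following the mapping."""
    return b"".join(buffer[off:off + ln] for off, ln in event_map[event_id])
```

Building the map once lets later queries retrieve any event without rescanning the whole trace.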
69. A method performed by a computing system for analyzing trace information generated during execution of a software program on a computing system having multiple processors, comprising:
receiving an indication of trace information;
determining from the trace information a number of instructions issued by the computing system having multiple processors, the instructions issued when the software program is executed;
determining a number of instruction holes that are attributable to the software program; and
calculating a number of instruction slots available for execution of the software program by the computing system with multiple processors, the calculating based on the determined number of instructions issued and the number of instruction holes.
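The arithmetic in claim 69 is small enough to show directly: available slots are the issued instructions plus the instruction holes attributable to the program. The sketch below assumes per-period `(issued, holes)` pairs have already been extracted from the trace; the function names are hypothetical, and the utilization fraction is an illustrative addition that the claim itself does not recite.

```python
def instruction_slots(issued, holes):
    """Slots available = instructions issued + attributable instruction holes."""
    return issued + holes

def utilization_by_period(periods):
    """For each (issued, holes) pair, return (slots, fraction of slots used)."""
    results = []
    for issued, holes in periods:
        slots = instruction_slots(issued, holes)
        # Guard against an empty period with no issued instructions or holes.
        fraction = issued / slots if slots else 0.0
        results.append((slots, fraction))
    return results
```

For example, a period with 900 issued instructions and 100 holes has 1000 available slots, of which 90 percent were used.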
Specification