High-performance, superscalar-based computer system with out-of-order instruction execution and concurrent results distribution
First Claim
1. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations;
second means for concurrently transferring said operands from said first means to a plurality of functional units;
third means for performing said instruction operations to generate results using said plurality of functional units; and
fourth means for concurrently distributing said results, wherein said first and fourth means including temporary buffer means selectively coupled with register file means, said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, and said register file means directly receives a result of an instruction operation from said fourth means, thereby bypassing said temporary buffer means, if said instruction operation is performed in said prescribed program order.
Abstract
A high-performance, superscalar-based computer system with out-of-order instruction execution and concurrent results distribution for enhanced resource utilization and performance throughput. The computer system architecture includes an instruction fetch unit for fetching program instruction sets. Each instruction set includes a plurality of fixed length instructions with a prescribed program order (in-order). The architecture also includes an instruction execution unit for dynamically examining the instruction sets and scheduling instructions for execution, including out-of-order execution, among a plurality of functional units. The data results of the executed instructions are concurrently distributed to a temporary buffer and a register file array and managed by associated control logic, including a register renaming unit, a dependency checker unit, done control unit, and retirement control unit. The architecture also optimizes the scheduling of data paths in accordance with the type of computational function, including integer, floating point, and boolean.
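The interplay between the temporary buffer and the register file described above can be illustrated with a minimal sketch. It is not the patented control logic: the class name `ResultRouter`, the use of a sequence number to stand for position in program order, and the register names are assumptions made for illustration, and register renaming, dependency checking, and exception handling are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ResultRouter:
    """Toy model: a result is buffered unless it arrives in program order."""
    register_file: dict = field(default_factory=dict)  # committed register state
    temp_buffer: dict = field(default_factory=dict)    # seq -> (dest, value)
    next_to_retire: int = 0                             # oldest unretired sequence number

    def distribute(self, seq, dest, value):
        """Accept the result of the instruction at program-order position `seq`."""
        if seq == self.next_to_retire:
            # In-order completion: write the register file directly,
            # bypassing the temporary buffer.
            self.register_file[dest] = value
            self.next_to_retire += 1
            self._retire_ready()
        else:
            # Out-of-order completion: hold the result in the temporary buffer
            # until every older instruction has completed.
            self.temp_buffer[seq] = (dest, value)

    def _retire_ready(self):
        """Move buffered results to the register file in program order."""
        while self.next_to_retire in self.temp_buffer:
            dest, value = self.temp_buffer.pop(self.next_to_retire)
            self.register_file[dest] = value
            self.next_to_retire += 1


router = ResultRouter()
router.distribute(2, "r3", 30)  # completes early, so it is buffered
router.distribute(1, "r2", 20)  # also early, buffered
router.distribute(0, "r1", 10)  # oldest instruction: written directly, then 1 and 2 retire
```

After the three calls, all three results sit in the register file and the temporary buffer is empty, even though the instructions completed in reverse program order.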
29 Claims
1. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means including temporary buffer means selectively coupled with register file means, said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, and said register file means directly receives a result of an instruction operation from said fourth means, thereby bypassing said temporary buffer means, if said instruction operation is performed in said prescribed program order.
2. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means including temporary buffer means selectively coupled with register file means, said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a floating point register file and an integer register file, and said functional units comprise a floating point functional unit and an integer functional unit, and said fourth means distributes floating point results to said integer file and distributes integer results to said floating point register file.
6. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, said first and fourth means including temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a floating point register file, an integer register file, and a boolean register file, and said functional units comprise a floating point functional unit and a integer functional unit, and said fourth means distributes floating point results to one of said integer file and said boolean register file, and distributes integer results to one of said floating point register file and said boolean register file.
10. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means including temporary buffer means selectively coupled with register file means, said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, and said second means transfers at least one operand from a floating point register file to an integer functional unit.
11. A superscalar processing system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, said superscalar processing system comprising:
first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, said first and fourth means including temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a boolean register file and at least one of a floating point register file and an integer register file, and said functional units comprise a floating point functional unit and an integer functional unit, and said fourth means comprise a set of parallel buses that distributes floating point results to at least one of said integer file and said boolean register file, and distributes integer results to at least one of said floating point register file and said boolean register file.
15. A method for processing data in a superscalar system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, wherein said superscalar system also has a temporary buffer and register file for storing results, the method comprising the steps of:
storing a source of operands corresponding to a plurality of instruction operations; concurrently transferring said operands to a plurality of functional units; performing said instruction operations to generate results using said plurality of functional units; concurrently distributing said results; distributing floating point results to an integer file; and distributing integer results to a floating point register file using a set of parallel buses, wherein said storing step is performed by storing said results in the temporary buffer, rather than the register file means, if said results are distributed out-of-order with respect to said prescribed program order.
18. A method for processing data in a superscalar system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, wherein said superscalar system also has a temporary buffer and register file for storing results, the method comprising the steps of:
storing a source of operands corresponding to a plurality of instruction operations; concurrently transferring said operands to a plurality of functional units; performing said instruction operations to generate results using said plurality of functional units; concurrently distributing said results; transferring floating point results to one of a integer file and a boolean register file; and transferring integer results to one of said floating point register file and said boolean register file using a set of parallel buses, wherein said storing step is performed by storing said results in the temporary buffer, rather than the register file means, if said results are distributed out-of-order with respect to said prescribed program order.
21. A computer system, comprising:
main memory bus; input/output bus; and a superscalar processor, operably connected to said main memory and said input/output bus, said superscalar processing system having; a first stage for decoding and issuing instructions in a prescribed program order; a second stage for executing instructions out-of-order with respect to said prescribed program order; first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means including temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a boolean register file and at least one of a floating point register file and an integer register file, and wherein said functional units comprise a floating point functional unit and an integer functional unit, and said fourth means comprise a set of parallel buses that distributes floating point results to at least one of said integer file and said boolean register file, and distributes integer results to at least one of said floating point register file and said boolean register file.
25. A computer system, comprising:
main memory bus; input/output bus; and a superscalar processor, operably connected to said main memory and said input/output bus, said superscalar processing system having; a first stage for decoding and issuing instructions in a prescribed program order; a second stage for executing instructions out-of-order with respect to said prescribed program order; first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means include temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, and said register file means directly receives a result of an instruction operation from said fourth means, thereby bypassing said temporary buffer means, if said instruction operation is performed in said prescribed program order.
26. A computer system, comprising:
main memory bus; input/output bus; and a superscalar processor, operably connected to said main memory and said input/output bus, said superscalar processing system having; a first stage for decoding and issuing instructions in a prescribed program order; a second stage for executing instructions out-of-order with respect to said prescribed program order; first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means include temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a floating point register file and an integer register file, and said functional units comprise a floating point functional unit and an integer functional unit, and said fourth means distributes floating point results to said integer file and distributes integer results to said floating point register file.
27. A computer system, comprising:
main memory bus; input/output bus; and a superscalar processor, operably connected to said main memory and said input/output bus, said superscalar processing system having; a first stage for decoding and issuing instructions in a prescribed program order; a second stage for executing instructions out-of-order with respect to said prescribed program order; first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means include temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, said first means comprises a floating point register file, an integer register file, and a boolean register file, and said functional units comprise a floating point functional unit and an integer functional unit, and said fourth means distributes floating point results to one of said integer file and said boolean register file, and distributes integer results to one of said floating point register file and said boolean register file.
28. A computer system, comprising:
main memory bus; input/output bus; and a superscalar processor, operably connected to said main memory and said input/output bus, said superscalar processing system having; a first stage for decoding and issuing instructions in a prescribed program order; a second stage for executing instructions out-of-order with respect to said prescribed program order; first means for storing a source of operands corresponding to a plurality of instruction operations; second means for concurrently transferring said operands from said first means to a plurality of functional units; third means for performing said instruction operations to generate results using said plurality of functional units; and fourth means for concurrently distributing said results, wherein said first and fourth means include temporary buffer means selectively coupled with register file means, wherein said results are stored in said temporary buffer means rather than said register file means, if said results are distributed out-of-order with respect to said prescribed program order, and said second means transfers at least one operand from a floating point register file to an integer functional unit.
29. A method for processing data in a superscalar system having a plurality of stages, including a first stage for decoding and issuing instructions in a prescribed program order and a second stage for executing instructions out-of-order with respect to said prescribed program order, wherein said superscalar system also has a temporary buffer and register file for storing results, the method comprising the steps of:
storing a source of operands corresponding to a plurality of instruction operations; concurrently transferring said operands to a plurality of functional units; performing said instruction operations to generate results using said plurality of functional units; concurrently distributing said results; and transferring at least one operand from a floating point register file to an integer functional unit, wherein said storing step is performed by storing said results in the temporary buffer, rather than the register file means, if said results are distributed out-of-order with respect to said prescribed program order.
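Several of the claims above recite result buses that deliver a functional unit's result to a register file of a different type (floating point results to the integer or boolean file, integer results to the floating point or boolean file). The sketch below illustrates that routing only. The names `RegFile`, `Unit`, `ALLOWED_DESTINATIONS`, and `distribute_results` are hypothetical, the routing table (including the assumption that each unit may also write its own file) is an illustrative choice, and the loop stands in for what the hardware would do on parallel buses within one cycle.

```python
from enum import Enum


class RegFile(Enum):
    INTEGER = "integer"
    FLOAT = "floating point"
    BOOLEAN = "boolean"


class Unit(Enum):
    INTEGER = "integer"
    FLOAT = "floating point"


# Hypothetical routing table: register files each unit's result bus may write.
ALLOWED_DESTINATIONS = {
    Unit.INTEGER: {RegFile.INTEGER, RegFile.FLOAT, RegFile.BOOLEAN},
    Unit.FLOAT: {RegFile.FLOAT, RegFile.INTEGER, RegFile.BOOLEAN},
}


def distribute_results(results, files):
    """Write one batch of results, one per bus, into the destination files.

    `results` is a list of (unit, dest_file, register_index, value) tuples.
    The sequential loop models a batch that hardware would deliver concurrently.
    """
    for unit, dest_file, index, value in results:
        if dest_file not in ALLOWED_DESTINATIONS[unit]:
            raise ValueError(f"{unit.value} unit cannot write the {dest_file.value} file")
        files[dest_file][index] = value
    return files


files = {rf: {} for rf in RegFile}
distribute_results(
    [
        (Unit.FLOAT, RegFile.INTEGER, 4, 7),     # floating point result routed to the integer file
        (Unit.INTEGER, RegFile.FLOAT, 2, 1.0),   # integer result routed to the floating point file
        (Unit.FLOAT, RegFile.BOOLEAN, 0, True),  # floating point result routed to the boolean file
    ],
    files,
)
```

Narrowing `ALLOWED_DESTINATIONS` is one way to model the differences between the claims, for example dropping the boolean file for the two-file arrangements.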
Specification