Tightly coupled accelerator
First Claim
1. Apparatus for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs within said program, said apparatus comprising:
an operand store operable to store operand data;
an execution unit coupled to said operand store and responsive to an individual program instruction within said program;
(i) to read one or more input operand values from said operand store;
(ii) to perform a data processing operation specified by said individual program instruction upon said one or more input operand values to generate one or more output operand values; and
(iii) to write said one or more output operand values to said operand store; and
an accelerator unit coupled to said operand store and triggered by reaching an execution point within said program corresponding to a sequence of individual program instructions corresponding to a computational subgraph within said program to apply a selected one of a plurality of predetermined sets of configuration data inputs to said accelerator to configure said accelerator;
(iv) to read one or more input operands from said operand store;
(v) to perform an accelerated data processing operation specified by said sequence of program instructions upon said one or more input operands to generate one or more output operand values and at least one intermediate operand value being an operand value generated by one of said individual program instructions within said sequence of program instructions and determined not to be referenced outside of said sequence of program instructions; and
(vi) to write said one or more output operand values to said operand store with said at least one intermediate operand value not being written to said operand store, wherein said accelerator unit has a plurality of stages each containing one or more primitive operator units with configurable interconnect logic configured to pass operand values between primitive operator units of different stages.
Abstract
An accelerator 120 is tightly coupled to the normal execution unit 110. The operand store, which could be a register file 130, a stack-based operand store or another form of operand store, is shared by the execution unit and the accelerator unit. Operands may also be accessed as immediate values within the instructions themselves. The sequences of individual program instructions corresponding to computational subgraphs remain within a program but can be recognized by the accelerator as suitable for acceleration and, when encountered, are executed by the accelerator instead of by the normal execution unit. Within such a tightly coupled arrangement, problems can arise due to a lack of register resources within the system. The present technique provides that at least some intermediate operand values which are generated within the accelerator, but are determined not to be referenced outside of the computational subgraph concerned, are not written to the operand store.
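The write-back behaviour described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names Instr, OPS, run_on_execution_unit, run_on_accelerator and live_out are hypothetical, the operand store is modelled as a Python dict, and the set of values referenced outside the subgraph (live_out) is simply supplied by the caller, since this record does not say how that determination is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One individual program instruction: dst = op(src1, src2). Hypothetical format."""
    dst: str
    op: str
    src1: str
    src2: str


# A few primitive operations, chosen only for illustration.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
    "xor": lambda a, b: a ^ b,
}


def run_on_execution_unit(regs, instrs):
    """Normal path: every instruction reads from and writes to the operand store.

    Returns the number of operand-store writes performed.
    """
    writes = 0
    for ins in instrs:
        regs[ins.dst] = OPS[ins.op](regs[ins.src1], regs[ins.src2])
        writes += 1
    return writes


def run_on_accelerator(regs, instrs, live_out):
    """Accelerated path: intermediate values stay inside the accelerator.

    Only the names in live_out (assumed to be the values referenced outside
    the subgraph) are written back to the operand store.
    Returns the number of operand-store writes performed.
    """
    internal = {}  # values passed between operators, never stored in regs

    def read(name):
        # Prefer a value produced earlier in the subgraph, else the operand store.
        return internal[name] if name in internal else regs[name]

    for ins in instrs:
        internal[ins.dst] = OPS[ins.op](read(ins.src1), read(ins.src2))

    writes = 0
    for name in live_out:
        regs[name] = internal[name]
        writes += 1
    return writes


if __name__ == "__main__":
    subgraph = [
        Instr("t1", "add", "r0", "r1"),
        Instr("t2", "xor", "t1", "r2"),
        Instr("r3", "sub", "t2", "r0"),
    ]
    normal = {"r0": 5, "r1": 3, "r2": 6}
    accel = dict(normal)
    print(run_on_execution_unit(normal, subgraph), sorted(normal))
    print(run_on_accelerator(accel, subgraph, ["r3"]), sorted(accel))
```

In this model both paths compute the same r3, but the accelerated path performs one operand-store write instead of three and never allocates storage for t1 or t2, which is the register-pressure saving the abstract points to.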
36 Citations
15 Claims
1. Apparatus for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs within said program, said apparatus comprising:
an operand store operable to store operand data;
an execution unit coupled to said operand store and responsive to an individual program instruction within said program;
(i) to read one or more input operand values from said operand store;
(ii) to perform a data processing operation specified by said individual program instruction upon said one or more input operand values to generate one or more output operand values; and
(iii) to write said one or more output operand values to said operand store; and
an accelerator unit coupled to said operand store and triggered by reaching an execution point within said program corresponding to a sequence of individual program instructions corresponding to a computational subgraph within said program to apply a selected one of a plurality of predetermined sets of configuration data inputs to said accelerator to configure said accelerator;
(iv) to read one or more input operands from said operand store;
(v) to perform an accelerated data processing operation specified by said sequence of program instructions upon said one or more input operands to generate one or more output operand values and at least one intermediate operand value being an operand value generated by one of said individual program instructions within said sequence of program instructions and determined not to be referenced outside of said sequence of program instructions; and
(vi) to write said one or more output operand values to said operand store with said at least one intermediate operand value not being written to said operand store, wherein said accelerator unit has a plurality of stages each containing one or more primitive operator units with configurable interconnect logic configured to pass operand values between primitive operator units of different stages.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method of processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs within said program, said method comprising:
storing operand data within an operand store;
in response to an individual program instruction within said program using an execution unit coupled to said operand store;
(i) to read one or more input operand values from said operand store;
(ii) to perform a data processing operation specified by said individual program instruction upon said one or more input operand values to generate one or more output operand values; and
(iii) to write said one or more output operand values to said operand store; and
triggered by reaching an execution point within said program corresponding to a sequence of individual program instructions corresponding to a computational subgraph within said program, applying a selected one of a plurality of predetermined sets of configuration data inputs to an accelerator unit coupled to said operand store to control said accelerator unit;
(iv) to read one or more input operands from said operand store;
(v) to perform an accelerated data processing operation specified by said sequence of program instructions upon said one or more input operands to generate one or more output operand values and at least one intermediate operand value being an operand value generated by one of said individual program instructions within said sequence of program instructions and determined not to be referenced outside of said sequence of program instructions; and
(vi) to write said one or more output operand values to said operand store with said at least one intermediate operand value not being written to said operand store, wherein said accelerator unit has a plurality of stages each containing one or more primitive operator units with configurable interconnect logic that passes operand values between primitive operator units of different stages.
14. Apparatus for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs within said program, said apparatus comprising:
an operand store operable to store operand data;
an execution unit coupled to said operand store and responsive to an individual program instruction within said program;
(i) to read one or more input operand values from said operand store;
(ii) to perform a data processing operation specified by said individual program instruction upon said one or more input operand values to generate one or more output operand values; and
(iii) to write said one or more output operand values to said operand store; and
an accelerator unit coupled to said operand store and triggered by reaching an execution point within said program corresponding to a sequence of individual program instructions corresponding to a computational subgraph within said program to apply a selected one of a plurality of predetermined sets of configuration data inputs to said accelerator to configure said accelerator;
(iv) to read one or more input operands from said operand store;
(v) to perform an accelerated data processing operation specified by said sequence of program instructions upon said one or more input operands to generate one or more output operand values and at least one intermediate operand value being an operand value generated by one of said individual program instructions within said sequence of program instructions and determined not to be referenced outside of said sequence of program instructions; and
(vi) to write said one or more output operand values to said operand store with said at least one intermediate operand value not being written to said operand store, wherein said accelerator unit has four operand input ports and two operand output ports.
15. A method of processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs within said program, said method comprising:
storing operand data within an operand store;
in response to an individual program instruction within said program using an execution unit coupled to said operand store;
(i) to read one or more input operand values from said operand store;
(ii) to perform a data processing operation specified by said individual program instruction upon said one or more input operand values to generate one or more output operand values; and
(iii) to write said one or more output operand values to said operand store; and
triggered by reaching an execution point within said program corresponding to a sequence of individual program instructions corresponding to a computational subgraph within said program, applying a selected one of a plurality of predetermined sets of configuration data inputs to an accelerator unit coupled to said operand store to control said accelerator unit;
(iv) to read one or more input operands from said operand store;
(v) to perform an accelerated data processing operation specified by said sequence of program instructions upon said one or more input operands to generate one or more output operand values and at least one intermediate operand value being an operand value generated by one of said individual program instructions within said sequence of program instructions and determined not to be referenced outside of said sequence of program instructions; and
(vi) to write said one or more output operand values to said operand store with said at least one intermediate operand value not being written to said operand store, wherein said accelerator unit has four operand input ports and two operand output ports.
Specification