Method and apparatus for facilitating instruction processing of a digital computer
Abstract
A computer having a cache memory and a main memory is provided with a transformation unit between the main memory and the cache memory so that at least a portion of an information unit retrieved from the main memory may be transformed during retrieval of the information (fetch) from a main memory and prior to storage in the cache memory (cache). In a specific embodiment, an instruction may be predecoded prior to storage in the cache memory. In another embodiment involving a branch instruction, the address of the target of the branch is calculated prior to storing in the instruction cache. The invention has advantages where a particular instruction is repetitively executed since a needed decode operation which has been partially performed previously need not be repeated with each execution of an instruction. Consequently, the latency time of each machine cycle may be reduced, and the overall efficiency of the computing system can be improved. If the architecture defines delayed branch instructions, such branch instructions may be executed in effectively zero machine cycles. This requires a wider bus and an additional register in the processor to allow the fetching of two instructions from the cache memory in the same cycle.
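The fill-time transformation described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not taken from the patent: the 32-bit instruction encoding, the `BRANCH_OPCODE` value, and the names `CacheEntry`, `transform`, and `InstructionCache` are hypothetical. The sketch only shows the idea that a branch target is computed once, on a cache miss, and then reused on every later hit.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical 32-bit encoding, invented for this sketch:
#   bits 31..24  opcode (0x01 = PC-relative branch, anything else = non-branch)
#   bits 23..0   signed word displacement (meaningful for branches only)
BRANCH_OPCODE = 0x01
WORD_BYTES = 4


@dataclass(frozen=True)
class CacheEntry:
    raw: int                      # instruction word as fetched from main memory
    is_branch: bool               # predecoded flag
    target: Optional[int] = None  # predecoded absolute branch target, if any


def sign_extend_24(value: int) -> int:
    """Interpret the low 24 bits of value as a two's-complement integer."""
    value &= 0xFFFFFF
    return value - (1 << 24) if value & (1 << 23) else value


def transform(address: int, word: int) -> CacheEntry:
    """Transformation applied between main memory and the cache."""
    opcode = (word >> 24) & 0xFF
    if opcode != BRANCH_OPCODE:
        return CacheEntry(raw=word, is_branch=False)
    displacement = sign_extend_24(word)
    return CacheEntry(raw=word, is_branch=True,
                      target=address + displacement * WORD_BYTES)


class InstructionCache:
    def __init__(self, main_memory: Dict[int, int]) -> None:
        self.main_memory = main_memory
        self.lines: Dict[int, CacheEntry] = {}
        self.transform_count = 0  # how many times the transformation ran

    def fetch(self, address: int) -> CacheEntry:
        entry = self.lines.get(address)
        if entry is None:
            # Miss: fetch from main memory, transform once, then store.
            entry = transform(address, self.main_memory[address])
            self.lines[address] = entry
            self.transform_count += 1
        return entry
```

In this model, repeated fetches of the same branch reuse the stored target address, which is the case the abstract identifies as benefiting most (an instruction that is executed repeatedly).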
18 Claims
1. In a computer, an improvement for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the improvement comprising:
- cache control means coupled to the processor and to the cache memory for requesting at least one unit of information from the main memory; and
- means coupled to receive information units from the main memory for transforming at least a portion of said at least one information unit to produce at least one transformed unit of information and for directing said at least one transformed unit for storage in the cache memory for potential use by the processor, wherein said transforming means comprises means identifying whether said at least one unit is a branch instruction and calculating with said transformation element a branch target address.
10. In a computer, a method for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the method comprising:
- requesting at least one unit of information from the main memory by the processor;
- transforming at least a portion of said at least one unit with a transformation element to produce at least one transformed unit of information, wherein said transforming step comprises identifying whether said at least one unit is a branch instruction; and
- calculating with said transformation element a branch target address; and
- storing said at least one transformed unit in the cache memory for potential use by the processor.
13. In a computer, a method for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the method comprising:
- requesting at least one unit of information from the main memory by the processor;
- transforming at least a portion of said at least one unit with a transformation element to produce at least one transformed unit of information, wherein said information unit is data and wherein said transforming step comprises converting format of data to produce said transformed unit; and
- storing said at least one transformed unit in the cache memory for potential use by the processor.
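The data-format variant in claim 13 can be sketched in the same way. The claim does not say which formats are involved, so the conversion below (a big-endian 32-bit two's-complement word converted to a native integer) is purely an assumed example, and the names `convert_format` and `DataCache` are hypothetical.

```python
import struct
from typing import Dict


def convert_format(raw: bytes) -> int:
    """Assumed conversion: big-endian signed 32-bit word to a Python int."""
    (value,) = struct.unpack(">i", raw)
    return value


class DataCache:
    def __init__(self, main_memory: Dict[int, bytes]) -> None:
        self.main_memory = main_memory
        self.lines: Dict[int, int] = {}
        self.conversions = 0  # how many times the format conversion ran

    def load(self, address: int) -> int:
        if address not in self.lines:
            # Miss: request the unit, convert its format, store the result.
            self.lines[address] = convert_format(self.main_memory[address])
            self.conversions += 1
        return self.lines[address]
```

The three steps of the claim (requesting, transforming, storing) map onto the miss path of `load`; later loads of the same address return the already converted value.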
14. In a computer, a method for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the method comprising:
- requesting at least one unit of information to the cache memory from the main memory;
- transforming at least a portion of said at least one unit with a transformation element to produce at least one transformed unit of information;
- thereafter requesting by the processor a minimum of a first unit of information from the cache memory to the processor and a second unit of information from the cache memory to the processor, said first unit of the information being of the type requiring no further transformation, and wherein said first unit of information and said second unit of information each comprise a separate instruction to said processor, each said instruction being executable by said processor during at least one cycle of said processor, and wherein processing of each said instruction has at least a fetch stage and an execution stage, said execution stage following said fetch stage;
- fetching by the processor said first instruction and said instruction in a first fetch stage during a first processor cycle; and
- executing by the processor said second instruction in a first execution stage during a second processor cycle, while during said second processor cycle also fetching by said processor a third instruction in a second fetch stage such that a result is produced that for at least one instruction an effective zero cycle execution time elapses as compared with an instruction which has not undergone said transforming step.
17. In a computer, a method for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the method comprising:
- requesting at least one unit of information to the cache memory from the main memory;
- transforming at least a portion of said at least one unit with a transformation element to produce at least one transformed unit of information;
- thereafter requesting by the processor a minimum of a first unit of information from the cache memory to the processor and a second unit of information from the cache memory to the processor, said first unit of the information being of the type requiring no further transformation, and wherein said first unit of information and said second unit of information each comprise a separate instruction to said processor, said second instruction being a delay instruction, each said instruction being executable by said processor during at least one cycle of said processor, and wherein processing of each said instruction has at least a fetch stage and an execution stage, said execution stage following said fetch stage;
- fetching by the processor said first instruction and said instruction in a first fetch stage during a first processor cycle;
- then executing by the processor said second instruction in a first execution stage during a second processor cycle, while during said second processor cycle also fetching by said processor a target information unit; and
- then fetching by the processor a third instruction relative to an address of said target information unit while executing said target information unit at said target address such that a result is produced that for at least one instruction an effective zero cycle execution time elapses as compared with an instruction which has not undergone said transforming step.
18. In a computer, a method for facilitating processing of an instruction in a processor having associated therewith a main memory and a cache memory, the cache memory for receiving information units stored in the main memory in order to make said information units more readily available for use by the processor, the method comprising:
- requesting at least one unit of information to the cache memory from the main memory;
- transforming at least a portion of said at least one unit with a transformation element to produce at least one transformed unit of information;
- thereafter requesting by the processor a minimum of a first unit of information from the cache memory to the processor and a second unit of information from the cache memory to the processor, said first unit of the information being of the type requiring no further transformation, and wherein said first unit of information and said second unit of information each comprise a separate instruction to said processor, said first instruction being a branch instruction having a predecoded branch target address and said second instruction being a delay instruction, each said instruction being executable by said processor during at least one cycle of said processor, and wherein processing of each said instruction has at least a fetch stage and an execution stage, said execution stage following said fetch stage;
- fetching by the processor said first instruction and said second instruction in a first fetch stage during a first processor cycle;
- then executing by the processor said second instruction in a first execution stage during a second processor cycle, while during said second processor cycle also fetching by said processor a third instruction relative to an address of said target instruction; and
- then executing by the processor said target instruction at said target address in said during said second processor cycle while during said second processor cycle fetching a fourth instruction by said processor such that a result is produced that for at least one instruction an effective zero cycle execution time elapses as compared with an instruction which has not undergone said transforming step.
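The "effective zero cycle" result recited in claims 14, 17, and 18 can be illustrated with a simple accounting model. This is a sketch under stated assumptions, not a cycle-accurate pipeline and not the patent's implementation: every access is assumed to hit the cache, every branch is an unconditional delayed branch whose target was predecoded at fill time, and the names `Instr`, `run`, and `paired_fetch` are hypothetical. When `paired_fetch` is true, a branch is fetched in the same cycle as its delay instruction and consumes no execute cycle of its own.

```python
from dataclasses import dataclass
from typing import Dict, Optional

WORD = 4  # assumed instruction size in bytes


@dataclass(frozen=True)
class Instr:
    name: str
    target: Optional[int] = None  # predecoded branch target; None if not a branch


def run(program: Dict[int, Instr], start: int, steps: int,
        paired_fetch: bool) -> int:
    """Return execute cycles spent processing `steps` instructions."""
    pc = start
    cycles = 0
    redirect: Optional[int] = None  # branch target pending behind a delay slot
    for _ in range(steps):
        instr = program[pc]
        # A pending redirect takes effect after the delay instruction.
        next_pc = redirect if redirect is not None else pc + WORD
        redirect = None
        if instr.target is not None:
            redirect = instr.target
            if not paired_fetch:
                cycles += 1  # the branch occupies its own execute cycle
            # With paired fetch the branch rides along with its delay
            # instruction and adds no execute cycle.
        else:
            cycles += 1
        pc = next_pc
    return cycles
```

For a three-instruction loop (an add, a delayed branch back to the add, and a delay-slot instruction), two trips around the loop cost six execute cycles without paired fetch and four with it in this model, so each branch contributes zero cycles.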
Specification