Loop code processor optimizations
Abstract
Loop code processor optimizations are implemented as a loop optimizer extension to a processor pipeline. The loop optimizer generates optimized code associated with code loops that include at least one zero-optimizable instruction. The loop optimizer may generate multiple versions of optimized code associated with a particular code loop, where each of the multiple versions of optimized code has a different associated condition under which the optimized code can be safely executed.
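As a rough software analogy of the idea in the abstract (the patent describes a hardware loop optimizer, not Python), the sketch below keeps several versions of one loop, each guarded by the condition under which it is safe to run. It assumes that a "zero-optimizable instruction" is one whose result becomes trivial when an input is zero, such as a multiply or an add; that reading, and every name used here (`loop_original`, `loop_a_zero`, `loop_b_zero`, `run_loop`), is hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Tuple


def loop_original(a: int, b: int, x: List[int], out: List[int]) -> None:
    """Unoptimized loop: one multiply and two adds per iteration."""
    for i in range(len(x)):
        out[i] += a * x[i] + b


def loop_a_zero(a: int, b: int, x: List[int], out: List[int]) -> None:
    """Safe only when a == 0: the multiply always yields zero, so skip it."""
    for i in range(len(x)):
        out[i] += b


def loop_b_zero(a: int, b: int, x: List[int], out: List[int]) -> None:
    """Safe only when b == 0: adding b changes nothing, so skip that add."""
    for i in range(len(x)):
        out[i] += a * x[i]


Condition = Callable[[int, int], bool]
LoopBody = Callable[[int, int, List[int], List[int]], None]

# Each optimized version is stored together with its safety condition.
OPTIMIZED_VERSIONS: List[Tuple[str, Condition, LoopBody]] = [
    ("a_is_zero", lambda a, b: a == 0, loop_a_zero),
    ("b_is_zero", lambda a, b: b == 0, loop_b_zero),
]


def run_loop(a: int, b: int, x: List[int], out: List[int]) -> str:
    """Run the first version whose condition holds; otherwise run the original.

    Returns the name of the version that was executed.
    """
    for name, condition, body in OPTIMIZED_VERSIONS:
        if condition(a, b):
            body(a, b, x, out)
            return name
    loop_original(a, b, x, out)
    return "original"
```

Integer inputs are used on purpose: with floating-point values, `0 * x` is not always zero (for example when `x` is infinite or NaN), so the shortcut in `loop_a_zero` would need an additional guard.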
17 Claims
1. A method comprising:
- detecting, within a processor, a code loop that includes one or more zero-optimizable instructions;
- generating, based on a first condition that a first input has a value of zero and is stored in a cache line that includes at least one non-zero value, a first optimized code corresponding to the code loop; and
- generating, based on a second condition that a second input has a value of zero and is clustered with other zero values in a cache line, a second optimized code corresponding to the code loop.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A device comprising:
- a processor, wherein the processor includes a loop optimizer, the loop optimizer configured to:
  - identify a code loop being processed by the processor; and
  - generate, based on a condition that a first input has a value of zero and is stored in a cache line that includes at least one non-zero value, a first optimized code corresponding to the code loop;
  - generate, based on a condition that a second input has a value of zero and is clustered with other zero values in a cache line, a second optimized code corresponding to the code loop;
- a cache system communicatively coupled to the processor, the cache system including:
  - an instruction cache for storing the code loop; and
  - a zero optimized cache for storing the first and second optimized code.
Dependent claims: 13, 14, 15
16. A processor configured to process instructions according to a processor pipeline, wherein the processor pipeline comprises:
- a stage to fetch an instruction from a memory;
- a stage to execute the instruction; and
- a loop optimizer configured to:
  - detect a code loop that includes a zero-optimizable instruction;
  - generate, based on a first condition that a first input has a value of zero and is stored in a cache line that includes at least one non-zero value, a first optimized code corresponding to the code loop; and
  - generate, based on a second condition that a second input has a value of zero and is clustered with other zero values in a cache line, a second optimized code corresponding to the code loop.
Dependent claims: 17
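The two conditions recited in the independent claims, and the split between an instruction cache and a zero optimized cache in claim 12, can be pictured with a small software model. This is only an illustrative sketch and not the claimed hardware: it assumes that "clustered with other zero values" means every value in the cache line is zero, it models a cache line as a short list of integers, and all names (`classify_input`, `CacheSystem`, `fetch`) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

FIRST_CONDITION = "zero input in a line with at least one non-zero value"
SECOND_CONDITION = "zero input clustered with other zero values"


def classify_input(line: List[int], index: int) -> Optional[str]:
    """Classify the input at line[index] against the two claimed conditions.

    Returns None when the input itself is non-zero, so neither condition holds.
    Assumption: "clustered" is modelled as the whole line being zero.
    """
    if line[index] != 0:
        return None
    if any(value != 0 for value in line):
        return FIRST_CONDITION
    return SECOND_CONDITION


class CacheSystem:
    """Toy model: original loops in one store, optimized versions in another."""

    def __init__(self) -> None:
        # Maps a loop address to the original loop code.
        self.instruction_cache: Dict[int, str] = {}
        # Maps (loop address, condition) to the matching optimized code.
        self.zero_optimized_cache: Dict[Tuple[int, Optional[str]], str] = {}

    def store_loop(self, loop_addr: int, code: str) -> None:
        self.instruction_cache[loop_addr] = code

    def store_optimized(self, loop_addr: int, condition: str, code: str) -> None:
        self.zero_optimized_cache[(loop_addr, condition)] = code

    def fetch(self, loop_addr: int, line: List[int], index: int) -> str:
        """Return optimized code if its condition holds, else the original loop."""
        condition = classify_input(line, index)
        original = self.instruction_cache[loop_addr]
        return self.zero_optimized_cache.get((loop_addr, condition), original)
```

Keying the optimized store on both the loop address and the condition lets the first and second optimized code for the same loop coexist, with the original loop acting as the fallback whenever neither condition is met.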
Specification