Reducing latency for pointer chasing loads
First Claim
1. A processor comprising:
a load-store unit (LSU);
a store queue; and
a load-store dependency predictor configured to generate a prediction as to whether a load operation is going to hit in the store queue;
wherein the processor is configured to:
determine that a younger memory operation is dependent on an older load operation;
generate, by the load-store dependency predictor, a prediction as to whether the older load operation will hit in the store queue;
in response to the prediction indicating that the older load operation is predicted to hit in the store queue, issue the younger memory operation N clock cycles subsequent to the older load operation, wherein N is a positive integer; and
in response to the prediction indicating that the older load operation is predicted to miss in the store queue, issue the younger memory operation M clock cycles subsequent to the older load operation, wherein M is a positive integer less than N.
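The issue-timing rule recited in claim 1 can be sketched as a small model. This is only an illustration of the claim language, not the patented hardware: the names `IssueDelays` and `dependent_issue_cycle` are hypothetical, and the claim fixes no concrete values for N and M beyond requiring that both are positive integers with M less than N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueDelays:
    """Hypothetical container for the two delays named in the claim."""
    hit_delay: int   # N: delay used when a store queue hit is predicted
    miss_delay: int  # M: delay used when a store queue miss is predicted

    def __post_init__(self):
        # The claim requires positive integers with M less than N.
        if not (0 < self.miss_delay < self.hit_delay):
            raise ValueError("require 0 < M < N")


def dependent_issue_cycle(older_issue_cycle: int,
                          predicted_store_queue_hit: bool,
                          delays: IssueDelays) -> int:
    """Cycle at which the younger, dependent memory operation is issued."""
    if predicted_store_queue_hit:
        delay = delays.hit_delay    # predicted hit: wait N cycles
    else:
        delay = delays.miss_delay   # predicted miss: issue early, after M cycles
    return older_issue_cycle + delay
```

With assumed values N = 4 and M = 3, an older load issued in cycle 10 lets its dependent operation issue in cycle 14 on a predicted hit and in cycle 13 on a predicted miss.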
Abstract
Systems, methods, and apparatuses for reducing the load to load/store address latency in an out-of-order processor. When a producer load is detected in the processor pipeline, the processor predicts whether the producer load is going to hit in the store queue. If the producer load is predicted not to hit in the store queue, then a dependent load or store can be issued early. The result data of the producer load is then bypassed forward from the data cache directly to the address generation unit. This result data is then used to generate an address for the dependent load or store, reducing the latency of the dependent load or store by one clock cycle.
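To make the one-cycle saving concrete for a pointer-chasing chain, where each load produces the address of the next, the following toy model accumulates issue cycles along the chain. It is a sketch under stated assumptions: the function name `chain_issue_cycles` is hypothetical, and the default delays of 4 and 3 cycles are illustrative values chosen only so that the early-issue path is one cycle shorter, as the abstract describes. It does not model misprediction recovery.

```python
def chain_issue_cycles(predicted_hits, hit_delay=4, miss_delay=3):
    """Issue cycle of every operation in a chain of dependent loads.

    predicted_hits[i] is the store queue prediction for producer load i;
    operation i + 1 depends on load i. The delay values are assumptions,
    not figures from the source.
    """
    cycles = [0]  # the first producer load issues in cycle 0
    for hit in predicted_hits:
        # A predicted miss allows early issue, with the result bypassed
        # from the data cache to address generation.
        delay = hit_delay if hit else miss_delay
        cycles.append(cycles[-1] + delay)
    return cycles


# Three producer loads, all predicted to miss in the store queue.
early = chain_issue_cycles([False, False, False])
# The same chain with every producer predicted to hit.
late = chain_issue_cycles([True, True, True])
print(early[-1], late[-1])
```

Under these assumptions the saving grows with chain length, one cycle per producer load that is predicted to miss in the store queue.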
18 Claims
8. A method comprising:
determining that a younger memory operation is dependent on an older load operation;
generating, by a load-store dependency predictor, a prediction as to whether the older load operation will hit in a store queue of a load-store unit;
in response to the prediction indicating that the older load operation is predicted to hit in the store queue, issuing the younger memory operation N clock cycles subsequent to the older load operation, wherein N is a positive integer; and
in response to the prediction indicating that the older load operation is predicted to miss in the store queue, issuing the younger memory operation M clock cycles subsequent to the older load operation, wherein M is a positive integer less than N.
14. A system comprising:
a processor; and
a memory;
a load-store dependency predictor configured to generate a prediction as to whether a load operation is going to hit in the store queue;
wherein the processor is configured to:
determine that a younger memory operation is dependent on an older load operation;
generate, by the load-store dependency predictor, a prediction as to whether the older load operation will hit in the store queue;
in response to the prediction indicating that the older load operation is predicted to hit in the store queue, issue the younger memory operation N clock cycles subsequent to the older load operation, wherein N is a positive integer; and
in response to the prediction indicating that the older load operation is predicted to miss in the store queue, issue the younger memory operation M clock cycles subsequent to the older load operation, wherein M is a positive integer less than N.
Specification