Fast invalidation scheme for caches
Abstract
A method and apparatus for single-cycle cache line invalidation within a cache memory are described. The method includes enabling memory cells within a cache state array of the cache memory. An invalid state is then written to each memory cell within the cache state array. The enabling of the memory cells occurs during a first phase of a clock cycle, while the writing of the invalid state to each memory cell occurs during a second phase of the clock cycle. Consequently, invalidation of each cache line within the cache memory occurs within a single clock cycle formed by the first phase and the second phase. Partial invalidation of the cache memory is possible by way-subdividing or set-subdividing the cache state array. One-shot (single-cycle) cache line invalidation reduces the total time required to invalidate all cache lines within the cache memory to one clock cycle. The implementation is simple, with changes to the cache array limited to those cells that store the state information of the cache lines. Since many system operations necessitate invalidation of the entire cache, one-shot invalidation improves system performance with no significant impact on die size.
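The two-phase sequence described in the abstract can be sketched as a small behavioral model. This is only an illustration, not the patented circuit: the class `StateArray`, its method names, and the one-bit `VALID`/`INVALID` encoding are assumptions introduced here, and the "single cycle" is modeled as one call that runs both phases back to back.

```python
INVALID, VALID = 0, 1  # hypothetical one-bit state encoding


class StateArray:
    """Behavioral model of a cache state array (one state cell per cache line)."""

    def __init__(self, sets, ways):
        self.state = [[VALID] * ways for _ in range(sets)]
        self.word_line = [[False] * ways for _ in range(sets)]

    def phase1_enable_all(self):
        # First clock phase: select (write enable) every state cell.
        for row in self.word_line:
            for w in range(len(row)):
                row[w] = True

    def phase2_write(self, value):
        # Second clock phase: write the value into every enabled cell,
        # then deselect the cell again.
        for s, row in enumerate(self.word_line):
            for w, enabled in enumerate(row):
                if enabled:
                    self.state[s][w] = value
                    row[w] = False

    def invalidate_all(self):
        # Both phases together make up one modeled clock cycle.
        self.phase1_enable_all()
        self.phase2_write(INVALID)
        return 1  # number of modeled clock cycles


array = StateArray(sets=4, ways=2)
cycles = array.invalidate_all()
```

The point of the model is that the cost does not depend on the number of lines: a line-by-line loop would take one cycle per line, whereas here every cell is enabled together and written together.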
-
29 Claims
-
1. A method comprising:
-
sequentially and individually performing a write back operation for each cache line within a cache memory;
delaying invalidation of each cache line within the cache memory until completion of all write backs required for the cache memory; and
performing invalidation of all cache lines within the cache memory within a single cycle, which comprises:
enabling memory cells within a cache state array of the cache memory; and
writing an invalid state to each memory cell within the cache state array of the cache memory, wherein the enabling of the memory cells within the cache state array of the cache memory occurs during a first phase of a clock cycle and the writing of the invalid state in each memory cell within the cache state array of the cache memory occurs during a second phase of the clock cycle, such that cache line invalidation of each cache line within the cache memory occurs within a single clock cycle formed by the first phase of the clock cycle and the second phase of the clock cycle.
modifying decoder logic of the cache state array, such that when an enable signal is active, each word line of each memory cell within the cache state array is selected, thereby write enabling each memory cell within the cache state array during the first phase of the clock cycle.
-
-
3. The method of claim 1, wherein the writing an invalid state to each memory cell within the cache state array further comprises:
placing a low value on a data line electrode joining each memory cell within the cache state array, such that an input to an inverter of each memory cell within the cache state array is pulled down, wherein a feedback mechanism of each pair of cross-coupled inverters of each memory cell retains the pulled down value, thereby invalidating each cache line within the cache state array within a single clock cycle without activating a word line of each memory cell within the cache state array.
-
4. The method of claim 1, wherein the cache memory is a shared cache memory between CPU requests and graphics requests, the method further comprises:
context switching from a first mode to a second mode by writing back all modified cache lines within the cache memory to a main memory prior to the enabling of memory cells within the cache state array of the cache memory.
-
5. The method of claim 1, wherein a decoder of the cache state array is way-divided, such that each set within a given way of the cache memory is invalidated during the single clock cycle.
-
6. The method of claim 1, wherein a decoder of the cache state array is set-divided, such that each way within a given set of the cache memory is invalidated during the single clock cycle.
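The ordering in claim 1 (write backs first, invalidation delayed until they finish) and the partial invalidation of claims 5 and 6 can be sketched in software. This is a minimal model under stated assumptions: the `Cache` class, the state names `INVALID`/`CLEAN`/`MODIFIED`, and the use of a dictionary as main memory are all hypothetical, and only modified lines are written back here, following the wording of claim 4.

```python
INVALID, CLEAN, MODIFIED = "I", "C", "M"  # hypothetical state names


class Cache:
    def __init__(self, sets, ways):
        self.sets, self.ways = sets, ways
        self.state = [[INVALID] * ways for _ in range(sets)]
        self.line = [[None] * ways for _ in range(sets)]  # (address, value) pairs

    def fill(self, s, w, address, value, modified=False):
        self.line[s][w] = (address, value)
        self.state[s][w] = MODIFIED if modified else CLEAN

    def write_back_all(self, memory):
        # Sequential, line-by-line write back. States are left untouched,
        # so invalidation is deferred until every write back is done.
        written = 0
        for s in range(self.sets):
            for w in range(self.ways):
                if self.state[s][w] == MODIFIED:
                    address, value = self.line[s][w]
                    memory[address] = value
                    written += 1
        return written

    def invalidate(self, way=None, set_index=None):
        # Models one flash-invalidate cycle. With no arguments the whole
        # array is cleared; `way` models a way-divided decoder (all sets of
        # one way), `set_index` a set-divided decoder (all ways of one set).
        for s in range(self.sets):
            for w in range(self.ways):
                if (way is None or w == way) and (set_index is None or s == set_index):
                    self.state[s][w] = INVALID

    def flush(self, memory):
        written = self.write_back_all(memory)
        self.invalidate()
        return written


memory = {}
cache = Cache(sets=2, ways=2)
cache.fill(0, 0, 0x100, "a", modified=True)
cache.fill(1, 1, 0x204, "b")
cache.invalidate(way=1)            # partial: only way 1 is cleared
state_after_partial = [row[:] for row in cache.state]
written = cache.flush(memory)      # write backs first, then one-shot invalidate
```

Keeping `write_back_all` and `invalidate` separate mirrors the claim: the slow part stays sequential, while the state clearing is a single step whose cost does not grow with the number of lines.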
-
7. A cache memory comprising:
-
a plurality of memory cells coupled together between two data line electrodes to form the cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
a means for invalidating each cache line within the cache memory within a single clock cycle, such that memory cells within the cache state array are enabled during a first phase of the single clock cycle and an invalid state is written to the memory cells within the cache state array during a second phase of the single clock cycle.
a pull down transistor for each memory cell within the cache state array, the pull down transistor coupled to an input of an inverter within the memory cell, such that the input to the inverter of each memory cell within the cache state array is pulled down in response to an enable signal, wherein a feedback mechanism of each pair of cross-coupled inverters of each memory cell retains the pulled down value, thereby invalidating each cache line within the cache state array within a single clock cycle without activating a word line of each memory cell within the cache state array.
-
-
9. The cache memory of claim 7, wherein the means for invalidating each cache line within the cache memory within the single clock cycle comprises:
an OR-gate for each memory cell within the cache state array, the OR-gate having a first input coupled to an enable signal line, a second input coupled to an array select line and an output coupled to a word line of the memory cell, such that when the enable signal line is active, each word line of each memory cell within the cache state array is selected, thereby write enabling each memory cell within the cache state array during the first phase of the clock cycle and writing an invalid state to each memory cell within the cache state array during the second phase of the clock cycle, thereby invalidating each cache line within the cache memory within the single clock cycle.
-
10. The cache memory of claim 7, wherein the cache memory is a shared cache memory between CPU requests and graphics engine requests.
-
11. The cache memory of claim 7, wherein the decoder of the cache state array is way-divided, such that each set within a given way of the cache memory is invalidated during the single clock cycle.
-
12. The cache memory of claim 7, wherein the decoder of the cache state array is set-divided, such that each way within a given set of the cache memory is invalidated during the single clock cycle.
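The two invalidation means recited above (an OR-gate on each word line, or a pull down transistor on each cell) can be contrasted with a small logic-level sketch. It is a simplification, not a circuit simulation: the function `word_line`, the class `Cell`, and the convention that a stored 0 means "invalid" are assumptions made for illustration.

```python
def word_line(array_select, invalidate_enable):
    # OR-gate variant: the word line is active when the decoder selects the
    # cell OR when the global invalidate enable is asserted.
    return array_select or invalidate_enable


class Cell:
    """Cross-coupled inverter pair modeled as one stored bit and its complement."""

    def __init__(self, bit=1):
        self.q = bit

    @property
    def q_bar(self):
        return 1 - self.q

    def write(self, word_line_active, data):
        # Normal write path: data reaches the cell only through an active word line.
        if word_line_active:
            self.q = data

    def pull_down(self, enable):
        # Pull down variant: the node is forced low without any word line;
        # the inverter feedback then holds the 0 (assumed to mean invalid).
        if enable:
            self.q = 0


# OR-gate variant: the decoder selects only cell 0, yet every word line rises
# once the enable is active, so one write of 0 reaches all cells.
select = [True, False, False, False]
normal_lines = [word_line(s, False) for s in select]
flash_lines = [word_line(s, True) for s in select]
or_cells = [Cell(1) for _ in select]
for cell, active in zip(or_cells, flash_lines):
    cell.write(active, 0)

# Pull down variant: no word line is involved at all.
pd_cell = Cell(1)
pd_cell.pull_down(True)
```

The OR-gate approach reuses the existing write path and needs both clock phases (enable, then write), whereas the pull down approach bypasses the word lines at the cost of an extra transistor per state cell.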
-
13. A cache memory comprising:
-
a plurality of memory cells coupled together between two data line electrodes to form the cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
a pull down transistor for each memory cell within the cache state array, the pull down transistor coupled to an input of an inverter within the memory cell, such that the input to the inverter of each memory cell within the cache state array is pulled down in response to an enable signal, wherein a feedback mechanism of each pair of cross-coupled inverters of each memory cell retains the pulled down value, thereby invalidating each cache line within the cache state array within a single clock cycle without activating a word line of each memory cell within the cache state array.
-
-
16. A cache memory comprising:
-
a plurality of memory cells coupled together between two data line electrodes to form the cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
an OR-gate for each memory cell within the cache state array, the OR-gate having a first input coupled to an enable signal line, a second input coupled to an array select line and an output coupled to a word line of the memory cell, such that when the enable signal line is active, each word line of each memory cell within the cache state array is selected, thereby write enabling each memory cell within the cache state array during the first phase of the clock cycle and writing an invalid state to each memory cell within the cache state array during the second phase of the clock cycle, thereby invalidating each cache line within the cache memory within the single clock cycle.
-
-
19. An integrated microprocessor system comprising:
-
a bus;
a CPU coupled to the bus;
a graphics engine coupled to the bus;
a first memory coupled to the bus including a memory controller; and
a single clock cycle invalidation cache memory coupled to the bus, the CPU and the graphics engine, the cache memory comprising:
a plurality of memory cells coupled together between two data line electrodes to form a cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
a means for invalidating each cache line within the cache memory within a single clock cycle, such that memory cells within the cache state array are enabled during a first phase of the single clock cycle and an invalid state is written to the memory cells within the cache state array during a second phase of the single clock cycle.
a pull down transistor for each memory cell within the cache state array, the pull down transistor coupled to an input of an inverter within the memory cell, such that the input to the inverter of each memory cell within the cache state array is pulled down in response to an enable signal, wherein a feedback mechanism of each pair of cross-coupled inverters of each memory cell retains the pulled down value, thereby invalidating each cache line within the cache state array within a single clock cycle without activating a word line of each memory cell within the cache state array.
-
-
21. The integrated microprocessor system of claim 19, wherein the means for invalidating each cache line within the cache memory within the single clock cycle comprises:
an OR-gate for each memory cell within the cache state array, the OR-gate having a first input coupled to an enable signal line, a second input coupled to an array select line and an output coupled to a word line of the memory cell, such that when the enable signal line is active, each word line of each memory cell within the cache state array is selected, thereby write enabling each memory cell within the cache state array during the first phase of the clock cycle and writing an invalid state to each memory cell within the cache state array during the second phase of the clock cycle, thereby invalidating each cache line within the cache memory within the single clock cycle.
-
22. The integrated microprocessor system of claim 21, wherein the decoder of the cache state array is way-divided, such that each set within a given way of the cache memory is invalidated during a single clock cycle.
-
23. The integrated microprocessor system of claim 21, wherein the decoder of the cache state array is set-divided, such that each way within a given set of the cache memory is invalidated during a single clock cycle.
-
24. An integrated microprocessor system comprising:
-
a bus;
a CPU coupled to the bus;
a graphics engine coupled to the bus;
a first memory coupled to the bus including a memory controller; and
a single clock cycle invalidation cache memory coupled to the bus, the CPU and the graphics engine, the cache memory comprising:
a plurality of memory cells coupled together between two data line electrodes to form a cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
an OR-gate for each memory cell within the cache state array, the OR-gate having a first input coupled to an enable signal line, a second input coupled to an array select line and an output coupled to a word line of the memory cell, such that when the enable signal line is active, each word line of each memory cell within the cache state array is selected, thereby write enabling each memory cell within the cache state array during the first phase of the clock cycle and writing an invalid state to each memory cell within the cache state array during the second phase of the clock cycle, thereby invalidating each cache line within the cache memory within the single clock cycle.
-
-
27. An integrated microprocessor system comprising:
-
a bus;
a CPU coupled to the bus;
a graphics engine coupled to the bus;
a first memory coupled to the bus including a memory controller; and
a single clock cycle invalidation cache memory coupled to the bus, the CPU and the graphics engine, the cache memory comprising:
a plurality of memory cells coupled together between two data line electrodes to form a cache state array, each memory cell including a pair of cross-coupled inverters coupled within the two data electrodes and a word line by a pair of pass transistors;
a cache state array decoder coupled to each word line of each memory cell within the cache state array by a select line, such that the cache state array decoder write/read enables a desired memory cell within the cache state array; and
a pull down transistor for each memory cell within the cache state array, the pull down transistor coupled to an input of an inverter within the memory cell, such that the input to the inverter of each memory cell within the cache state array is pulled down in response to an enable signal, wherein a feedback mechanism of each pair of cross-coupled inverters of each memory cell retains the pulled down value, thereby invalidating each cache line within the cache state array within a single clock cycle without activating a word line of each memory cell within the cache state array.
-
Specification