Performance-aware and reliability-aware data placement for n-level heterogeneous memory systems
First Claim
1. A method for identifying one memory unit, of a plurality of memory units, for storage of a block of data, the method comprising:
generating failure rates for the plurality of memory units by:
performing a plurality of fault simulations by performing a series of fault simulation iterations, each fault simulation iteration including simulating fault occurrences and error correction, and determining whether error correcting code is not able to correct at least one error,
determining a time-to-failure value for each fault simulation by determining the number of fault simulation iterations that occur before an error could not be corrected and an amount of time representative of each fault simulation iteration, and
determining the failure rates based on the time-to-failure values;
determining, for the block of data, a plurality of costs, each cost corresponding to a different memory unit of the plurality of memory units, based on a comparison of the determined failure rates of the memory units to a combination of hotness values that indicate frequency of access of the block of data and latencies of the memory units;
selecting a cost of the plurality of costs, the selected cost being either the highest of the plurality of costs or the lowest of the plurality of costs; and
migrating the block of data to a memory unit of the plurality of memory units that is associated with the selected cost.
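The failure-rate generation step recited in this claim can be illustrated with a small Monte Carlo sketch. It is not the patented implementation. The per-iteration fault probability, the rule that the error correcting code tolerates a fixed number of accumulated faults, and the use of the reciprocal of the mean time-to-failure as the failure rate are all assumptions made for illustration, as are the function and parameter names.

```python
import random


def simulate_time_to_failure(fault_prob, correctable_faults, hours_per_iteration,
                             rng, max_iterations=1_000_000):
    """Run one fault simulation and return its time-to-failure in hours."""
    accumulated_faults = 0
    for iteration in range(1, max_iterations + 1):
        # Simulate fault occurrence (assumed: at most one new fault per iteration).
        if rng.random() < fault_prob:
            accumulated_faults += 1
        # Simulate error correction (assumed: ECC corrects up to
        # `correctable_faults` accumulated faults and fails beyond that).
        if accumulated_faults > correctable_faults:
            # Time-to-failure = iteration count x time represented by one iteration.
            # Counting the failing iteration itself is a choice made in this sketch.
            return iteration * hours_per_iteration
    # No uncorrectable error within the budget: report the simulated horizon.
    return max_iterations * hours_per_iteration


def estimate_failure_rate(num_simulations, fault_prob, correctable_faults,
                          hours_per_iteration, seed=0):
    """Estimate failures per hour as 1 / mean time-to-failure (an assumption)."""
    rng = random.Random(seed)
    ttfs = [
        simulate_time_to_failure(fault_prob, correctable_faults,
                                 hours_per_iteration, rng)
        for _ in range(num_simulations)
    ]
    return len(ttfs) / sum(ttfs)


if __name__ == "__main__":
    # Hypothetical memory units with made-up per-iteration fault probabilities.
    for name, prob in {"unit_a": 0.01, "unit_b": 0.001}.items():
        rate = estimate_failure_rate(200, prob, correctable_faults=1,
                                     hours_per_iteration=24.0)
        print(name, rate)
```

Each memory unit would get its own estimate, and those per-unit rates are the inputs that the cost determination step compares against hotness and latency.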
Abstract
Techniques for selecting one of a plurality of heterogeneous memory units for placement of blocks of data (e.g., memory pages), based on both reliability and performance, are disclosed. A “cost” for each data block/memory unit combination is determined, based on the frequency of access of the data block, the latency of the memory unit, and, optionally, an architectural vulnerability factor (which represents the level of exposure of a particular memory data value to memory faults such as bit flips). A memory unit is selected for the data block for which the determined cost is the lowest, out of all memory units considered, and the data block is placed into that memory unit.
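As a rough illustration of the selection described in the abstract, the sketch below computes one cost per memory unit for a given data block and picks the unit with the lowest cost. The abstract names the inputs (access frequency, latency, an optional architectural vulnerability factor) but not a formula, so the additive cost expression, the `reliability_weight` parameter, and all class and function names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryUnit:
    name: str
    latency_ns: float    # access latency of the unit
    failure_rate: float  # e.g. an estimated failure rate for the unit


def placement_cost(hotness, unit, avf=0.0, reliability_weight=1.0):
    """Hypothetical cost: performance term plus an optional reliability term."""
    # Hot blocks are expensive to keep in slow memory.
    performance_term = hotness * unit.latency_ns
    # Vulnerable blocks (high AVF) are expensive to keep in unreliable memory.
    reliability_term = reliability_weight * avf * unit.failure_rate
    return performance_term + reliability_term


def select_unit(hotness, units, avf=0.0, reliability_weight=1.0):
    """Return the memory unit with the lowest cost for this data block."""
    return min(
        units,
        key=lambda u: placement_cost(hotness, u, avf, reliability_weight),
    )


if __name__ == "__main__":
    # Made-up units: one fast but less reliable, one slow but more reliable.
    units = [
        MemoryUnit("fast_unreliable", latency_ns=50.0, failure_rate=10.0),
        MemoryUnit("slow_reliable", latency_ns=100.0, failure_rate=1.0),
    ]
    print(select_unit(hotness=100, units=units).name)
    print(select_unit(hotness=1, units=units, avf=1.0,
                      reliability_weight=100.0).name)
```

With these made-up numbers, a frequently accessed block with no vulnerability term lands in the fast unit, while a rarely accessed but vulnerable block lands in the more reliable one.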
16 Claims
1. A method for identifying one memory unit, of a plurality of memory units, for storage of a block of data, the method comprising:
generating failure rates for the plurality of memory units by:
performing a plurality of fault simulations by performing a series of fault simulation iterations, each fault simulation iteration including simulating fault occurrences and error correction, and determining whether error correcting code is not able to correct at least one error,
determining a time-to-failure value for each fault simulation by determining the number of fault simulation iterations that occur before an error could not be corrected and an amount of time representative of each fault simulation iteration, and
determining the failure rates based on the time-to-failure values;
determining, for the block of data, a plurality of costs, each cost corresponding to a different memory unit of the plurality of memory units, based on a comparison of the determined failure rates of the memory units to a combination of hotness values that indicate frequency of access of the block of data and latencies of the memory units;
selecting a cost of the plurality of costs, the selected cost being either the highest of the plurality of costs or the lowest of the plurality of costs; and
migrating the block of data to a memory unit of the plurality of memory units that is associated with the selected cost.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer system for identifying one memory unit, of a plurality of memory units, for storage of a block of data, the computer system comprising:
a processing unit;
a plurality of memory units coupled to the processing unit;
a failure-in-time rate logger configured to generate failure rates for the plurality of memory units by:
performing a plurality of fault simulations by performing a series of fault simulation iterations, each fault simulation iteration including simulating fault occurrences and error correction, and determining whether error correcting code is not able to correct at least one error,
determining a time-to-failure value for each fault simulation by determining the number of fault simulation iterations that occur before an error could not be corrected and an amount of time representative of each fault simulation iteration, and
determining the failure rates based on the time-to-failure values;
a page placement module configured to:
determine, for the block of data, a plurality of costs, each cost corresponding to a different memory unit of the plurality of memory units, based on a comparison of the determined failure rates of the memory units to a combination of hotness values that indicate frequency of access of the block of data and latencies of the memory units,
select a cost, of the plurality of costs, the selected cost being either the highest of the plurality of costs or the lowest of the plurality of costs, and
migrate the block of data to a memory unit of the plurality of memory units that is associated with the selected cost.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to identify one memory unit, of a plurality of memory units, for storage of a block of data by performing a method comprising:
generating failure rates for the plurality of memory units by:
performing a plurality of fault simulations by performing a series of fault simulation iterations, each fault simulation iteration including simulating fault occurrences and error correction, and determining whether error correcting code is not able to correct at least one error,
determining a time-to-failure value for each fault simulation by determining the number of fault simulation iterations that occur before an error could not be corrected and an amount of time representative of each fault simulation iteration, and
determining the failure rates based on the time-to-failure values;
determining, for the block of data, a plurality of costs, each cost corresponding to a different memory unit of the plurality of memory units, wherein each determined cost is based on a tradeoff between reliability of a corresponding memory unit of the plurality of memory units and performance of the corresponding memory unit based on a comparison of the determined failure rates of the memory units to a combination of hotness values that indicate frequency of access of the block of data and latencies of the memory units;
selecting a cost of the plurality of costs, the selected cost being either the highest of the plurality of costs or the lowest of the plurality of costs; and
migrating the block of data to a memory unit of the plurality of memory units that is associated with the selected cost.
Dependent claim: 16
Specification