Memory interleave for heterogeneous computing

US 8,443,147 B2
Filed: 12/05/2011
Issued: 05/14/2013
Est. Priority Date: 08/05/2008
Status: Active Grant

Abstract

A memory interleave system for providing memory interleave for a heterogeneous computing system is provided. The memory interleave system effectively interleaves memory that is accessed by heterogeneous compute elements in different ways, such as via cache-block accesses by certain compute elements and via non-cache-block accesses by certain other compute elements. The heterogeneous computing system may comprise one or more cache-block oriented compute elements and one or more non-cache-block oriented compute elements that share access to a common main memory. The cache-block oriented compute elements access the memory via cache-block accesses (e.g., 64 bytes, per access), while the non-cache-block oriented compute elements access memory via sub-cache-block accesses (e.g., 8 bytes, per access). A memory interleave system is provided to optimize the interleaving across the system's memory banks to minimize hot spots resulting from the cache-block oriented and non-cache-block oriented accesses of the heterogeneous computing system.
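
To illustrate what a "hot spot" means here, the following is a minimal sketch, not taken from the patent, that counts how many accesses land on each bank under simple modulo interleaving. The workload (1024 word accesses with a stride of 32 words), the function name `bank_histogram`, and the comparison of 32 banks against 31 banks are all assumptions chosen for illustration; the 8-byte and 64-byte sizes come from the abstract's examples, and the use of a prime bank count follows the claims.

```python
from collections import Counter

CACHE_LINE = 64  # bytes per cache-block access (abstract's example)
WORD = 8         # bytes per sub-cache-block access (abstract's example)

def bank_histogram(addresses, num_banks, granule):
    """Count accesses per bank when bank = (addr // granule) % num_banks."""
    return Counter((addr // granule) % num_banks for addr in addresses)

# Hypothetical workload: 1024 word-sized accesses, stride of 32 words.
stride_addresses = [i * 32 * WORD for i in range(1024)]

pow2 = bank_histogram(stride_addresses, num_banks=32, granule=WORD)
prime = bank_histogram(stride_addresses, num_banks=31, granule=WORD)

# (banks touched, heaviest load on any one bank)
print(len(pow2), max(pow2.values()))    # 1 1024
print(len(prime), max(prime.values()))  # 31 34
```

With 32 banks, every access in this workload falls on the same bank, while 31 banks spread the same accesses almost evenly. This is only one stride; other strides behave differently, and the sketch says nothing about the hardware cost of a modulo-31 computation.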

28 Claims

1. A method for performing memory interleaving comprising:
- performing, in a first level of a two-level interleaving scheme, interleaving across full cache lines of a memory;
  
  performing, in a second level of the two-level interleaving scheme, interleaving across sub-cache lines of the memory;
  
  using a prime number of groups of banks for the first level of the two-level interleaving scheme; and
  
  using a prime number of banks within each of said groups of banks for the second level of the two-level interleaving scheme;
  
  wherein said memory interleaving is performed for a system comprising a host processor having a fixed instruction set that defines instructions that the host processor can execute; and
  
  a reconfigurable co-processor comprising reconfigurable logic that is reconfigurable to have any one of a plurality of predefined extended instruction sets for extending the fixed instruction set of the host processor for processing instructions of an executable file, each of said plurality of predefined extended instruction sets defining a plurality of instructions that the reconfigurable co-processor can execute, wherein said plurality of instructions comprise extended instructions that are not natively defined by the fixed instruction set of the host processor.
- Dependent claims (2, 3):
  - 2. The method of claim 1 wherein said sub-cache lines comprise words within cache lines.
  - 3. The method of claim 1 further comprising:
    - using 31 groups of banks for the first level of the two-level interleaving scheme; and
      
      using 31 banks within each of said 31 groups of banks for the second level of the two-level interleaving scheme.
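
Claims 1 to 3 can be pictured with a short sketch. The claims fix the counts (31 groups, 31 banks per group) but do not state the mapping function, so the modulo arithmetic below is an assumption, as are the 64-byte cache line, the 8-byte word, and the name `two_level_interleave`.

```python
from typing import Tuple

NUM_GROUPS = 31        # first level: groups of banks (claim 3)
BANKS_PER_GROUP = 31   # second level: banks within each group (claim 3)
CACHE_LINE_BYTES = 64  # assumed cache-line size
WORD_BYTES = 8         # assumed sub-cache-line (word) size

def two_level_interleave(addr: int) -> Tuple[int, int]:
    """Map a byte address to (group, bank within group).

    Assumed mapping: the cache-line index selects the group (first level),
    and the word index selects the bank inside that group (second level).
    """
    group = (addr // CACHE_LINE_BYTES) % NUM_GROUPS
    bank = (addr // WORD_BYTES) % BANKS_PER_GROUP
    return group, bank

# The eight words of the cache line starting at byte address 4096.
line = [two_level_interleave(4096 + i * WORD_BYTES) for i in range(8)]
print(line)
```

Under this assumed mapping, the eight words of one cache line share a group and occupy eight consecutive banks in it, while the next cache line moves to the next group.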

4. A system comprising:
- a memory;
  
  a plurality of memory controllers for said memory;
  
  a first compute element that issues physical addresses for cache-block oriented access requests to said memory;
  
  a second compute element that issues virtual addresses for sub-cache-block oriented access requests to said memory, wherein said first and second compute elements share a common physical and virtual address space of the memory;
  
  a memory interleave system that receives the physical address for the cache-block oriented access requests issued by the first compute element and receives the virtual addresses for the sub-cache-block oriented access requests issued by the second compute element, and said memory interleave system determines, for each of the received cache-block oriented and sub-cache-block oriented access requests, one of the plurality of memory controllers to direct the access request for interleaving the cache-block oriented and sub-cache-block oriented access requests.
- Dependent claims (5, 6, 7, 8):
  - 5. The system of claim 4 wherein the first compute element comprises a host processor, and wherein the second compute element comprises a co-processor.
  - 6. The system of claim 5 wherein the host processor comprises a first instruction set, and wherein said co-processor comprises an extended instruction set for extending the instruction set of the host processor.
  - 7. The system of claim 6 wherein the co-processor is reconfigurable to possess any of a plurality of predefined extended instruction sets.
  - 8. The system of claim 7 wherein the co-processor comprises a field-programmable gate array (FPGA).
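
One way a single interleave function could accept both physical and virtual addresses, as claim 4 recites, is to use only address bits that translation leaves unchanged. The sketch below assumes exactly that: the controller is chosen from page-offset bits, which are identical in a virtual address and its physical translation. The page size, controller count, page-table contents, and function names are hypothetical and are not taken from the patent.

```python
PAGE_BYTES = 4 * 1024 * 1024  # assumed page size
NUM_CONTROLLERS = 8           # assumed number of memory controllers
CACHE_LINE_BYTES = 64         # assumed cache-block size

def select_controller(addr: int) -> int:
    """Pick a controller using page-offset bits only (assumed policy)."""
    offset = addr % PAGE_BYTES
    return (offset // CACHE_LINE_BYTES) % NUM_CONTROLLERS

def translate(vaddr: int, page_table: dict) -> int:
    """Toy translation: swap the page number, keep the page offset."""
    vpage, offset = divmod(vaddr, PAGE_BYTES)
    return page_table[vpage] * PAGE_BYTES + offset

page_table = {5: 42}                 # hypothetical virtual page -> physical page
vaddr = 5 * PAGE_BYTES + 0x1238      # address issued by the second compute element
paddr = translate(vaddr, page_table) # address the first compute element would issue

# Both forms of the address select the same controller.
print(select_controller(vaddr), select_controller(paddr))
```

The design point is that the two compute elements never need to agree on translation before routing, only on the offset bits.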

9. A system comprising:
- non-sequential access memory;
  
  a cache-access path in which cache-block data is communicated between said non-sequential access memory and a cache memory; and
  
  a direct-access path in which sub-cache-block data is communicated to/from said non-sequential access memory; and
  
  a memory interleave system for interleaving accesses to said non-sequential access memory via the cache-access path and the direct-access path to minimize hot spots within said non-sequential access memory;
  
  wherein said memory interleave system receives a physical address for a cache-block memory access request via the cache-access path, and wherein said memory interleave system receives a virtual address for a sub-cache-block memory access request via the direct-access path; and
  
  wherein the memory interleave system determines said interleaving using the received physical address for the cache-block memory access request and the received virtual address for the sub-cache-block memory access request without requiring the virtual address to first be translated into a physical address.
- Dependent claims (10, 11, 12, 13, 14, 15, 16, 17):
  - 10. The system of claim 9 further comprising:
    - a host processor having a fixed instruction set that defines instructions that the host processor can execute;
      
      a reconfigurable co-processor comprising reconfigurable logic that is reconfigurable to have any one of a plurality of predefined extended instruction sets for extending the fixed instruction set of the host processor for processing instructions of an executable file, each of said plurality of predefined extended instruction sets defining a plurality of instructions that the reconfigurable co-processor can execute, wherein said plurality of instructions comprise extended instructions that are not natively defined by the fixed instruction set of the host processor; and
      
      said cache memory.
  - 11. The system of claim 10 wherein in said direct-access path said sub-cache-block data is fetched from said non-sequential access memory to said co-processor.
  - 12. The system of claim 10 wherein in said direct-access path said sub-cache-block data is stored to said non-sequential access memory from said co-processor.
  - 13. The system of claim 10 wherein said host processor accesses cache-block data from said non-sequential access memory via said cache-access path;
    - and wherein said co-processor is operable to access said sub-cache-block data from said non-sequential access memory via said direct-access path.
  - 14. The system of claim 10 wherein said co-processor comprises a field-programmable gate array (FPGA).
  - 15. The system of claim 9 wherein said non-sequential access memory comprises:
    - a scatter/gather memory module.
  - 16. The system of claim 9 further comprising:
    - a plurality of memory controllers for said non-sequential access memory, wherein the memory interleave system determines, for each of the accesses to said non-sequential access memory via the cache-access path and the direct-access path, one of the memory controllers to direct the access request to minimize hot spots within said non-sequential access memory.
  - 17. The system of claim 9 wherein said memory interleave system employs a two-level hierarchical interleave scheme for said interleaving, wherein a first level interleaves across cache-block memory accesses received via said cache-access path, and a second level interleaves across sub-cache-block memory accesses received via said direct-access path.

18. A method for performing memory interleaving, said method comprising:
- receiving, by a memory interleave system, a cache-block oriented memory access request from a host processor of a system, said host processor having a fixed instruction set that defines instructions that the host processor can execute;
  
  receiving, by the memory interleave system, a sub-cache-block oriented memory access request from a co-processor of the system, said co-processor comprising reconfigurable logic that is reconfigurable to have any one of a plurality of predefined extended instruction sets for extending the fixed instruction set of the host processor for processing instructions of an executable file, each of said plurality of predefined extended instruction sets defining a plurality of instructions that the co-processor can execute, wherein said plurality of instructions comprise extended instructions that are not natively defined by the fixed instruction set of the host processor; and
  
  performing memory interleaving, by the memory interleave system, for the received cache-block oriented and sub-cache-block oriented memory access requests.
- Dependent claims (19, 20, 21, 22, 23, 24, 25, 26, 27, 28):
  - 19. The method of claim 18 wherein the cache-block oriented memory access request is a physical address request, and wherein the sub-cache-block oriented memory access request is a virtual address request.
  - 20. The method of claim 19 further comprising:
    - translating a virtual address to a physical address for said physical address request for said cache-block oriented memory access request, wherein said translating said virtual address to said physical address for said physical address request for said cache-block oriented memory access request is performed before said performing said memory interleaving.
  - 21. The method of claim 19 wherein said translating comprises:
    - translating, by said host processor, said virtual address to said physical address prior to sending said cache-block oriented memory access request to said memory interleave system.
  - 22. The method of claim 19 further comprising:
    - translating said virtual address to a physical address for said sub-cache-block oriented memory access request, wherein said translating of said virtual address to said physical address for said sub-cache-block oriented memory access request is performed after said performing said memory interleaving.
  - 23. The method of claim 22 wherein said translating of said virtual address to said physical address for said sub-cache-block oriented memory access request comprises:
    - translating, by one of a plurality of memory controllers in the system to which the memory interleave system sends the virtual address for the sub-cache-block oriented memory access request, said virtual address to said physical address for said sub-cache-block oriented memory access request.
  - 24. The method of claim 23 wherein said performing memory interleaving comprises:
    - determining, by the memory interleave system, which of the plurality of memory controllers in the system to direct the received requests.
  - 25. The method of claim 24 wherein the determining is made, at least in part, to minimize hot spots within the memory.
  - 26. The method of claim 18 wherein said host processor and said co-processor share a common physical and virtual address space of a common memory.
  - 27. The method of claim 26 wherein the received cache-block oriented request and the received sub-cache-block oriented request each request access to the common memory of the system.
  - 28. The method of claim 18 wherein said performing memory interleaving comprises:
    - employing a two-level hierarchical interleave scheme, wherein a first level interleaves across cache-block memory accesses, and a second level interleaves across sub-cache-block memory accesses.
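
The ordering described in claims 20 to 23 (translate, then interleave, for the host's cache-block requests; interleave, then translate at the chosen memory controller, for the co-processor's sub-cache-block requests) can be sketched as follows. Everything concrete here is an assumption for illustration: the page size, the controller count, the toy `PAGE_TABLE`, the offset-based `interleave` policy, and the function names.

```python
PAGE_BYTES = 4096       # assumed page size
NUM_CONTROLLERS = 8     # assumed controller count
CACHE_LINE_BYTES = 64   # assumed cache-block size

PAGE_TABLE = {3: 17}    # hypothetical virtual page -> physical page

def translate(vaddr):
    """Toy translation: replace the page number, keep the page offset."""
    vpage, offset = divmod(vaddr, PAGE_BYTES)
    return PAGE_TABLE[vpage] * PAGE_BYTES + offset

def interleave(addr):
    """Choose a controller from page-offset bits (assumed policy)."""
    return ((addr % PAGE_BYTES) // CACHE_LINE_BYTES) % NUM_CONTROLLERS

def host_access(vaddr):
    """Cache-block path: the host translates first, then the request
    is interleaved on the physical address (claims 20 and 21)."""
    paddr = translate(vaddr)
    line_addr = paddr - paddr % CACHE_LINE_BYTES  # align to the cache block
    return interleave(line_addr), line_addr

def coprocessor_access(vaddr):
    """Sub-cache-block path: interleave on the virtual address, then the
    selected controller performs the translation (claims 22 and 23)."""
    controller = interleave(vaddr)
    paddr = translate(vaddr)  # stands in for translation inside that controller
    return controller, paddr

vaddr = 3 * PAGE_BYTES + 328
print(host_access(vaddr))         # (5, 69952)
print(coprocessor_access(vaddr))  # (5, 69960)
```

Both paths reach the same controller for the same virtual address, with the host fetching the whole 64-byte block and the co-processor addressing a single word inside it.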

Current Assignee: Micron Technology, Inc.
Original Assignee: Convey Computer (Micron Technology, Inc.)
Inventors: Andrewartha, J. Michael; Brewer, Tony M.; Magee, Terrell
Primary Examiner(s): Thai, Tuan V.
Application Number: US13/311,378
Publication Number: US 20120079177A1
Time in Patent Office: 526 Days
Field of Search: 711/127, 711/100, 711/118, 711/154, 711/157
US Class Current: 711/127
CPC Class Codes: G06F 12/0607 (Interleaved addressing)
