Fault-tolerant cache coherence over a lossy network

US 10,467,139 B2
Filed: 12/29/2017
Issued: 11/05/2019
Est. Priority Date: 12/29/2017
Status: Active Grant

Abstract

A cache coherence system manages both internode and intranode cache coherence in a cluster of nodes. Each node in the cluster is either a single processor or a collection of processors running an intranode coherence protocol among themselves. A node comprises a plurality of coherence ordering units (COUs), which are hardware circuits configured to manage intranode coherence of caches within the node and/or internode coherence with caches on other nodes in the cluster. Each node contains one or more directories that track the state of cache line entries managed by that node. Each node may also contain one or more scoreboards for managing the status of ongoing transactions. The internode cache coherence protocol implemented in the COUs may be used to detect and resolve communication errors, such as dropped message packets between nodes, late message delivery at a node, or node failure. Additionally, a transport layer manages communication between the nodes in the cluster and can also be used to detect and resolve communication errors.

18 Claims

1. A method, comprising:
- storing, in a hardware unit of each node of a plurality of nodes, a respective first plurality of data records for each message sent by said each node, wherein each data record of the respective first plurality of data records comprises:
  
  a message type for said each message, a source node identifier for said each message, a destination node identifier for said each message, route information of said each message between the source node and the destination node of said each message, and a sequence number for said each message;
  
  detecting that a particular message containing a particular sequence number was not received by a first node of the plurality of nodes;
  
  in response to the detecting that the particular message was not received by the first node, sending a Nack message to a second node of the plurality of nodes, wherein the second node is the source node of the particular message, and wherein the Nack message identifies a lost sequence number and the route information for the particular message;
  
  in response to receiving the Nack message at the second node, identifying, from the respective first plurality of data records stored at the second node, a particular data record for the particular message, based on the lost sequence number and the route information for the particular message; and
  
  using the particular data record to process the particular message again.
  - 2. The method of claim 1, further comprising:
    - storing, in a hardware unit of each node of the plurality of nodes, a respective second plurality of data records for a last message received by each node, wherein each data record of the respective second plurality of data records comprises route information of the last message, a source identifier for the last message, and the sequence number of the message;
      
      wherein detecting that a particular message containing a particular sequence number was not received by the first node of the plurality of nodes comprises:
      
      receiving a first message from the source node of the particular message, comprising a first sequence number;
      
      comparing the first sequence number with a second sequence number stored in the respective second plurality of data records for the last message received by the first node from the source node of the particular message; and
      
      based on the comparison of the first sequence number and the second sequence number, determining that the particular message containing the particular sequence number was not received from the source node of the particular message.
  - 3. The method of claim 1, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a Request message; and
      
      in response to determining that the message type of the particular message is a Request message, sending a replay message to a node of the plurality of nodes that holds the particular data record.
  - 4. The method of claim 1, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a Data message;
      
      in response to determining that the message type of the particular message is a Data message, sending a Replay message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a Data message, sending a Nack message to the node identified by the destination node identifier for the particular data record.
  - 5. The method of claim 1, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a DataW message; and
      
      in response to determining that the message type of the particular message is a DataW message, sending a Nack message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a DataW message, sending a Replay message to the node identified by the destination node identifier for the particular data record.
  - 6. The method of claim 1, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is one of a Ack message, Nack message, or Pull message; and
      
      in response to determining that the message type of the particular message is a Ack message, Nack message, or Pull message, sending a new message to the node identified by the destination node identifier for the particular data record with the same message type as the particular message.
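
The flow of claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented hardware design: the names `SentRecord`, `Nack`, `Sender`, and `Receiver` are hypothetical, routes and node identifiers are modelled as plain integers, and sequence numbers are assumed to be counted per route. The sender keeps one record per sent message, the receiver compares each incoming sequence number with the last one it saw from that source on that route, and a gap produces a Nack that the sender resolves back to the stored record.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SentRecord:
    """One retained record per sent message (fields follow claim 1)."""
    msg_type: str  # e.g. "Request", "Data", "DataW", "Ack", "Nack", "Pull"
    source: int    # source node identifier
    dest: int      # destination node identifier
    route: int     # route information (assumed here to be an integer id)
    seq: int       # sequence number


@dataclass(frozen=True)
class Nack:
    """Identifies the lost sequence number and the route it was sent on."""
    to_node: int
    route: int
    lost_seq: int


class Sender:
    """Hypothetical source node: logs every message it sends."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self._next_seq: Dict[int, int] = {}                # route -> next seq (assumption)
        self._log: Dict[Tuple[int, int], SentRecord] = {}  # (route, seq) -> record

    def send(self, msg_type: str, dest: int, route: int) -> SentRecord:
        seq = self._next_seq.get(route, 0)
        self._next_seq[route] = seq + 1
        rec = SentRecord(msg_type, self.node_id, dest, route, seq)
        self._log[(route, seq)] = rec
        return rec

    def on_nack(self, nack: Nack) -> Optional[SentRecord]:
        # Look the record up by lost sequence number and route, as in claim 1.
        return self._log.get((nack.route, nack.lost_seq))


class Receiver:
    """Hypothetical destination node: remembers the last message per (source, route)."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self._last_seq: Dict[Tuple[int, int], int] = {}

    def receive(self, rec: SentRecord) -> List[Nack]:
        key = (rec.source, rec.route)
        expected = self._last_seq.get(key, -1) + 1
        # Every sequence number skipped between the last one seen and this one is lost.
        nacks = [Nack(rec.source, rec.route, s) for s in range(expected, rec.seq)]
        if rec.seq >= expected:
            self._last_seq[key] = rec.seq
        return nacks


node_a, node_b = Sender(node_id=1), Receiver(node_id=2)
m0 = node_a.send("Request", dest=2, route=0)
m1 = node_a.send("Data", dest=2, route=0)   # pretend this one is dropped
m2 = node_a.send("Ack", dest=2, route=0)
node_b.receive(m0)
nacks = node_b.receive(m2)                  # gap detected: sequence number 1 is missing
recovered = node_a.on_nack(nacks[0])        # sender finds the record for the lost message
```

The sketch covers detection and lookup only. What the source node then does with the recovered record depends on its message type, as claims 3 through 6 describe.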

7. A computer system, comprising:
- a plurality of nodes, wherein each node of the plurality of nodes comprises one or more hardware units, wherein each hardware unit of the one or more hardware units comprises one or more processors, registers, content-addressable memories, and/or other computer-implemented hardware circuitry;
  
  wherein each hardware unit of the one or more hardware units is coupled to a particular memory and a particular cache and each particular hardware unit of the one or more hardware units is configured as a cache controller of the particular memory and the particular cache;
  
  each node of the plurality of nodes is configured to:
  
  store, in a first hardware unit of the node, a respective first plurality of data records for each message sent by said each node, wherein each data record of the respective first plurality of data records comprises:
  
  a message type for said each message, a source node identifier for said each message, a destination node identifier for said each message, route information of said each message between the source node and the destination node, and a sequence number for said each message;
  
  a first node of the plurality of nodes configured to:
  
  detect that a particular message containing a particular sequence number was not received by the first node;
  
  in response to the detecting that the particular message was not received by the first node, send a Nack message to a second node of the plurality of nodes, wherein the second node is the source node of the particular message, and wherein the Nack message identifies a lost sequence number and the route information for the particular message;
  
  the second node of the plurality of nodes configured to:
  
  in response to receiving the Nack message at the second node, identify from the respective first plurality of data records stored at the second node, a particular data record for the particular message, based on the lost sequence number and the route information for the particular message; and
  
  use the particular data record to process the particular message again.
  - 8. The computer system of claim 7, further comprising:
    - each node of the plurality of nodes is configured to:
      
      store, in a hardware unit of each node of the plurality of nodes, a respective second plurality of data records for last message received by said each node, wherein each data record of the respective second plurality of data records comprises route information of the last message, a source identifier for the last message, and the sequence number of the last message;
      
      wherein, the first node is configured to detect that a particular message containing a particular sequence number was not received by the first node of the plurality of nodes by performing steps, comprises the first node being configured to:
      
      receive a first message from the source node of the particular message comprising a first sequence number;
      
      compare the first sequence number with a second sequence number stored in the respective second plurality of data records for the last message received by the first node from the source node of the particular message; and
      
      based on the comparison of the first sequence number and the second sequence number, determine that the particular message containing the particular sequence number was not received from the source node of the particular message.
  - 9. The computer system of claim 7, wherein the second node of the plurality of nodes being configured to use the particular data record to process the particular message again further comprises the second node being configured to:
    - determine that the message type of the particular message is a Request message; and
      
      in response to determining that the message type of the particular message is a Request message, send a replay message to a node of the plurality of nodes that holds the particular data record.
  - 10. The computer system of claim 7, wherein the second node of the plurality of nodes being configured to use the particular data record to process the particular message again further comprises the second node being configured to:
    - determine that the message type of the particular message is a Data message;
      
      in response to determining that the message type of the particular message is a Data message, send a Replay message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a Data message, send a Nack message to the node identified by the destination node identifier for the particular data record.
  - 11. The computer system of claim 7, wherein the second node of the plurality of nodes being configured to use the particular data record to process the particular message again further comprises the second node being configured to:
    - determine that the message type of the particular message is a DataW message; and
      
      in response to determining that the message type of the particular message is a DataW message, send a Nack message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a DataW message, send a Replay message to the node identified by the destination node identifier for the particular data record.
  - 12. The computer system of claim 7, wherein the second node of the plurality of nodes being configured to use the particular data record to process the particular message again further comprises the second node being configured to:
    - determine that the message type of the particular message is one of a Ack message, Nack message, or Pull message; and
      
      in response to determining that the message type of the particular message is a Ack message, Nack message, or Pull message, send a new message to the node identified by the destination node identifier for the particular data record with the same message type as the particular message.
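
Claims 9 through 12 (and the parallel method claims 3 through 6) make the recovery step depend on the message type stored in the record. A minimal sketch of that dispatch follows. The names `Action` and `recovery_actions` are hypothetical, and `holder` stands for "a node of the plurality of nodes that holds the particular data record", which the claims do not identify further, so it is passed in as a parameter here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SentRecord:
    """Record kept for a sent message (fields follow claim 7)."""
    msg_type: str
    source: int
    dest: int
    route: int
    seq: int


@dataclass(frozen=True)
class Action:
    """A message to emit during recovery: its type and target node."""
    msg_type: str
    to_node: int


def recovery_actions(rec: SentRecord, holder: int) -> List[Action]:
    """Map a lost message's type to the messages sent to process it again."""
    t = rec.msg_type
    if t == "Request":                      # claim 9
        return [Action("Replay", holder)]
    if t == "Data":                         # claim 10
        return [Action("Replay", holder), Action("Nack", rec.dest)]
    if t == "DataW":                        # claim 11
        return [Action("Nack", holder), Action("Replay", rec.dest)]
    if t in ("Ack", "Nack", "Pull"):        # claim 12: resend the same type
        return [Action(t, rec.dest)]
    raise ValueError(f"unknown message type: {t!r}")


lost = SentRecord("DataW", source=1, dest=2, route=0, seq=7)
actions = recovery_actions(lost, holder=1)
```

Note the asymmetry between Data and DataW: the Replay and Nack targets are swapped between the record holder and the destination node.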

13. One or more non-transitory computer-readable storage media storing instructions, which when executed by one or more processors, cause:
- storing, in a hardware unit of each node of a plurality of nodes, a respective first plurality of data records for each message sent by said each node, wherein each data record of the respective first plurality of data records comprises:
  
  a message type for said each message, a source node identifier for said each message, a destination node identifier for said each message, route information of said each message between the source node and the destination node of said each message, and a sequence number for said each message;
  
  detecting that a particular message containing a particular sequence number was not received by a first node of the plurality of nodes;
  
  in response to the detecting that the particular message was not received by the first node, sending a Nack message to a second node of the plurality of nodes, wherein the second node is the source node of the particular message, and wherein the Nack message identifies a lost sequence number and the route information for the particular message;
  
  in response to receiving the Nack message at the second node, identifying, from the respective first plurality of data records stored at the second node, a particular data record for the particular message, based on the lost sequence number and the route information for the particular message; and
  
  using the particular data record to process the particular message again.
  - 14. The one or more non-transitory computer-readable storage media of claim 13, further comprising instructions, which when executed by the one or more processors, cause:
    - storing, in a hardware unit of each node of the plurality of nodes, a respective second plurality of data records for a last message received by said each node, wherein each data record of the respective second plurality of data records comprises route information of the last message, a source identifier for the last message, and the sequence number of the last message;
      
      wherein detecting that a particular message containing a particular sequence number was not received by the first node of the plurality of nodes comprises:
      
      receiving a first message from the source node of the particular message comprising a first sequence number;
      
      comparing the first sequence number with a second sequence number stored in the respective second plurality of data records for the last message received by the first node from the source node of the particular message; and
      
      based on the comparison of the first sequence number and the second sequence number, determining that the particular message containing the particular sequence number was not received from the source node of the particular message.
  - 15. The one or more non-transitory computer-readable storage media of claim 13, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a Request message; and
      
      in response to determining that the message type of the particular message is a Request message, sending a replay message to a node of the plurality of nodes that holds the particular data record.
  - 16. The one or more non-transitory computer-readable storage media of claim 13, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a Data message;
      
      in response to determining that the message type of the particular message is a Data message, sending a Replay message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a Data message, sending a Nack message to the node identified by the destination node identifier for the particular data record.
  - 17. The one or more non-transitory computer-readable storage media of claim 13, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is a DataW message; and
      
      in response to determining that the message type of the particular message is a DataW message, sending a Nack message to a node of the plurality of nodes that holds the particular data record; and
      
      in response to determining that the message type of the particular message is a DataW message, sending a Replay message to the node identified by the destination node identifier for the particular data record.
  - 18. The one or more non-transitory computer-readable storage media of claim 13, wherein using the particular data record to process the particular message again comprises:
    - determining that the message type of the particular message is one of a Ack message, Nack message, or Pull message; and
      
      in response to determining that the message type of the particular message is a Ack message, Nack message, or Pull message, sending a new message to the node identified by the destination node identifier for the particular data record with the same message type as the particular message.

Current Assignee: Oracle International Corporation (Oracle Corporation)
Original Assignee: Oracle International Corporation (Oracle Corporation)
Inventors: Loewenstein, Paul N.; Walker, Damien; Mitra, Priyambada; Vahidsafa, Ali; Cohen, Matthew; Ebergen, Josephus; Brock, Andrew
Primary Examiner(s): Chase, Shelly A
Application Number: US15/859,037
Publication Number: US 20190207714A1
Time in Patent Office: 676 Days

CPC Class Codes
G06F 12/0813   with a network or matrix co...
G06F 12/0828   with concurrent directory a...
G06F 12/0842   for multiprocessing or mult...
G06F 2212/1032   Reliability improvement, da...
G06F 2212/154   Networked environment
G06F 2212/60   Details of cache memory
G06F 2212/62   Details of cache specific t...
H04L 1/08   by repeating transmission, ...
H04L 2001/0097   Relays
