Leverage offload programming model for local checkpoints
1 Assignment
0 Petitions
Abstract
Methods, apparatus, and systems for leveraging an offload programming model for local checkpoints. Compute entities in a computing environment are implemented as one or more sources and a larger number of sinks. A job dispatcher dispatches jobs comprising executable code to the source(s), and the execution of the job code is managed by the source(s). Code sections in the job code designated for offload are offloaded to the sinks by creating offload context information. In conjunction with each offload, an offload object is generated and written to storage. The offloaded code sections are executed by the sinks, which return result data to the source, e.g., via a direct write to a memory buffer specified in the offload context information. The health of the sinks is monitored to detect failures, and upon a failure the source retrieves the offload object corresponding to the code section offloaded to the failed sink, regenerates the offload context information for the code section and sends this to another sink for execution.
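The failure-recovery flow the abstract describes can be sketched in a few lines of Python. This is a hedged illustration only: the function and file names are invented stand-ins, and a raised exception plus a JSON file stand in for the patent's sink health monitoring and offload-object storage.

```python
import json
import os

def run_section(section_id, payload):
    """Stand-in for a sink executing an offloaded code section."""
    if payload.get("fail"):
        raise RuntimeError("sink failed while running " + section_id)
    return payload["x"] * 2

def offload_with_checkpoint(sections, sinks, store_dir):
    """Offload each section, checkpointing its offload context to storage;
    on a sink failure, retrieve the context and re-offload to another sink."""
    results = {}
    for sec_id, payload in sections.items():
        # Offload object: identifies the code section and the target sink.
        ctx = {"section": sec_id, "sink": sinks[0], "payload": payload}
        path = os.path.join(store_dir, sec_id + ".json")
        with open(path, "w") as f:
            json.dump(ctx, f)  # local checkpoint of the offload
        try:
            results[sec_id] = run_section(sec_id, payload)
        except RuntimeError:
            # Failure detected: retrieve the stored offload object,
            # retarget it at another sink, and re-execute.
            with open(path) as f:
                ctx = json.load(f)
            ctx["sink"] = sinks[1]
            retry_payload = dict(ctx["payload"], fail=False)  # simulate a healthy sink
            results[sec_id] = run_section(sec_id, retry_payload)
    return results
```

Here the retried section simply succeeds; in the patent's scheme the re-offload is driven by the stored offload context rather than by clearing a flag.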
17 Citations
25 Claims
1. A method implemented in a computing environment including a compute entity comprising a source communicatively coupled to a plurality of compute entities comprising sinks, the method comprising:
managing, using the source, execution of a job comprising executable code;
employing the source to execute the job;
detecting, during execution of the job, sections of the executable code to be offloaded to sinks, each comprising a respective code section including one or more functions to be offloaded to a sink;
constructing, for each code section to be offloaded to a sink, offload context information identifying one of the code section or indicia identifying the one or more functions, and information identifying the sink;
offloading the code sections to the plurality of sinks;
storing, for each code section that is offloaded to a sink, the offload context information constructed for that code section;
receiving, for offloaded code sections, results generated by the sinks to which the code sections were offloaded;
detecting that a sink has failed to successfully execute a code section that was offloaded to the sink, and in response thereto, retrieving the offload context information corresponding to the code section offloaded to the sink; and
offloading the code section to another sink for execution.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
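Claim 1's "offload context information" identifies either the code section itself or indicia of its offloaded functions, plus the target sink, and maps naturally onto a small record type. The sketch below uses invented names, and an in-memory dict stands in for real checkpoint storage:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class OffloadContext:
    section_id: str
    function_names: list  # indicia identifying the offloaded functions
    sink_id: str          # information identifying the sink

class ContextStore:
    """In-memory stand-in for the per-section checkpoint storage."""
    def __init__(self):
        self._store = {}

    def save(self, ctx: OffloadContext):
        # Serialize so the stored context survives independently of the source.
        self._store[ctx.section_id] = json.dumps(asdict(ctx))

    def load(self, section_id: str) -> OffloadContext:
        return OffloadContext(**json.loads(self._store[section_id]))

    def reassign(self, section_id: str, new_sink: str) -> OffloadContext:
        # On sink failure: retrieve the stored context and retarget it.
        ctx = self.load(section_id)
        ctx.sink_id = new_sink
        self.save(ctx)
        return ctx
```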
12. A server platform comprising:
a host processor coupled to host memory;
a plurality of expansion slots communicatively coupled to the host processor;
one or more many integrated core (MIC) devices installed in respective expansion slots, each MIC device including a plurality of processor cores and on-board memory; and
a network adaptor, installed in an expansion slot or implemented as a component communicatively coupled to the host processor;
wherein the server platform further includes software instructions configured to be executed on the host processor and a plurality of the processor cores in the MIC device to enable the server platform to:
configure the host processor as a source and at least a portion of the plurality of processor cores in the MIC device as sinks;
configure memory mappings between the on-board MIC memory and the host memory;
manage execution of a job comprising executable code on the host processor;
employ the source to execute the job;
detect, during execution of the job, sections of the executable code to be offloaded to sinks, each comprising a respective code section including one or more functions to be offloaded to a sink;
construct, for each code section to be offloaded to a sink, offload context information identifying one of the code section or indicia identifying the one or more functions, and information identifying the sink;
offload the code sections to the plurality of sinks;
transmit for storage on a non-volatile storage device accessible via a network coupled to the network adaptor, for each code section that is offloaded to a sink, offload context information identifying the code section that is offloaded and the sink it is offloaded to;
execute the offloaded code sections on the sinks to generate result data;
store the result data in memory buffers accessible to the host processor;
detect that a sink has failed to successfully execute a code section that was offloaded to the sink, and in response thereto, retrieve the previously stored offload context information corresponding to the code section offloaded to the sink; and
offload the code section to another sink for execution.
- View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20, 21)
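Claim 12's element in which sinks "store the result data in memory buffers accessible to the host processor" can be illustrated with worker threads writing directly into a shared anonymous mmap. The threads and the mmap are stand-ins for MIC cores and the MIC-to-host memory mapping, not the patent's actual mechanism:

```python
import mmap
import struct
import threading

def sink_worker(buf, offset, value):
    # Direct write into the host-visible buffer at this sink's slot.
    struct.pack_into("q", buf, offset, value * value)

def gather_results(values):
    # Anonymous mmap stands in for the host-mapped MIC on-board memory;
    # each sink gets an 8-byte slot to deposit its result into.
    buf = mmap.mmap(-1, 8 * len(values))
    workers = [threading.Thread(target=sink_worker, args=(buf, 8 * i, v))
               for i, v in enumerate(values)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    # The source reads results straight out of the shared buffer.
    return [struct.unpack_from("q", buf, 8 * i)[0] for i in range(len(values))]
```

Because each sink writes only to its own slot, the source needs no per-result message passing; it reads the buffer after the sinks complete.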
22. At least one tangible non-transitory machine-readable medium having instructions stored thereon configured to be executed by compute entities in a server platform including:
a host processor comprising a first compute entity;
host memory coupled to the host processor;
a plurality of expansion slots communicatively coupled to the host processor;
one or more many integrated core (MIC) devices installed in respective expansion slots, each MIC device including a plurality of processor cores comprising compute entities and on-board memory; and
a network adaptor, installed in an expansion slot or implemented as a component communicatively coupled to the host processor;
wherein execution of the instructions by the host processor and processor cores in the one or more MIC devices enables the server platform to:
configure the host processor as a source and at least a portion of the plurality of processor cores in the one or more MIC devices as sinks;
configure, for each MIC device, memory mappings between the on-board MIC memory of the MIC device and the host memory;
manage execution of a job comprising executable code on the host processor;
employ the source to execute the job;
detect, during execution of the job, sections of the executable code to be offloaded to sinks, each comprising a respective code section including one or more functions to be offloaded to a sink;
construct, for each code section to be offloaded to a sink, offload context information identifying one of the code section or indicia identifying the one or more functions, and information identifying the sink;
offload the code sections to the plurality of sinks;
transmit for storage on a non-volatile storage device accessible via a network coupled to the network adaptor, for each code section that is offloaded to a sink, offload context information identifying the code section that is offloaded and the sink it is offloaded to;
execute the offloaded code sections on the sinks to generate result data;
store the result data in memory buffers accessible to the host processor;
detect that a sink has failed to successfully execute a code section that was offloaded to the sink, and in response thereto, retrieve the previously stored offload context information corresponding to the code section offloaded to the sink; and
offload the code section to another sink for execution.
- View Dependent Claims (23, 24, 25)
Specification