
System and method for topology-aware job scheduling and backfilling in an HPC environment

  • US 8,984,525 B2
  • Filed: 10/11/2013
  • Issued: 03/17/2015
  • Est. Priority Date: 04/15/2004
  • Status: Active Grant
First Claim

1. A method, comprising:

    receiving, by one or more computers, submission of a job from a user;

    selecting a virtual cluster of a plurality of communicatively coupled nodes included in a computing environment, the virtual cluster associated with a group of users that submit similar jobs, and comprising a logical grouping of nodes configured to process related jobs, wherein the computing environment is configured to accommodate multiple virtual clusters therein;

    retrieving a policy with one or more of the job and the user, and determining dimensions of the job to determine a job space within the selected virtual cluster, the job space comprising a set of nodes within the virtual cluster dynamically allocated to complete the job, wherein the virtual cluster is configured to accommodate multiple job spaces, with each job space being configured to concurrently execute a separate job;

    determining whether there are a sufficient number of nodes available within the virtual cluster to allocate to the job space, and in the event sufficient nodes are not available within the virtual cluster, determining an earliest available subset of nodes in the virtual cluster on which to execute the job and adding the job to a job queue until the earliest available subset is available within the virtual cluster; and

    upon a determination that a sufficient number of nodes are available within the virtual cluster, dynamically determining an optimum subset of nodes of the virtual cluster, allocating the subset for the job, and executing the job.
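
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`Node`, `VirtualCluster`, `select_cluster`, `job_dimensions`, `schedule`, the `max_nodes` policy field, and the `busy_until` release time) is hypothetical and introduced only for illustration. The claim does not say how the "optimum subset" is chosen, so the sketch assumes a simple stand-in: the free nodes whose IDs span the narrowest range, as a rough proxy for topological closeness.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    busy_until: float = 0.0  # hypothetical: time at which the node is free again


@dataclass
class VirtualCluster:
    name: str
    nodes: list
    queue: list = field(default_factory=list)  # (start_time, job_name) entries


def select_cluster(clusters_by_group, user):
    # Assumption: each user group maps to exactly one virtual cluster.
    return clusters_by_group[user["group"]]


def job_dimensions(policy, job):
    # Assumption: the policy only caps the number of nodes a job may request.
    return min(job["nodes"], policy["max_nodes"])


def schedule(cluster, job, policy, now):
    need = job_dimensions(policy, job)
    if need < 1 or need > len(cluster.nodes):
        raise ValueError("job does not fit in this virtual cluster")

    free = sorted(
        (n for n in cluster.nodes if n.busy_until <= now),
        key=lambda n: n.node_id,
    )
    if len(free) >= need:
        # Stand-in for "optimum subset": the window of free nodes whose
        # IDs span the smallest range.
        windows = (free[i:i + need] for i in range(len(free) - need + 1))
        best = min(windows, key=lambda w: w[-1].node_id - w[0].node_id)
        for n in best:
            n.busy_until = now + job["runtime"]
        return ("run", now, [n.node_id for n in best])

    # Not enough free nodes: take the nodes that are released first and
    # queue the job until the last of them becomes available.
    subset = sorted(cluster.nodes, key=lambda n: n.busy_until)[:need]
    start = subset[-1].busy_until
    cluster.queue.append((start, job["name"]))
    return ("queued", start, [n.node_id for n in subset])


# Usage: a four-node virtual cluster in which nodes 1 and 3 are already busy.
nodes = [Node(0), Node(1, busy_until=10.0), Node(2), Node(3, busy_until=5.0)]
clusters = {"physics": VirtualCluster("vc-physics", nodes)}
policy = {"max_nodes": 3}
vc = select_cluster(clusters, {"name": "alice", "group": "physics"})

first = schedule(vc, {"name": "A", "nodes": 2, "runtime": 4.0}, policy, now=0.0)
second = schedule(vc, {"name": "B", "nodes": 3, "runtime": 2.0}, policy, now=0.0)
```

In this run, job A starts immediately on the two free nodes, while job B finds no free nodes and is queued for the earliest time at which three nodes are released. A real scheduler would also reserve the queued subset so that later jobs cannot delay it, which this sketch omits.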
