Usage reporting from a cloud-hosted, distributed system

US 9,317,395 B2
Filed: 12/15/2011
Issued: 04/19/2016
Est. Priority Date: 11/14/2011
Status: Active Grant

Abstract

Collecting usage data in a cluster computing environment. A method includes, at a tracker service, receiving a request from an at least partially cloud-based deployment for an interval at which the deployment is to report usage information in usage reports. The usage information includes information defining how software in the deployment is used. In response to the request from the deployment, the tracker service provides an interval to the deployment. The method further includes, at the tracker service, receiving usage reports from the deployment according to the provided interval.
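
The request/response exchange described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the class name `TrackerService`, its method names, the 900-second interval, and the report fields are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TrackerService:
    """Hypothetical tracker: hands out a reporting interval and stores reports."""
    default_interval_s: int = 3600                  # assumed default, not from the patent
    intervals: dict = field(default_factory=dict)   # deployment_id -> interval in seconds
    reports: list = field(default_factory=list)     # usage reports received so far

    def request_interval(self, deployment_id: str) -> int:
        # A deployment asks for its reporting interval; the tracker records
        # the interval it assigned and returns it.
        return self.intervals.setdefault(deployment_id, self.default_interval_s)

    def receive_report(self, deployment_id: str, usage: dict) -> None:
        # Only deployments that have already been given an interval may report.
        if deployment_id not in self.intervals:
            raise KeyError(f"unknown deployment: {deployment_id}")
        self.reports.append({"deployment_id": deployment_id, "usage": usage})


tracker = TrackerService(default_interval_s=900)
interval = tracker.request_interval("deployment-A")
tracker.receive_report("deployment-A", {"job_id": "job-1", "core_hours": 12.5})
```

In this sketch the deployment would then schedule its own reports every `interval` seconds; the scheduling itself is left out.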

17 Claims

1. At a tracker service in a cluster computing environment, a method of collecting usage data, the method comprising:
- receiving a request from a first cluster deployment comprising a cluster deployment creator service, the request requesting an interval over which a plurality of cluster deployments, including the first cluster deployment as well as at least a second cluster deployment that is created by the first cluster deployment, are to report usage information in usage reports, the usage information defining at least (i) one or more first characteristics of at least one first job that is assigned to the first cluster deployment and that is being run on a plurality of first worker nodes in the first cluster deployment, and (ii) one or more second characteristics of at least one second job that is assigned to the second cluster deployment and that is being run on a plurality of second worker nodes in the second cluster deployment;
  
  in response to the request from the first cluster deployment, providing the interval to the first cluster deployment;
  
  receiving a plurality of usage reports according to the interval, including receiving at least:
  
  (i) a plurality of first usage reports from a plurality of first aggregator instances running in the first cluster deployment, each of the plurality of first usage reports including a first deployment identifier identifying the first cluster deployment and providing usage information for the at least one first job that is being run on the plurality of first worker nodes, and
  
  (ii) a plurality of second usage reports from a plurality of second aggregator instances running in the second cluster deployment, each of the plurality of second usage reports including a second deployment identifier identifying the second cluster deployment and providing usage information for the at least one second job that is run on the plurality of second worker nodes; and
  
  identifying duplicate data in the plurality of usage reports, including:
  
  based on the first deployment identifier, identifying first duplicate data among the plurality of first usage reports regarding the at least one first job, the first duplicate data having been sent by each of at least two of the plurality of first aggregator instances; and
  
  based on the second deployment identifier, identifying second duplicate data among the plurality of second usage reports regarding the at least one second job, the second duplicate data having been sent by each of at least two of the plurality of second aggregator instances.
  - 2. The method of claim 1, further comprising pruning each of the first duplicate data and the second duplicate data.
  - 3. The method of claim 1, further comprising merging at least two of the plurality of first usage reports and at least two of the plurality of second usage reports.
  - 4. The method of claim 3, wherein the merging comprises selecting a maximum data value.
  - 5. The method of claim 3, wherein the merging comprises selecting at least one of an average, a mean, or a statistical aggregation of data values.
  - 6. The method of claim 1, wherein at least a first portion of the cluster computing environment is deployed in a cloud and at least a second portion of the cluster computing environment is deployed on-site, and wherein one or more of the usage reports are sent by a proxy that is part of the first portion of the cluster computing environment that is deployed in the cloud, and which is a delegate for an on-site head node to worker nodes that are in the cloud.
  - 7. The method of claim 1, wherein one or more of the plurality of usage reports are created by mining a database for job history.
  - 8. The method of claim 1, wherein one or more of the plurality of usage reports is created by an aggregator instance corresponding to a particular worker node.
  - 9. The method of claim 1, further comprising receiving an interim report including usage data at less than the interval, wherein the interim report comprises an indication of an interim interval defining another interval since a last regular report.
  - 10. The method of claim 9, wherein the interim report is sent as a result of a worker node beginning a shut down.
  - 11. The method of claim 9, wherein the interim report is sent as a result of an error or other event.
  - 12. The method of claim 1, wherein one or more of the usage reports comprise customer usage data.
  - 13. The method of claim 1, wherein one or more of the usage reports comprise deployment data for a system deployment system.
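
The duplicate identification of claim 1, the pruning of claim 2, and the merging of claims 3 to 5 can be sketched as follows. The claims do not say what makes two reports duplicates, so this sketch assumes that reports from different aggregator instances about the same job in the same deployment are duplicates. The field names (`deployment_id`, `aggregator_id`, `job_id`, `core_hours`) and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def find_duplicates(reports):
    """Group by (deployment_id, job_id); keep groups sent by two or more aggregators."""
    groups = defaultdict(list)
    for r in reports:
        groups[(r["deployment_id"], r["job_id"])].append(r)
    return {key: rs for key, rs in groups.items()
            if len({r["aggregator_id"] for r in rs}) >= 2}


def prune(reports):
    """Keep only the first report seen for each (deployment_id, job_id)."""
    seen, kept = set(), []
    for r in reports:
        key = (r["deployment_id"], r["job_id"])
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def merge(reports, combine=max):
    """Merge reports per (deployment_id, job_id) using combine (e.g. max or mean)."""
    groups = defaultdict(list)
    for r in reports:
        groups[(r["deployment_id"], r["job_id"])].append(r["core_hours"])
    return {key: combine(values) for key, values in groups.items()}


# Assumed sample data: two deployments, each with two aggregator instances.
reports = [
    {"deployment_id": "A", "aggregator_id": "a1", "job_id": "j1", "core_hours": 10.0},
    {"deployment_id": "A", "aggregator_id": "a2", "job_id": "j1", "core_hours": 12.0},
    {"deployment_id": "B", "aggregator_id": "b1", "job_id": "j1", "core_hours": 4.0},
    {"deployment_id": "B", "aggregator_id": "b2", "job_id": "j1", "core_hours": 4.0},
    {"deployment_id": "B", "aggregator_id": "b1", "job_id": "j2", "core_hours": 1.0},
]

duplicates = find_duplicates(reports)    # keyed by (deployment_id, job_id)
pruned = prune(reports)                  # one report per job per deployment
merged_max = merge(reports, combine=max)
merged_mean = merge(reports, combine=mean)
```

Keying on the deployment identifier keeps job `j1` in deployment `A` separate from job `j1` in deployment `B`, which mirrors the claim's use of the first and second deployment identifiers. Passing `max` corresponds to claim 4 and `mean` to one of the options in claim 5.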

14. A computer program product comprising one or more hardware storage devices having stored thereon computer executable instructions that, when executed by one or more processors of a computer system, cause the computer system to collect usage data in a cluster computing environment, including the following:
- receiving a request from a first cluster deployment comprising a cluster deployment creator service, the request requesting an interval over which a plurality of cluster deployments, including the first cluster deployment as well as at least a second cluster deployment that is created by the first cluster deployment, are to report usage information in usage reports, the usage information defining at least (i) one or more first characteristics of at least one first job that is assigned to the first cluster deployment and that is being run on a plurality of first worker nodes in the first cluster deployment, and (ii) one or more second characteristics of at least one second job that is assigned to the second cluster deployment and that is being run on a plurality of second worker nodes in the second cluster deployment;
  
  in response to the request from the first cluster deployment, providing the interval to the first cluster deployment;
  
  receiving a plurality of usage reports according to the interval, including receiving at least:
  
  (i) a plurality of first usage reports from a plurality of first aggregator instances running in the first cluster deployment, each of the plurality of first usage reports including a first deployment identifier identifying the first cluster deployment and providing usage information for the at least one first job that is being run on the plurality of first worker nodes, and
  
  (ii) a plurality of second usage reports from a plurality of second aggregator instances running in the second cluster deployment, each of the plurality of second usage reports including a second deployment identifier identifying the second cluster deployment and providing usage information for the at least one second job that is run on the plurality of second worker nodes; and
  
  identifying duplicate data in the plurality of usage reports, including:
  
  based on the first deployment identifier, identifying first duplicate data among the plurality of first usage reports regarding the at least one first job, the first duplicate data having been sent by each of at least two of the plurality of first aggregator instances; and
  
  based on the second deployment identifier, identifying second duplicate data among the plurality of second usage reports regarding the at least one second job, the second duplicate data having been sent by each of at least two of the plurality of second aggregator instances.
  - 15. The computer program product of claim 14, further comprising pruning each of the first duplicate data and the second duplicate data.

16. A computer system, comprising:
- one or more hardware processors; and
  
  one or more hardware storage devices having stored thereon computer executable instructions representing a tracker service, and wherein the tracker service is configured to perform at least the following:
  
  receive a request from a first cluster deployment comprising a cluster deployment creator service, the request requesting an interval over which a plurality of cluster deployments, including the first cluster deployment as well as at least a second cluster deployment that is created by the first cluster deployment, are to report usage information in usage reports, the usage information defining at least (i) one or more first characteristics of at least one first job that is assigned to the first cluster deployment and that is being run on a plurality of first worker nodes in the first cluster deployment, and (ii) one or more second characteristics of at least one second job that is assigned to the second cluster deployment and that is being run on a plurality of second worker nodes in the second cluster deployment;
  
  in response to the request from the first cluster deployment, provide the interval to the first cluster deployment;
  
  receive a plurality of usage reports according to the interval, including receiving at least:
  
  (i) a plurality of first usage reports from a plurality of first aggregator instances running in the first cluster deployment, each of the plurality of first usage reports including a first deployment identifier identifying the first cluster deployment and providing usage information for the at least one first job that is being run on the plurality of first worker nodes, and
  
  (ii) a plurality of second usage reports from a plurality of second aggregator instances running in the second cluster deployment, each of the plurality of second usage reports including a second deployment identifier identifying the second cluster deployment and providing usage information for the at least one second job that is run on the plurality of second worker nodes; and
  
  identify duplicate data in the plurality of usage reports, including:
  
  based on the first deployment identifier, identifying first duplicate data among the plurality of first usage reports regarding the at least one first job, the first duplicate data having been sent by each of at least two of the plurality of first aggregator instances; and
  
  based on the second deployment identifier, identifying second duplicate data among the plurality of second usage reports regarding the at least one second job, the second duplicate data having been sent by each of at least two of the plurality of second aggregator instances.
  - 17. The computer system of claim 16, further comprising pruning each of the first duplicate data and the second duplicate data.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Wood, Kevin; Alam, Salim; Burgess, Gregory Marshall; Watson, Colin
Primary Examiner(s): Gillis, Brian J
Assistant Examiner(s): Turriate Gastulo, Juan C
Application Number: US13/327,122
Publication Number: US 20130124720A1
Time in Patent Office: 1,587 Days
Field of Search: 709/224
US Class Current: 1/1
CPC Class Codes:
- G06F 11/3409: for performance assessment
- G06F 11/3495: for systems
- G06F 2201/88: Monitoring involving counting
