Estimating the cost of data-mining services
First Claim
1. A method comprising:
receiving a data set;
determining a set of data descriptors based, at least in part, on the data set, wherein the set of data descriptors describes features of the received data set that are relevant for estimating computational resources to run a data-mining task;
receiving a set of control values, wherein the set of control values influences: (i) an amount of computational resources required for processing the data-mining task, (ii) an accuracy of estimation, and (iii) a duration of time for the estimation to be completed;
receiving a set of data-mining task parameters, wherein the set of task parameters includes an algorithm classifier, wherein the algorithm classifier indicates that the data-mining task is one of: a regression task or a clustering task;
estimating a set of computational resources required to perform the data-mining task based on a set of parameters including: the algorithm classifier, the data set, the set of control values, a desired estimation accuracy, the set of data-mining task parameters, and the set of data descriptors; and
estimating a cloud cost for the data mining task;
wherein:
the set of control values describes the data-mining task;
the set of data-mining task parameters defines the data set and the data-mining task; and
at least the estimating step is performed by computer software running on computer hardware.
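The flow of the claimed steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `DataDescriptors`, `ControlValues`, `TaskParameters`, `describe`, `estimate_operations`, and `estimate_cloud_cost` are hypothetical, the descriptors are limited to row and column counts, and the cost model is a toy one based on textbook operation counts (roughly rows × columns² for least-squares regression, rows × columns × clusters × iterations for k-means-style clustering). The control values that govern estimation accuracy and estimation time are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class DataDescriptors:
    n_rows: int  # number of records in the data set
    n_cols: int  # number of features per record


@dataclass
class ControlValues:
    max_iterations: int  # hypothetical knob: caps iterative algorithms


@dataclass
class TaskParameters:
    algorithm_classifier: str  # "regression" or "clustering"
    n_clusters: int = 1        # only used by the clustering branch


def describe(data):
    """Derive simple data descriptors from a list of equal-length rows."""
    n_rows = len(data)
    n_cols = len(data[0]) if data else 0
    return DataDescriptors(n_rows, n_cols)


def estimate_operations(desc, control, task):
    """Toy resource model (an assumption): count elementary operations."""
    if task.algorithm_classifier == "regression":
        return desc.n_rows * desc.n_cols ** 2
    if task.algorithm_classifier == "clustering":
        return (desc.n_rows * desc.n_cols
                * task.n_clusters * control.max_iterations)
    raise ValueError("algorithm_classifier must be 'regression' or 'clustering'")


def estimate_cloud_cost(operations, ops_per_second, price_per_cpu_hour):
    """Convert an operation count into a price using assumed throughput and rate."""
    cpu_seconds = operations / ops_per_second
    return cpu_seconds / 3600.0 * price_per_cpu_hour


# Usage with made-up numbers: a 3 x 2 data set and a 2-cluster task.
data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
desc = describe(data)
ops = estimate_operations(desc, ControlValues(max_iterations=10),
                          TaskParameters("clustering", n_clusters=2))
cost = estimate_cloud_cost(ops, ops_per_second=1e9, price_per_cpu_hour=0.05)
```

Keeping the resource estimate and the price conversion in separate functions mirrors the two estimating steps of the claim, so the pricing assumptions can change without touching the resource model.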
Abstract
The cost of data-mining is estimated where data-mining services are delivered via a distributed computing system environment. System requirements are estimated for a particular data-mining task for an input data set having specified properties. Estimating system requirements includes applying a partial learning tool to operate on sample data from the input data set.
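One plausible reading of running a partial learning tool on sample data is to run the task on a few small samples, record how long each run takes, and extrapolate to the full input size. The sketch below assumes that reading. It is not the patent's stated procedure: the helper names `measure`, `fit_power_law`, and `extrapolate` are hypothetical, and the power-law model `t = a * n**b` fitted by least squares in log-log space is a substituted, generic technique.

```python
import math
import random
import time


def measure(task, data, sizes, seed=0):
    """Run `task` on random samples of the given sizes; return (n, seconds) pairs."""
    rng = random.Random(seed)
    points = []
    for n in sizes:
        sample = rng.sample(data, n)  # each n must not exceed len(data)
        start = time.perf_counter()
        task(sample)
        elapsed = time.perf_counter() - start
        points.append((n, max(elapsed, 1e-9)))  # guard against log(0) later
    return points


def fit_power_law(points):
    """Least-squares fit of log t = log a + b * log n; returns (a, b)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def extrapolate(a, b, n_full):
    """Predicted run time for the full input size under the fitted model."""
    return a * n_full ** b


# Usage: sorting stands in for a data-mining task on a made-up data set.
data = list(range(100_000))
points = measure(sorted, data, sizes=[1_000, 5_000, 20_000])
a, b = fit_power_law(points)
predicted_seconds = extrapolate(a, b, len(data))
```

Wall-clock timings on small samples are noisy, so in practice each sample size would likely be repeated several times and averaged before fitting.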
6 Claims
4. A computer system comprising:
a processor set; and
a computer readable storage medium;
wherein:
the processor set is structured, located, connected, and/or programmed to execute instructions stored on the computer readable storage medium; and
the instructions include:
first instructions executable by a device to cause the device to receive a data set and determine a set of data descriptors based, at least in part, on the data set, wherein the set of data descriptors describes features of the received data set that are relevant for estimating computational resources to run a data-mining task;
second instructions executable by a device to cause the device to receive a set of control values, wherein the set of control values influences: (i) an amount of computational resources required for processing the data-mining task, (ii) an accuracy of estimation, and (iii) a duration of time for the estimation to be completed;
third instructions executable by a device to cause the device to receive a set of data-mining task parameters, wherein the set of task parameters includes an algorithm classifier, wherein the algorithm classifier indicates that the data-mining task is one of: a regression task or a clustering task;
fourth instructions executable by a device to cause the device to estimate a set of computational resources required to perform a data-mining based on a set of parameters including: the algorithm classifier, the data set, the set of control values, a desired estimation accuracy, the set of data-mining task parameters, and the set of data descriptors; and
fifth instructions estimating a cloud cost for the data mining task;
wherein:
the set of control values describes the data-mining task; and
the set of data-mining task parameters defines the data set and the data-mining task.
Specification