Data analysis and prediction of a dataset through algorithm extrapolation from a spreadsheet formula
Abstract
Disclosed are a method, a device, a system and/or a manufacture of data analysis and prediction of a dataset through algorithm extrapolation from a spreadsheet formula. In one embodiment, a method extracts one or more spreadsheet formulas from one or more cells of a spreadsheet file and assembles a formula algorithm. The formula algorithm accepts a set of data entries comprising one or more independent variables, and outputs a prediction metric as a dependent variable such that each data entry is calculation independent. An extrapolated algorithm expressed in a programming language is generated. A computation block of a dataset is specified and submitted for computation along with the extrapolated algorithm over a network. The dataset comprises two or more data entries usable as an input to the extrapolated algorithm. An output data re-combined from the first output block and one or more additional output blocks is received.
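As an illustrative, non-limiting sketch of what "an extrapolated algorithm expressed in a programming language" could look like, the Python below turns one simple arithmetic formula into the source of an equivalent function, with each referenced cell becoming a declared parameter. The helper name `extrapolate`, the example formula `=0.5*A2 + 2*B2`, and the restriction to numbers, cell references, and `+ - * / ( )` are assumptions made for the example and are not taken from the disclosure.

```python
import re

# One token: a cell reference (A2), a number (0.5), or an arithmetic symbol.
TOKEN = re.compile(r"\s*([A-Z]+[0-9]+|\d+(?:\.\d+)?|[-+*/()])")


def extrapolate(formula, input_cells):
    """Return Python source for a function equivalent to a simple formula.

    Hypothetical helper: only numbers, + - * / ( ) and references to the
    cells named in input_cells are accepted; anything else is rejected.
    """
    expr = formula.strip()
    if expr.startswith("="):
        expr = expr[1:].strip()
    pos, parts = 0, []
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unsupported text at position {pos}")
        token = match.group(1)
        if token[0].isalpha() and token not in input_cells:
            raise ValueError(f"{token} is not a declared independent variable")
        parts.append(token)
        pos = match.end()
    params = ", ".join(input_cells)
    return f"def predict({params}):\n    return {' '.join(parts)}\n"


# Generate the function once, then apply it to each data entry on its own.
source = extrapolate("=0.5*A2 + 2*B2", ["A2", "B2"])
namespace = {}
exec(source, namespace)  # acceptable here because every token was validated
predict = namespace["predict"]

entries = [(2.0, 1.0), (4.0, 3.0)]
predictions = [predict(a, b) for a, b in entries]
print(source)
print(predictions)
```

Because `predict` reads only its own arguments, each data entry can be evaluated without reference to any other entry, which is the property the abstract calls calculation independence.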
20 Claims
1. A method of data analysis through an interface of a spreadsheet application, the method comprising:
determining at least one of a spreadsheet format of a spreadsheet file and a syntax format of the spreadsheet file;
wherein the spreadsheet file defining a formula algorithm accepting a data entry comprising one or more independent variables and the formula algorithm outputting a prediction metric as a dependent variable, the formula algorithm comprising one or more spreadsheet formulas stored in one or more cells of the spreadsheet file, the independent variables referenced from one or more cells of the spreadsheet file, and the prediction metric output in a cell of the spreadsheet file, and
wherein the one or more spreadsheet formulas, the independent variable, and the prediction metric are stored in the syntax format permitting independent calculation of the prediction metric for two or more instances of the data entry;
extracting from the spreadsheet file and storing in computer memory the one or more spreadsheet formulas comprising the formula algorithm, the one or more independent variables, and the prediction metric;
assembling each of the one or more spreadsheet formulas into the formula algorithm and storing the formula algorithm in a computer memory;
generating and storing an extrapolated algorithm expressed in a programming language based on the formula algorithm,
wherein each spreadsheet formula equivalent to one or more functions of the programming language and each of the one or more independent variables defining a declared variable of at least one of the one or more functions of the programming language;
receiving a dataset comprising two or more data entries in the syntax format usable as an input to the extrapolated algorithm to independently calculate the prediction metric of each of the two or more data entries;
specifying a first computation block comprising one or more data entries of the dataset;
extracting from the dataset each of the one or more data entries within the first computation block;
submitting the first computation block and the extrapolated algorithm to a computing cluster over a network,
wherein the extrapolated algorithm applied against the first computation block resulting in a first output block comprising a prediction value of the prediction metric of each instance of the data entry within the first computation block; and
receiving an output data re-combined from data comprising the first output block and one or more additional output blocks.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
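The block-oriented steps of claim 1 (specifying a computation block, submitting it with the extrapolated algorithm, and receiving re-combined output data) can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: a local `ThreadPoolExecutor` stands in for the computing cluster reached over a network, and the function names, the field names `x1` and `x2`, and the coefficients are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def extrapolated_algorithm(entry):
    """Hypothetical generated code: one prediction value per data entry."""
    return 0.5 * entry["x1"] + 2 * entry["x2"]


def specify_blocks(dataset, block_size):
    """Split the dataset into consecutive computation blocks."""
    return [dataset[i:i + block_size] for i in range(0, len(dataset), block_size)]


def run_block(block):
    """Apply the algorithm to every entry of one block, giving one output block."""
    return [extrapolated_algorithm(entry) for entry in block]


def analyze(dataset, block_size=2, workers=4):
    blocks = specify_blocks(dataset, block_size)
    # Assumption: a thread pool plays the role of the remote computing cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so output blocks
        # line up with their computation blocks.
        output_blocks = list(pool.map(run_block, blocks))
    # Re-combine the first output block with the additional output blocks.
    return [value for block in output_blocks for value in block]


dataset = [
    {"x1": 2.0, "x2": 1.0},
    {"x1": 4.0, "x2": 3.0},
    {"x1": 0.0, "x2": 0.5},
]
output_data = analyze(dataset)
print(output_data)
```

Splitting is safe only because no entry's prediction depends on another entry; a formula that aggregated across rows could not be partitioned this way without further handling.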
12. A method of data analysis, the method comprising:
extracting from a spreadsheet file and storing in computer memory one or more spreadsheet formulas comprising the formula algorithm, one or more independent variables, and a prediction metric as a dependent variable,
wherein the formula algorithm accepting a data entry comprising the one or more independent variables and the formula algorithm outputting the prediction metric as the dependent variable, the independent variables referenced from one or more cells of the spreadsheet file, and the prediction metric output in a cell of the spreadsheet file;
assembling each of the one or more spreadsheet formulas into the formula algorithm and storing the formula algorithm in computer memory;
retrieving a parse tree generation routine specifying a parse tree assembly procedure for the formula algorithm of the spreadsheet file;
assembling a parse tree data of the formula algorithm comprising one or more nodes of the parse tree, each node of the one or more nodes specifying at least one of an instance of the spreadsheet formula, an instance of the dependent variable, and an instance of the independent variable;
determining a programming language in which to express the extrapolated algorithm;
receiving an output generation routine;
deconstructing the parse tree data by;
(i) traversing the one or more nodes of the parse tree,
(ii) determining one or more of the one or more nodes of the parse tree are a defined node associated with a function specified in the output generation routine, and
(iii) mapping each defined node to each instance of function specified in the output generation routine;
storing the extrapolated algorithm as an output of the output generation routine;
receiving a dataset comprising two or more data entries usable as an input to the extrapolated algorithm to independently calculate the prediction metric of each of the two or more data entries;
specifying a first computation block comprising one or more data entries of the dataset;
extracting from the dataset each of the one or more data entries within the first computation block;
submitting the first computation block and the extrapolated algorithm to a computing cluster over a network,
wherein the extrapolated algorithm applied against the first computation block resulting in a first output block comprising a prediction value of the prediction metric of each instance of the data entry within the first computation block; and
receiving an output data re-combined from data comprising the first output block and one or more additional output blocks.
Dependent claims: 13, 14, 15, 16, 17
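The parse tree steps of claim 12 can be illustrated with a small Python sketch: a formula is assembled into a tree of nodes, the tree is traversed, and each function node that is defined in an output generation routine is mapped to target-language code. Everything named here is an assumption for the example, including the tuple node layout, the `OUTPUT_ROUTINE` table, the four supported spreadsheet functions, and the choice of Python as the output language; the claim does not prescribe any of them.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Z]+\d+)|([A-Z]+)|(.))")

# Hypothetical output generation routine: spreadsheet function -> Python template.
OUTPUT_ROUTINE = {"SUM": "sum([{args}])", "MAX": "max({args})",
                  "MIN": "min({args})", "ABS": "abs({args})"}


def tokenize(text):
    tokens = []
    for number, ref, name, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("num", number))
        elif ref:
            tokens.append(("ref", ref))
        elif name:
            tokens.append(("name", name))
        elif symbol.strip():
            tokens.append(("sym", symbol))
    return tokens


def parse(formula):
    """Assemble a parse tree of ("num" | "ref" | "binop" | "func", ...) tuples."""
    tokens, pos = tokenize(formula.lstrip("= ")), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", "")

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expect(symbol):
        if take() != ("sym", symbol):
            raise ValueError(f"expected {symbol!r}")

    def binary(operand, operators):
        node = operand()
        while peek()[0] == "sym" and peek()[1] in operators:
            op = take()[1]
            node = ("binop", op, node, operand())
        return node

    def expression():
        return binary(term, "+-")

    def term():
        return binary(factor, "*/")

    def factor():
        kind, value = take()
        if kind in ("num", "ref"):
            return (kind, value)
        if kind == "name":  # function call such as SUM(A1, B1)
            expect("(")
            args = [expression()]
            while peek() == ("sym", ","):
                take()
                args.append(expression())
            expect(")")
            return ("func", value, args)
        if (kind, value) == ("sym", "("):
            node = expression()
            expect(")")
            return node
        raise ValueError(f"unexpected token {value!r}")

    tree = expression()
    if peek()[0] != "end":
        raise ValueError("unexpected trailing input")
    return tree


def emit(node, variables):
    """Traverse the tree, mapping each defined function node to its template."""
    if node[0] == "num":
        return node[1]
    if node[0] == "ref":
        variables.add(node[1])  # each referenced cell becomes a declared variable
        return node[1]
    if node[0] == "binop":
        _, op, left, right = node
        return f"({emit(left, variables)} {op} {emit(right, variables)})"
    _, name, args = node
    if name not in OUTPUT_ROUTINE:
        raise ValueError(f"{name} is not defined in the output generation routine")
    return OUTPUT_ROUTINE[name].format(args=", ".join(emit(a, variables) for a in args))


def generate(formula):
    variables = set()
    body = emit(parse(formula), variables)
    return f"def predict({', '.join(sorted(variables))}):\n    return {body}\n"


source = generate("=MAX(A1, B1) * 2 + SUM(A1, B1)")
namespace = {}
exec(source, namespace)
print(source)
print(namespace["predict"](3, 5))
```

Keeping the function mapping in a separate table means a different target language could be supported by swapping the templates while leaving the parser unchanged.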
18. A system comprising:
an execution server comprising;
a processor of the execution server;
a memory of the execution server, the memory of the execution server comprising computer readable instructions that when executed on the processor of the execution server;
receive an extrapolated algorithm expressed in a programming language based on one or more spreadsheet algorithms extracted from a spreadsheet file;
receive a dataset comprising two or more data entries in a syntax format as an input to the extrapolated algorithm to independently calculate a prediction metric of each of the two or more data entries;
specify a first computation block comprising one or more of the two or more data entries of the dataset;
extract from the dataset each of the one or more of the two or more data entries within the first computation block;
submit the first computation block and the extrapolated algorithm to a computing cluster over a network,
wherein the extrapolated algorithm applied against the first computation block resulting in a first output block comprising a prediction value for the prediction metric of each instance of the data entry within the computation block;
receive an output data re-combined from data comprising the first output block and one or more additional output blocks;
upon at least one of submission of the first computation block and generation of the output data, at least one of;
provision a computing virtual machine, provision a computing process container, and initiate a microservice; and
store the output data in at least one of the computing virtual machine, the computing process container, and the microservice; and
a network.
Dependent claims: 19, 20
Specification