Automatic enumeration of data analysis options and rapid analysis of statistical models
First Claim
1. A computer program product for analyzing data, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions readable by a processing circuit to cause the processing circuit to perform a method comprising:
obtaining a description of a dataset;
automatically generating, by a computer, a plurality of analysis options comprising all possible analysis options for the dataset from the description of the dataset, wherein each of the plurality of analysis options comprises one or more of data filtering conditions, database clauses, statistical functions and model-specific functions;
displaying, by a display, to a user, the plurality of analysis options in a tree layout, wherein the tree layout comprises a plurality of nodes including a plurality of roots and a plurality of leaves connected by a plurality of paths; and
wherein each node in the tree layout represents a different one of each of the one or more data filtering conditions, database clauses, statistical functions and model-specific functions, and each path from a root to a leaf represents an analysis option from the plurality of analysis options;
upon confirmation of the displayed analysis options by the user, generating a plurality of queries based on the analysis options displayed in the tree layout, wherein one query is generated for each of the plurality of analysis options and includes each of the one or more data filtering conditions, database clauses, statistical functions and model-specific functions for the respective analysis option; and
generating one or more models by deploying the queries on the dataset.
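The tree layout recited in the claim can be illustrated with a minimal sketch. It assumes that each node carries one SQL fragment and that one query is assembled per root-to-leaf path. The `Node` class, the `kind` labels, the `orders` table, and its columns are hypothetical names chosen for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One node of the option tree: a single filter, clause, or function."""
    kind: str   # "filter", "clause", or "function" (hypothetical labels)
    text: str   # SQL fragment contributed by this node
    children: List["Node"] = field(default_factory=list)


def paths(node: Node, prefix: Tuple[Node, ...] = ()) -> Iterator[Tuple[Node, ...]]:
    """Yield every root-to-leaf path below `node` as a tuple of nodes."""
    here = prefix + (node,)
    if not node.children:
        yield here
        return
    for child in node.children:
        yield from paths(child, here)


def to_query(path: Tuple[Node, ...], table: str) -> str:
    """Assemble one query from the nodes on a single path."""
    funcs = [n.text for n in path if n.kind == "function"] or ["*"]
    filters = [n.text for n in path if n.kind == "filter"]
    clauses = [n.text for n in path if n.kind == "clause"]
    sql = f"SELECT {', '.join(funcs)} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if clauses:
        sql += " " + " ".join(clauses)
    return sql


# A small forest with two roots (illustrative data only).
roots = [
    Node("filter", "region = 'EU'", [
        Node("function", "AVG(sales)"),
        Node("function", "COUNT(*)"),
    ]),
    Node("filter", "region = 'US'", [
        Node("clause", "GROUP BY product", [
            Node("function", "AVG(sales)"),
        ]),
    ]),
]

# One query per root-to-leaf path, i.e. one per analysis option.
queries = [to_query(p, "orders") for root in roots for p in paths(root)]
```

Here the forest has three leaves, so three queries are produced. Editing the tree before confirmation (for example, deleting a leaf) would change the set of generated queries accordingly.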
Abstract
Embodiments relate to analyzing a dataset. A method of analyzing data is provided. The method obtains a description of a dataset. The method automatically generates a plurality of analysis options from the description of the dataset. The method generates a plurality of queries based on the analysis options. The method deploys the queries on the dataset to build a plurality of statistical models from the dataset.
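The four steps of the abstract (description, options, queries, deployment) can be sketched end to end as follows. This is only an illustration under stated assumptions: the `description` dictionary layout, the function names, and the `orders` table are hypothetical, an in-memory SQLite database stands in for the dataset, and simple aggregate values stand in for the statistical models, whose construction the abstract does not detail.

```python
import itertools
import sqlite3

# Hypothetical dataset description: table, a categorical column with its
# values, a numeric column to summarize, and candidate statistical functions.
description = {
    "table": "orders",
    "filter_column": "region",
    "filter_values": ["EU", "US"],
    "measure": "sales",
    "functions": ["AVG", "MAX"],
}


def generate_options(desc):
    """Enumerate every (filter value, statistical function) combination."""
    return list(itertools.product(desc["filter_values"], desc["functions"]))


def generate_query(desc, option):
    """Build one parameterized query for one analysis option."""
    value, func = option
    sql = (
        f'SELECT {func}({desc["measure"]}) FROM {desc["table"]} '
        f'WHERE {desc["filter_column"]} = ?'
    )
    return sql, (value,)


def deploy(conn, desc):
    """Run every generated query and collect one result per option."""
    results = {}
    for option in generate_options(desc):
        sql, params = generate_query(desc, option)
        results[option] = conn.execute(sql, params).fetchone()[0]
    return results


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 10.0), ("EU", 30.0), ("US", 5.0)],
)
results = deploy(conn, description)
conn.close()
```

With two filter values and two functions, four options are enumerated and four queries are deployed. The filter value is passed as a bound parameter, while identifiers are interpolated from the description, which this sketch assumes to be trusted.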
5 Claims
Claim 1 (independent) is recited in full under First Claim. Claims 2, 3, 4 and 5 are dependent claims.
Specification