Modeling sequence and time series data in predictive analytics
First Claim
1. A declarative data modeling language system for predicting sequences and time series data automatically, and by identifying patterns without manual pattern identification or validation, comprising:
a processor and memory;
a data modeling language component that automatically generates at least one data mining model to extract predictive information from at least one database, and in a manner that does not require manual identification or validation of a predictive pattern;
a plurality of language extension components configured in the data modeling language, the plurality of language extension components providing at least;
a data sequence model in the data modeling language to generate sequence predictions;
a time series model in the data modeling language and facilitating generating time series predictions of at least one of a casual or discrete subsequent data value in a time series, wherein the sequence model and the time series model are separate models, and in which the data sequence model predicts events based at least in part on historical event data, and the time series model predicts numerical time values based on historical numerical time value data;
wherein one or both of the data sequence model or the time series model include schema rowsets stores that include contents of a mining model according to a transition matrix for clustering sequences and storing probabilities of transitions between different states;
wherein the schema rowsets include All, Cluster and Sequence, in which;
All is a node that is a root and represents a model;
Cluster is a child of All; and
Sequence is a child of All that stores a marginal transition matrix, and in which each Cluster has a Sequence child that contains a set of children, each of which is a column in the transition matrix; and
wherein the memory configured to the processor retains at least one piece of information that pertains to the data modeling language component or the language extension components when directed to the processor.
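As an illustration only, the transition matrix and the All, Cluster and Sequence hierarchy recited in the claim can be sketched in Python. The claim does not prescribe an estimation method, so this sketch assumes a first-order Markov model estimated by counting adjacent events, and it assumes the sequences have already been grouped into clusters. The names `Node`, `transition_matrix`, `sequence_node`, `build_model`, `predict_next` and the node kind "Column" are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


def transition_matrix(sequences):
    """Estimate P(next state | current state) by counting adjacent events.

    Assumption: a first-order Markov model; the claim only requires that
    transition probabilities between states be stored.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix


@dataclass
class Node:
    kind: str                      # "All", "Cluster", "Sequence" or "Column"
    name: str
    data: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def sequence_node(name, matrix):
    """Build a Sequence node whose children are the matrix columns."""
    node = Node("Sequence", name)
    columns = sorted({nxt for row in matrix.values() for nxt in row})
    for col in columns:
        # Column `col` holds P(col | current) for every current state.
        column = {cur: row.get(col, 0.0) for cur, row in matrix.items()}
        node.children.append(Node("Column", col, column))
    return node


def build_model(clusters):
    """clusters: mapping of cluster name -> list of event sequences."""
    root = Node("All", "model")
    all_sequences = [s for seqs in clusters.values() for s in seqs]
    # The Sequence child of All stores the marginal transition matrix.
    root.children.append(
        sequence_node("marginal", transition_matrix(all_sequences))
    )
    for name, seqs in clusters.items():
        cluster = Node("Cluster", name)
        cluster.children.append(sequence_node(name, transition_matrix(seqs)))
        root.children.append(cluster)
    return root


def predict_next(matrix, current):
    """Return the most probable next event, or None if `current` is unseen."""
    row = matrix.get(current)
    if not row:
        return None
    return max(row, key=row.get)
```

Under these assumptions, calling `build_model` on pre-clustered event histories yields a root All node with one marginal Sequence child and one Cluster child per group, and `predict_next` reads a sequence prediction directly from a matrix row.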
Abstract
The subject invention relates to systems and methods to extend the capabilities of declarative data modeling languages. In one aspect, a declarative data modeling language system is provided. The system includes a data modeling language component that generates one or more data mining models to extract predictive information from local or remote databases. A language extension component facilitates modeling capability in the data modeling language by providing a data sequence model or a time series model within the data modeling language to support various data mining applications.
20 Claims
16. A computer implemented method for generating data mining models and predicting sequences and time series data automatically, and by identifying patterns without manual pattern identification or validation, comprising:
at a computing system, executing computer-executable instructions using one or more processors, wherein execution of the computer-executable instructions directs the computing system to;
provide a plurality of language extensions to a database modeling language, the plurality of language extensions including;
at least one data sequence model in the database modeling language to generate sequence predictions;
at least one time series model in the database modeling language to generate time series predictions, wherein the at least one data sequence model and the at least one time series model are separate models, and in which the data sequence model predicts events based at least in part on historical event data, and the time series model predicts numerical values based on historical numerical value data;
wherein the schema rowsets include All, Cluster and Sequence, in which;
All is a node that is a root and represents a model;
Cluster is a child of All; and
Sequence is a child of All that stores a marginal transition matrix, and in which each Cluster has a Sequence child that contains a set of children, each of which is a column in the transition matrix; and
automatically, with a computing system and without manual pattern identification or validation, generate data mining models from the plurality of language extensions;
generate a query for a database; and
automatically generate at least one sequence prediction and at least one time series prediction from the database based on the query and the data mining models, wherein the sequence prediction predicts a future event and the time series prediction predicts a future numerical time value and is based on at least one of casual data or discrete data.
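The time series step of the method above, predicting future numerical values from historical numerical values, can be illustrated with a small Python sketch. The claims do not name a forecasting algorithm, so this example substitutes a first-order autoregressive model fitted by ordinary least squares purely for illustration. The function names `fit_ar1` and `forecast` are hypothetical.

```python
def fit_ar1(values):
    """Least-squares fit of x[t] = a + b * x[t-1].

    Assumption: an AR(1) model stands in for whatever time series
    algorithm an actual implementation would use.
    """
    xs, ys = values[:-1], values[1:]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Constant history: fall back to predicting the mean.
        return mean_y, 0.0
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = cov / var_x
    a = mean_y - b * mean_x
    return a, b


def forecast(values, steps):
    """Roll the fitted model forward `steps` periods past the history."""
    a, b = fit_ar1(values)
    out, last = [], values[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out
```

With a history such as `[1, 2, 3, 4, 5]`, `forecast(history, 2)` extends the linear trend by two further values. A production system would presumably expose this through the modeling language rather than through direct function calls.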
19. A system to facilitate data mining operations and predict sequences and time series data automatically, and by identifying patterns without manual pattern identification or verification, comprising:
one or more computer-readable media having stored thereon computer executable instructions that, when executed by a processor, cause the system to;
A processor and memory;
query a relational database;
generate a data mining model to determine predictive information from the database;
modify the data mining model to each of a casual data time series, discrete data time series, and a data sequence;
generate probabilities from the database in view of the data time series or the data sequence, such that;
probabilities associated with the casual data time series and the discrete data time series predict future numerical time values based on historical numerical time values; and
wherein the schema rowsets include All, Cluster and Sequence, in which;
All is a node that is a root and represents a model;
Cluster is a child of All; and
Sequence is a child of All that stores a marginal transition matrix, and in which each Cluster has a Sequence child that contains a set of children, each of which is a column in the transition matrix; and
probabilities associated with the data sequence predict future events based on historical event data; and
wherein the memory configured to the processor to the one or more computer-readable media and which, upon request, executes at least one command in relation to the aforementioned querying or generating.
Specification