Automated method and system for generating models from data
First Claim
1. In a computer system having at least one processor, at least one memory unit, an input device and an output device, a method of automatically constructing computer representations of a plurality of models from data and providing those constructed computer representations as models of physical phenomena or of commercially significant phenomena in memory for making predictions or for revealing previously unknown data relationships and for use by a human or use by a computer acting on behalf of one or more humans wherein the use is a basis for decision-making, comprising computer implemented steps of:
a) using at least one sample set from available data;
b) obtaining one or more goals for the models from a human or from a computer acting on behalf of one or more humans;
c) obtaining ROC convex hull performance criteria for the models, wherein the performance criteria select models to satisfy the one or more goals;
d) using a plurality of parameter choices associated with the methods;
e) using a plurality of methods, and a plurality of parameter choices, for inferring a plurality of models;
f) rating performance of the inferred models, based on one or more criteria; and
g) constructing and evaluating weighted combinations of the inferred models with respect to the performance criteria.
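Steps c), f) and g) can be illustrated with a minimal sketch. The claim does not say how the ROC convex hull is computed or how the combinations are weighted, so everything here is an assumption made for illustration: each inferred model is rated by a single (false positive rate, true positive rate) point, the hull is built with a standard monotone-chain scan, and a "weighted combination" is taken to mean using one of two models at random with a given probability. The function names `roc_convex_hull`, `models_on_hull` and `mix` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, running from (0, 0) to (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def models_on_hull(rated: Dict[str, Point]) -> List[str]:
    """Names of the models whose ROC point is a vertex of the hull."""
    hull = set(roc_convex_hull(list(rated.values())))
    return sorted(name for name, point in rated.items() if point in hull)


def mix(a: Point, b: Point, weight_a: float) -> Point:
    """Expected ROC point when model a is used with probability weight_a, else b."""
    return (weight_a * a[0] + (1.0 - weight_a) * b[0],
            weight_a * a[1] + (1.0 - weight_a) * b[1])


if __name__ == "__main__":
    # Hypothetical ratings for four inferred models.
    rated = {"tree": (0.1, 0.6), "knn": (0.3, 0.7),
             "net": (0.4, 0.9), "rule": (0.5, 0.5)}
    print(models_on_hull(rated))
    print(mix(rated["tree"], rated["net"], 0.5))
```

Under these assumptions, a model whose point falls below the hull is never the best choice for any mix of error costs and class proportions, and a weighted combination of two adjacent hull models reaches any operating point on the segment between them.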
Abstract
The present invention relates to a scaleable automatic method of using multiple techniques to generate models and combinations of models from data and prior knowledge. The system provides unprecedented ease of use in that many of the choices of technique and parameters are explored automatically by the system, without burdening the user, and provides scaleable learning over distributed processors to achieve speed and data-handling capacity to satisfy the most demanding requirements.
59 Claims
Dependent claims of claim 1: 2-30.
31. In a computer system having at least one processor, at least one memory unit, an input device and an output device, a method for constructing computer representations of new knowledge of physical phenomena or of commercially significant phenomena in the form of supported hypotheses stored in computer memory from data and for providing the computer representations of supported hypotheses for making predictions or for revealing previously unknown data relationships and for use by a human or use by a computer acting on behalf of one or more humans wherein the use is as a basis for decision-making, comprising the steps of:
a) encoding at least one model in terms of at least one variable;
b) associating the variable with at least one class of items;
c) encoding a plurality of hypotheses as variations to the at least one model, wherein the range of the at least one variable is transformed to a different range;
d) associating the at least one variable of the at least one model with at least one information source; and
e) selecting at least one tuple from the information source, along with corresponding model outputs for using as evidence that supports or refutes the hypotheses.
Dependent claims of claim 31: 32-34, 40-59.
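The steps of claim 31 can be pictured with a small sketch. The claim does not prescribe any data structures or any test for when a tuple supports a hypothesis, so the following are assumptions made for illustration only: the model is a function of one variable, each hypothesis is a transformation of that variable's range, the information source is a list of (input, observed output) tuples, and a tuple counts as support when the model output is within a tolerance of the observation. All names (`Hypothesis`, `base_model`, `weigh_evidence`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Hypothesis:
    name: str
    transform: Callable[[float], float]  # maps the variable onto a different range


def base_model(x: float) -> float:
    """Hypothetical model encoded in terms of a single variable x."""
    return 2.0 * x


def weigh_evidence(hyp: Hypothesis,
                   rows: List[Tuple[float, float]],
                   tol: float = 0.5) -> Tuple[int, int]:
    """Count tuples that support or refute a hypothesis.

    A tuple (x, observed) supports the hypothesis when the model output for
    the transformed variable is within tol of the observed value (assumed rule).
    """
    supports = refutes = 0
    for x, observed in rows:
        predicted = base_model(hyp.transform(x))
        if abs(predicted - observed) <= tol:
            supports += 1
        else:
            refutes += 1
    return supports, refutes


if __name__ == "__main__":
    # Hypothetical information source: (variable value, observed output).
    rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
    for hyp in (Hypothesis("unchanged range", lambda x: x),
                Hypothesis("range scaled by 10", lambda x: 10.0 * x)):
        print(hyp.name, weigh_evidence(hyp, rows))
```

Keeping the model fixed and varying only the transformation makes each hypothesis cheap to state and to score against the same tuples.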
35. The method of claim 34, further comprising creating a description of the at least one model, by restricting at least one distinguished parameter or at least one distinguished variable by at least one label selected from the group consisting of:
inputs, outputs, belief calculus, provenance rules, cost and range of applicability, credibility, ownership, and access authorization; wherein belief calculus is any system for ascribing the degree of a belief in the outputs of the at least one represented model as a function of the degree of belief in the values of at least one parameter or at least one variable of the model.
36. The method of claim 35, further comprising improving the efficiency or explanatory power of the conceptual organization of the at least one represented model by repeating the steps of:
a) creating a mapping between the parameter or variable of the model and information source;
b) adding at least one new model;
c) re-organizing some or all of the existing conceptual organization.
Dependent claims of claim 36: 37, 38.
39. The method of claim 35, wherein the belief calculus is at least one method selected from the group consisting of:
Bayesian belief networks, Dempster-Shafer evidence models, fuzzy logic, non-axiomatic reasoning methods, transferable belief models, Bonissone's real-time system for reasoning with uncertainty, certainty factor systems, statistical reasoning, Lowrance's evidential intervals, causal networks, non-monotonic logic, truth maintenance systems, and logic-based abduction.
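As one concrete instance of a belief calculus from the group above, the sketch below applies Dempster's rule of combination from Dempster-Shafer evidence theory to two sources of evidence. The claims do not say how any listed calculus is implemented; the representation of a mass function as a dictionary from sets of outcomes to weights, and the name `combine`, are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]  # belief mass assigned to sets of outcomes


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, pool them on set intersections,
    discard the conflicting (empty-intersection) mass, then renormalise."""
    combined: Mass = {}
    conflict = 0.0
    for (b, wb), (c, wc) in product(m1.items(), m2.items()):
        overlap = b & c
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + wb * wc
        else:
            conflict += wb * wc
    if not combined:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


if __name__ == "__main__":
    # Hypothetical evidence about whether a model output is "high" or "low".
    either = frozenset({"high", "low"})
    source_1 = {frozenset({"high"}): 0.6, either: 0.4}
    source_2 = {frozenset({"low"}): 0.5, either: 0.5}
    for subset, mass in combine(source_1, source_2).items():
        print(sorted(subset), round(mass, 4))
```

Mass left on the full set of outcomes represents ignorance rather than evidence for either outcome, which is the main way this calculus differs from a single probability distribution.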
Specification