System and method for improving data coverage in modeling systems
First Claim
1. A method for modifying data coverage in a modeling computer system, comprising:
obtaining data records relating to a plurality of input variables and one or more output parameters;
selecting a plurality of input parameters from the plurality of input variables;
evaluating, by a processor of the modeling computer system, a coverage of the data records in a modeling space by determining a data density of the data records in the modeling space, the evaluating including:
dividing the modeling space into a plurality of hyper-quadrants;
calculating a data density for each hyper-quadrant based on a number of data records in the respective hyper-quadrant;
generating a histogram based on the calculated data densities of each hyper-quadrant; and
determining a statistical difference in a data density distribution between the hyper-quadrants based on the generated histogram;
detecting a data coverage condition based on whether the determined statistical difference exceeds a first threshold; and
when a data coverage condition is detected:
modifying the coverage of the data records; and
generating a computational model indicative of interrelationships between the plurality of input parameters and the one or more output parameters based on the modified data records.
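The evaluating and detecting steps of claim 1 can be illustrated with a short Python sketch. It is not taken from the specification: splitting each dimension at its midpoint, measuring the "statistical difference" as the relative spread (standard deviation over mean) of the histogrammed densities, and the names `quadrant_densities`, `coverage_condition`, `threshold`, and `bins` are all assumptions made for illustration.

```python
import numpy as np

def quadrant_densities(X):
    """Fraction of records in each of the 2**d hyper-quadrants.

    Assumption: each dimension is split once, at the midpoint of its range.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    mid = (X.min(axis=0) + X.max(axis=0)) / 2.0
    bits = (X > mid).astype(int)            # 0 = lower half, 1 = upper half
    ids = bits @ (1 << np.arange(d))        # one integer id per hyper-quadrant
    counts = np.bincount(ids, minlength=2 ** d)
    return counts / n

def coverage_condition(X, threshold=0.5, bins=10):
    """Return (flag, spread); flag is True when the spread exceeds threshold.

    Assumption: the statistical difference is the standard deviation of the
    histogrammed densities divided by their mean.
    """
    dens = quadrant_densities(X)
    hist, edges = np.histogram(dens, bins=bins, range=(0.0, dens.max()))
    centers = (edges[:-1] + edges[1:]) / 2.0
    mean = np.average(centers, weights=hist)
    std = np.sqrt(np.average((centers - mean) ** 2, weights=hist))
    spread = float(std / mean)
    return spread > threshold, spread
```

With evenly spread records every hyper-quadrant has the same density, so the spread is zero and no condition is flagged; records piled into one hyper-quadrant produce a large spread.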
Abstract
A method for modifying data coverage in a modeling system is disclosed. The method may include obtaining data records relating to a plurality of input variables and one or more output parameters and selecting a plurality of input parameters from the plurality of input variables. The method may further include evaluating a coverage of the data records in a modeling space and modifying the coverage of the data records, if a data coverage condition is detected. The method may also include generating a computational model indicative of interrelationships between the plurality of input parameters and the one or more output parameters based on the data records.
20 Claims
10. A computer-based modeling system, comprising:
a database containing data records related to a plurality of input variables and one or more output parameters; and
a processor configured to:
obtain data records relating to the plurality of input variables and one or more output parameters;
select a plurality of input parameters from the plurality of input variables;
evaluate a coverage of the data records in a modeling space by determining a data density of the data records in the modeling space, including:
dividing the modeling space into a plurality of hyper-quadrants; and
calculating the data density for each hyper-quadrant based on a number of data records in the respective hyper-quadrant;
detecting a data coverage condition based on the determined data density of the data records in the modeling space; and
when a data coverage condition is detected:
modify the coverage of the data records; and
generate a computational model indicative of interrelationships between the plurality of input parameters and the one or more output parameters based on the modified data records.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A non-transitory computer-readable storage medium having stored thereon instructions which, when executed by a computer, cause the computer to perform a method for modifying data coverage in a modeling system, the method including:
obtaining data records relating to a plurality of input variables and one or more output parameters;
selecting a plurality of input parameters from the plurality of input variables;
evaluating a coverage of the data records in a modeling space by determining a data density of the data records in the modeling space, including:
dividing the modeling space into a plurality of hyper-quadrants; and
calculating the data density for each hyper-quadrant based on a number of data records in the respective hyper-quadrant;
detecting a data coverage condition based on the determined data density of the data records in the modeling space; and
when a data coverage condition is detected:
prompting a user of the modeling system to select from one of a plurality of methods for modifying the coverage of the data records;
receiving a selection from the user of one of the plurality of methods for modifying the coverage of the data records;
modifying the coverage of the data records according to the method selected by the user; and
generating a computational model indicative of interrelationships between a plurality of input parameters and the one or more output parameters based on the modified data records.
Dependent claims: 18, 19
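The prompt-and-select steps of claim 17 amount to dispatching on a user choice. A minimal sketch follows, assuming the available methods are held in a dictionary of callables; `prompt_user`, `modify_with_user_choice`, and the `choose` parameter are hypothetical names, and the claim does not say how the prompt is presented.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Records = List[Sequence[float]]
Method = Callable[[Records], Records]

def prompt_user(options: List[str]) -> str:
    """Ask on the console until the user types one of the option names."""
    while True:
        answer = input(f"Select a method {options}: ").strip()
        if answer in options:
            return answer

def modify_with_user_choice(
    records: Records,
    methods: Dict[str, Method],
    choose: Callable[[List[str]], str] = prompt_user,
) -> Tuple[str, Records]:
    """Prompt for a method name, receive the selection, and apply it.

    `choose` is injectable so a GUI or a test can stand in for the console.
    """
    name = choose(sorted(methods))
    if name not in methods:
        raise ValueError(f"unknown method: {name!r}")
    return name, methods[name](list(records))
```

Passing the prompt as a parameter keeps the modification logic independent of the user interface; the dictionary values would be whatever coverage-modifying routines the system offers.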
20. A method for modifying data coverage in a modeling computer system, comprising:
obtaining data records relating to a plurality of input variables and one or more output parameters;
dividing the data records into a plurality of hyper-quadrants in a modeling space;
determining, by a processor of the modeling computer system, data densities of the data records in the hyper-quadrants;
determining whether a data density of the data records in at least one of the hyper-quadrants is below a threshold; and
when a data density of the data records in at least one of the hyper-quadrants is below the threshold:
prompting a user of the modeling computer system to select a method to modify the data densities of the data records;
modifying the data densities of the data records, based on the method selected by the user, to increase the data density of the data records in the at least one hyper-quadrant or to decrease the data densities of the data records in at least one of the remaining hyper-quadrants; and
generating a computational model indicative of interrelationships between the plurality of input parameters and the one or more output parameters based on the modified data records.
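The threshold test and the two modification directions in claim 20 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: hyper-quadrants are formed by a midpoint split per dimension, "decrease" randomly down-samples every occupied hyper-quadrant to the size of the smallest one, and "increase" replicates existing records up to the size of the largest one. The names `quadrant_ids`, `sparse_quadrants`, `rebalance`, and `min_density` are hypothetical.

```python
import numpy as np

def quadrant_ids(X):
    """Integer hyper-quadrant id per record (midpoint split per dimension)."""
    X = np.asarray(X, dtype=float)
    mid = (X.min(axis=0) + X.max(axis=0)) / 2.0
    return (X > mid).astype(int) @ (1 << np.arange(X.shape[1]))

def sparse_quadrants(X, min_density):
    """Ids of hyper-quadrants whose share of the records is below min_density."""
    X = np.asarray(X, dtype=float)
    counts = np.bincount(quadrant_ids(X), minlength=2 ** X.shape[1])
    return np.flatnonzero(counts / len(X) < min_density)

def rebalance(X, min_density, method="decrease", seed=0):
    """Even out occupied hyper-quadrants if any quadrant is below min_density."""
    X = np.asarray(X, dtype=float)
    if sparse_quadrants(X, min_density).size == 0:
        return X                                  # coverage already acceptable
    ids = quadrant_ids(X)
    counts = np.bincount(ids, minlength=2 ** X.shape[1])
    occupied = counts[counts > 0]
    target = occupied.min() if method == "decrease" else occupied.max()
    rng = np.random.default_rng(seed)
    keep = []
    for q in np.flatnonzero(counts):
        rows = np.flatnonzero(ids == q)
        if method == "decrease":
            rows = rng.choice(rows, size=target, replace=False)
        else:                                     # "increase": replicate records
            extra = rng.choice(rows, size=target - len(rows), replace=True)
            rows = np.concatenate([rows, extra])
        keep.append(rows)
    return X[np.sort(np.concatenate(keep))]
```

Replication cannot fill a hyper-quadrant that holds no records at all, and down-sampling may shift the midpoints if it removes extreme records; a fuller treatment would need to handle both cases.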
Specification