Model optimization system using variable scoring
First Claim
1. A model optimization system to determine quality of variables for model generation, the system comprising:
data storage to store data including input variables, quality metrics for the input variables, and weights for the quality metrics, wherein the quality metrics quantify a sufficiency of the input variables in generating an accurate model, and wherein the data is provided for a plurality of regions;
a processor to execute:
a scoring module to:
determine a quality metric score for each quality metric based on measurements for the quality metric; and
determine an input variable score for each input variable based on the quality metric scores and the quality metric weights;
determine a total score for each region of the plurality of regions based on the input variable scores, wherein to determine the total score for each region, the scoring module is to:
determine categories for the input variables, wherein each category is associated with a type of input variable;
determine category weights for each category;
determine a category score for each category based on the input variable scores for the input variables associated with each category; and
determine the total score for each region based on the category scores and the category weights; and
an optimizer to determine whether at least one of the input variables for a region of the plurality of regions is to be modified based on the total score for the region, and in response to a determination that at least one of the input variables for a region of the plurality of regions is to be modified, determine whether the total score for the region is to be improved by applying a modified input variable having a different quality metric.
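The scoring hierarchy recited in the claim (quality metric scores, then input variable scores, then category scores, then a total score per region) can be illustrated with a minimal sketch. The claim only says each score is determined "based on" the level below it, so the weighted-average aggregation used here is an assumption. The metric names (`granularity`, `recency`), variable names, categories, and all numeric values are hypothetical.

```python
from collections import defaultdict


def weighted_average(scores, weights):
    """Weighted mean of `scores`, using the weight stored under each score's key."""
    total_weight = sum(weights[key] for key in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(scores[key] * weights[key] for key in scores) / total_weight


def input_variable_score(metric_scores, metric_weights):
    """Combine one variable's quality metric scores using the metric weights."""
    return weighted_average(metric_scores, metric_weights)


def region_total_score(variables, metric_weights, category_weights):
    """Total score for one region.

    `variables` maps a variable name to a dict with a "category" and its
    per-metric "metric_scores". Averaging variable scores within a category
    is an assumption of this sketch.
    """
    by_category = defaultdict(list)
    for variable in variables.values():
        score = input_variable_score(variable["metric_scores"], metric_weights)
        by_category[variable["category"]].append(score)
    category_scores = {
        category: sum(scores) / len(scores)
        for category, scores in by_category.items()
    }
    return weighted_average(category_scores, category_weights)


# Hypothetical data for a single region.
metric_weights = {"granularity": 3, "recency": 1}
category_weights = {"media": 1, "sales": 3}
variables = {
    "tv_spend": {"category": "media",
                 "metric_scores": {"granularity": 4, "recency": 2}},
    "print_spend": {"category": "media",
                    "metric_scores": {"granularity": 2, "recency": 2}},
    "unit_sales": {"category": "sales",
                   "metric_scores": {"granularity": 5, "recency": 1}},
}

total = region_total_score(variables, metric_weights, category_weights)
print(total)  # 3.6875
```

Running the same function once per region would yield the per-region totals that the optimizer then inspects.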
Abstract
A model optimization system is configured to determine quality of variables for model generation. A data storage stores input variables, quality metrics for the input variables, and weights for the quality metrics. The quality metrics describe sufficiency of data for the input variables and the data is provided for a plurality of regions. A scoring module determines a score for each region based on the input variables and the weighted quality metrics. An optimizer determines whether at least one of the input variables for a region is to be modified based on the scores, and determines whether the total score for the region is operable to be improved using a modified input variable.
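The optimizer step described in the abstract can be sketched as two checks: does the region's total score indicate that a variable should be modified, and would a modified variable raise that total? The abstract does not say how either determination is made. The threshold comparison, the averaging within categories, and every name and number below are assumptions made for illustration.

```python
def total_score(variable_scores, categories, category_weights):
    """Weighted average of per-category mean variable scores (assumed form)."""
    per_category = {}
    for name, score in variable_scores.items():
        per_category.setdefault(categories[name], []).append(score)
    weight_sum = sum(category_weights[c] for c in per_category)
    weighted = sum(
        category_weights[c] * sum(scores) / len(scores)
        for c, scores in per_category.items()
    )
    return weighted / weight_sum


def needs_modification(score, threshold):
    """Assumed rule: a region below the threshold is a candidate for changes."""
    return score < threshold


def evaluate_replacement(variable_scores, categories, category_weights,
                         name, new_score):
    """Return (improves, new_total) if `name` is swapped for a modified
    variable whose input variable score is `new_score`."""
    baseline = total_score(variable_scores, categories, category_weights)
    trial = dict(variable_scores)  # copy so the original scores are untouched
    trial[name] = new_score
    new_total = total_score(trial, categories, category_weights)
    return new_total > baseline, new_total


# Hypothetical region: print_spend has the weakest input variable score.
scores = {"tv_spend": 3.5, "print_spend": 2.0, "unit_sales": 4.0}
categories = {"tv_spend": "media", "print_spend": "media", "unit_sales": "sales"}
category_weights = {"media": 1, "sales": 3}

baseline = total_score(scores, categories, category_weights)
if needs_modification(baseline, threshold=4.0):
    improves, new_total = evaluate_replacement(
        scores, categories, category_weights, "print_spend", 4.0
    )
    print(baseline, new_total, improves)  # 3.6875 3.9375 True
```

In this sketch the modified variable stands for the same input supplied with a different quality metric, for example finer-grained data, which is why only its score changes.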
17 Claims
1. A model optimization system to determine quality of variables for model generation, the system comprising:
data storage to store data including input variables, quality metrics for the input variables, and weights for the quality metrics, wherein the quality metrics quantify a sufficiency of the input variables in generating an accurate model, and wherein the data is provided for a plurality of regions;
a processor to execute:
a scoring module to:
determine a quality metric score for each quality metric based on measurements for the quality metric; and
determine an input variable score for each input variable based on the quality metric scores and the quality metric weights;
determine a total score for each region of the plurality of regions based on the input variable scores, wherein to determine the total score for each region, the scoring module is to:
determine categories for the input variables, wherein each category is associated with a type of input variable;
determine category weights for each category;
determine a category score for each category based on the input variable scores for the input variables associated with each category; and
determine the total score for each region based on the category scores and the category weights; and
an optimizer to determine whether at least one of the input variables for a region of the plurality of regions is to be modified based on the total score for the region, and in response to a determination that at least one of the input variables for a region of the plurality of regions is to be modified, determine whether the total score for the region is to be improved by applying a modified input variable having a different quality metric.
Dependent claims: 2, 3, 4
5. A method for determining quality of data for modeling, the method comprising:
identifying, by a processor executing computer instructions, input variables to estimate a dependent variable, wherein the input variables are stored in a data storage;
determining quality metrics that quantify a sufficiency of the input variables in generating an accurate model, wherein the data is provided for a plurality of regions;
determining a quality metric score for each quality metric based on measurements for the quality metric;
determining an input variable score for each input variable based on the quality metric scores and quality metric weights for the quality metrics;
determining a total score for each region of the plurality of regions based on the input variable scores, wherein determining the total score for each region comprises:
determining categories for the input variables, wherein each category is associated with a type of input variable;
determining category weights for each category;
determining a category score for each category based on the input variable scores for the input variables associated with each category; and
determining the total score for each region based on the category scores and the category weights;
determining whether at least one of the input variables for a region of the plurality of regions is to be modified based on the total score for the region; and
upon determining that at least one of the input variables for a region of the plurality of regions is to be modified, determining whether the total score for the region is to be improved by applying a modified input variable having a different quality metric.
Dependent claims: 6, 7, 8, 9, 10, 11
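One way to read the scoring steps of the method is as nested weighted averages. The claim does not fix an aggregation formula, so the forms below are an assumption, and all symbols are introduced here for illustration: q_{v,m} is the quality metric score of input variable v on metric m, w_m is the quality metric weight, V_k is the set of input variables in category k, and u_k is the category weight.

```latex
% Input variable score: weighted average of the quality metric scores
s_v = \frac{\sum_{m} w_m \, q_{v,m}}{\sum_{m} w_m}

% Category score: mean of the input variable scores in category k
C_k = \frac{1}{|V_k|} \sum_{v \in V_k} s_v

% Total score for region r: weighted average of the category scores
T_r = \frac{\sum_{k} u_k \, C_k}{\sum_{k} u_k}

% Worked example for one variable with two metrics
% (scores 4 and 2, weights 3 and 1):
s_v = \frac{3 \cdot 4 + 1 \cdot 2}{3 + 1} = 3.5
```

Under this reading, a modified input variable with a better quality metric raises s_v, which raises C_k for its category and, in proportion to u_k, the region total T_r.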
12. A non-transitory computer readable medium storing computer readable instructions that when executed by a processor perform a method for determining quality of data for modeling, the method comprising:
identifying input variables to estimate a dependent variable;
determining quality metrics that quantify a sufficiency of the input variables in generating an accurate model, wherein the data is provided for a plurality of regions;
determining a quality metric score for each quality metric based on measurements for the quality metric;
determining an input variable score for each input variable based on the quality metric scores and quality metric weights for the quality metrics;
determining a total score for each region of the plurality of regions based on the input variable scores, wherein determining the total score for each region comprises:
determining categories for the input variables, wherein each category is associated with a type of input variable;
determining category weights for each category;
determining a category score for each category based on the input variable scores for the input variables associated with each category; and
determining the total score for each region based on the category scores and the category weights;
determining whether at least one of the input variables for a region of the plurality of regions is to be modified based on the total score for the region; and
upon determining that at least one of the input variables for a region of the plurality of regions is to be modified, determining whether the total score for the region is to be improved by applying a modified input variable having a different quality metric.
Dependent claims: 13, 14, 15, 16, 17
Specification