Dimension grouping and reduction for model generation, testing, and documentation
First Claim
1. A non-transitory computer readable medium including executable instructions, the instructions being executable by a processor to perform a method, the method comprising:
receiving analysis data and output indicator, the output indicator indicating a subset of data of the analysis data, the analysis data including multiple dimensions associated with data points;
receiving a lens function identifier, a metric function identifier, and a resolution function identifier;
mapping data points from a transposition of the analysis data, to a reference space utilizing a lens function identified by the lens function identifier, the transposition of the analysis data transforming the analysis data such that the features are data points, the mapping of data points being performed by applying the lens functions across dimensions for each data point of the transposition of the analysis data;
generating a cover of the reference space using a resolution function identified by the resolution identifier;
clustering the data points mapped to the reference space using the cover and a metric function identified by the metric function identifier to determine each node of a plurality of nodes of a graph, each node including at least one data point;
for each node, identifying data points that are members of that node to identify similar features;
grouping features that are members of the same node as being similar to each other;
for each feature, determining correlation with at least some of the subset of data of the analysis data and generate a correlation score;
displaying at least a subset of groups that include features that are similar to each other and display the correlation score for each displayed feature;
receiving a selection of a subset of features from the at least the subset of groups;
generating a set of models, each model including at least one of the selection of the subset of features;
determining fit of each generated model to the subset of data of the analysis data and generate a model score; and
generating a report recommending the model with the highest score.
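The mapping, cover, clustering, and grouping steps recited above can be illustrated with a small sketch. It is not the patented implementation. It assumes a lens that averages each transposed data point across its dimensions, a cover of equal-width overlapping intervals, a Euclidean metric, and single-linkage clustering with a fixed distance threshold. The names `group_features`, `n_intervals`, `overlap`, and `eps` are hypothetical and do not appear in the claims.

```python
import numpy as np


def _single_linkage(points, members, eps):
    """Split `members` into clusters whose points chain together within `eps`."""
    unvisited = set(members.tolist())
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            # Metric (assumed): Euclidean distance between transposed points.
            near = [q for q in unvisited
                    if np.linalg.norm(points[p] - points[q]) <= eps]
            for q in near:
                unvisited.remove(q)
                cluster.add(q)
                frontier.append(q)
        clusters.append(sorted(cluster))
    return clusters


def group_features(data, n_intervals=4, overlap=0.25, eps=1.0):
    """Return nodes, each a list of feature (column) indices grouped as similar."""
    points = data.T  # transposition: each feature becomes a data point
    # Lens (assumed): mean of each transposed point across its dimensions.
    lens = points.mean(axis=1)
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    if width == 0:
        width = 1.0  # all lens values equal; any positive width works
    pad = overlap * width
    nodes = []
    # Cover (assumed): equal-width intervals over the lens range, padded to overlap.
    for i in range(n_intervals):
        a = lo + i * width - pad
        b = lo + (i + 1) * width + pad
        members = np.flatnonzero((lens >= a) & (lens <= b))
        # Each cluster found inside a cover interval becomes one node of the graph.
        nodes.extend(_single_linkage(points, members, eps))
    return nodes
```

Features whose indices land in the same returned node are treated as similar to each other. Because the intervals overlap, a feature may belong to more than one node.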
Abstract
An example method includes receiving analysis data and output indicator, mapping data points from a transposition of the analysis data to a reference space, generating a cover of the reference space, clustering the data points mapped to the reference space using the cover and a metric function to determine each node of a plurality of nodes, for each node, identifying data points that are members to identify similar features, grouping features as being similar to each other based on node(s), for each feature, determining correlation with at least some data associated with the output indicator and generate a correlation score, displaying at least groupings of similar features and displaying the correlation scores, receiving a selection of features, generating a set of models based on selection, determining fit of each generated model to output data and generate a model score, and generating a model recommendation report.
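The correlation scoring and model recommendation steps summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the method as implemented by the applicant: the correlation score is taken to be the absolute Pearson correlation, each model is an ordinary least-squares fit on one subset of the selected features, and the model score is adjusted R². The function names `correlation_scores` and `recommend_model` are hypothetical.

```python
import itertools

import numpy as np


def correlation_scores(features, target):
    """Absolute Pearson correlation of each feature column with the output data."""
    return [float(abs(np.corrcoef(features[:, j], target)[0, 1]))
            for j in range(features.shape[1])]


def recommend_model(features, target, selected):
    """Fit one linear model per non-empty subset of `selected`; return the best.

    Returns (score, subset), where score is adjusted R^2 (an assumed choice).
    """
    n = len(target)
    centered = target - target.mean()
    ss_tot = centered @ centered
    best = None
    for size in range(1, len(selected) + 1):
        for subset in itertools.combinations(selected, size):
            # Design matrix: intercept column plus the chosen feature columns.
            X = np.column_stack([np.ones(n), features[:, list(subset)]])
            coef, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ coef
            r2 = 1.0 - (resid @ resid) / ss_tot
            # Adjusted R^2 penalizes models that use more features.
            score = 1.0 - (1.0 - r2) * (n - 1) / (n - size - 1)
            if best is None or score > best[0]:
                best = (float(score), subset)
    return best
```

Adjusted R² is used here rather than plain R² because plain R² never decreases when a feature is added, which would always favor the largest model. The sketch assumes more observations than selected features plus one.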
19 Claims
1. A non-transitory computer readable medium including executable instructions, the instructions being executable by a processor to perform a method, the method comprising:
receiving analysis data and output indicator, the output indicator indicating a subset of data of the analysis data, the analysis data including multiple dimensions associated with data points;
receiving a lens function identifier, a metric function identifier, and a resolution function identifier;
mapping data points from a transposition of the analysis data, to a reference space utilizing a lens function identified by the lens function identifier, the transposition of the analysis data transforming the analysis data such that the features are data points, the mapping of data points being performed by applying the lens functions across dimensions for each data point of the transposition of the analysis data;
generating a cover of the reference space using a resolution function identified by the resolution identifier;
clustering the data points mapped to the reference space using the cover and a metric function identified by the metric function identifier to determine each node of a plurality of nodes of a graph, each node including at least one data point;
for each node, identifying data points that are members of that node to identify similar features;
grouping features that are members of the same node as being similar to each other;
for each feature, determining correlation with at least some of the subset of data of the analysis data and generate a correlation score;
displaying at least a subset of groups that include features that are similar to each other and display the correlation score for each displayed feature;
receiving a selection of a subset of features from the at least the subset of groups;
generating a set of models, each model including at least one of the selection of the subset of features;
determining fit of each generated model to the subset of data of the analysis data and generate a model score; and
generating a report recommending the model with the highest score.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method comprising:
receiving analysis data and output indicator, the output indicator indicating a subset of data of the analysis data, the analysis data including multiple dimensions associated with data points;
receiving a lens function identifier, a metric function identifier, and a resolution function identifier;
mapping data points from a transposition of the analysis data, to a reference space utilizing a lens function identified by the lens function identifier, the transposition of the analysis data transforming the analysis data such that the features are data points, the mapping of data points being performed by applying the lens functions across dimensions for each data point of the transposition of the analysis data;
generating a cover of the reference space using a resolution function identified by the resolution identifier;
clustering the data points mapped to the reference space using the cover and a metric function identified by the metric function identifier to determine each node of a plurality of nodes of a graph, each node including at least one data point;
for each node, identifying data points that are members of that node to identify similar features;
grouping features that are members of the same node as being similar to each other;
for each feature, determining correlation with at least some of the subset of data of the analysis data and generate a correlation score;
displaying at least a subset of groups that include features that are similar to each other and display the correlation score for each displayed feature;
receiving a selection of a subset of features from the at least the subset of groups;
generating a set of models, each model including at least one of the selection of the subset of features;
determining fit of each generated model to the subset of data of the analysis data and generate a model score; and
generating a report recommending the model with the highest score.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A system comprising:
a processor;
a memory including instructions to configure the processor to:
receive analysis data and output indicator, the output indicator indicating a subset of data of the analysis data, the analysis data including multiple dimensions associated with data points;
receive a lens function identifier, a metric function identifier, and a resolution function identifier;
map data points from a transposition of the analysis data, to a reference space utilizing a lens function identified by the lens function identifier, the transposition of the analysis data transforming the analysis data such that the features are data points, the mapping of data points being performed by applying the lens functions across dimensions for each data point of the transposition of the analysis data;
generate a cover of the reference space using a resolution function identified by the resolution identifier;
cluster the data points mapped to the reference space using the cover and a metric function identified by the metric function identifier to determine each node of a plurality of nodes of a graph, each node including at least one data point;
for each node, identify data points that are members of that node to identify similar features;
group features that are members of the same node as being similar to each other;
for each feature, determine correlation with at least some of the subset of data of the analysis data and generate a correlation score;
display at least a subset of groups that include features that are similar to each other and display the correlation score for each displayed feature;
receive a selection of a subset of features from the at least the subset of groups;
generate a set of models, each model including at least one of the selection of the subset of features;
determine fit of each generated model to the subset of data of the analysis data and generate a model score; and
generate a report recommending the model with the highest score.
Specification