Multidimensional analysis tool for high dimensional data
Abstract
Described is a technology by which high-dimensional data may be efficiently analyzed, including by filtering, grouping, aggregating and/or sorting operations, to provide an analysis result. For efficiency in the analysis, an inverted index may be built (e.g., as part of filtering) and/or a hash structure may be built (e.g., as part of grouping). Analysis parameters specify dimensions, on which union and/or intersection operations are performed to provide a final dataset. The analysis tool provides a user interface for inputting analysis parameters and outputting information corresponding to an analysis result. The analysis tool may sort the information corresponding to the analysis result, e.g., to output the topmost or bottommost results.
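The set operations and the topmost/bottommost output mentioned in the abstract can be illustrated with a minimal Python sketch. The index contents, the function name `final_row_ids`, and the convention used here (union across the values listed for one dimension, intersection across dimensions) are assumptions made for illustration; the abstract does not fix which operation applies where.

```python
import heapq

# Hypothetical per-dimension inverted indexes: value -> set of master-dataset row ids.
index = {
    "country": {"US": {0, 1, 4}, "DE": {2, 5}, "JP": {3}},
    "browser": {"A": {0, 2, 3}, "B": {1, 4, 5}},
}

def final_row_ids(index, params):
    """params maps a dimension name to the values accepted for that dimension."""
    result = None
    for dim, values in params.items():
        # Union: a row qualifies for this dimension if it matches any listed value.
        matched = set().union(*(index[dim].get(v, set()) for v in values))
        # Intersection: a row must qualify for every specified dimension.
        result = matched if result is None else result & matched
    return result if result is not None else set()

rows = final_row_ids(index, {"country": ["US", "DE"], "browser": ["A"]})

# Topmost / bottommost results over some hypothetical aggregated scores.
scores = {"US": 120, "DE": 45, "JP": 80, "FR": 10}
top2 = heapq.nlargest(2, scores.items(), key=lambda kv: kv[1])
bottom2 = heapq.nsmallest(2, scores.items(), key=lambda kv: kv[1])
```

Working on sets of row ids keeps the filtering step independent of row width, which is one plausible reason to prefer an index over a full scan when the dataset has many dimensions.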
17 Claims
1. A method performed by one or more computing devices, the method comprising:
- accessing a master dataset, the master dataset comprising a plurality of dimensions and a plurality of rows, each dimension comprising a category or type of data, each row comprised of values corresponding to the dimensions, the rows forming columns of values that correspond to the dimensions, the master dataset further comprising a dimension table for a given dimension and an inverted index for the dimension table, the dimension table comprising rows, each row comprising a value or range of values corresponding to the category or type of data of the given dimension, the inverted index identifying, for each row of the dimension table, a corresponding row in the master dataset that contains a value of the dimension that matches the value or range of values of the row of the dimension table;
- obtaining a final dataset from the master dataset based on analysis parameters, the analysis parameters including at least one parameter that identifies the given dimension and defines a set of values for the given dimension, wherein the final dataset is comprised of rows of master dataset;
- finding the rows of the final dataset by using the inverted index to find rows of the master dataset that have values for the given dimension that match the set of values defined by the analysis parameters;
- performing at least one of: a grouping operation, an aggregating operation or a sorting operation, or any combination of a grouping operation, an aggregating operation or a sorting operation to configure a result set from the final dataset; and
- outputting the result set.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
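The steps of claim 1 can be sketched end to end in Python. This is an illustrative reading only, not the patented implementation: the sample data, the names `build_inverted_index` and `analyze`, and the choice of a sum as the aggregate are all assumptions. The keys of the index stand in for the rows of the dimension table.

```python
from collections import defaultdict

# Hypothetical master dataset: each row holds one value per dimension.
MASTER = [
    {"region": "east", "product": "pen", "sales": 10},
    {"region": "west", "product": "pen", "sales": 7},
    {"region": "east", "product": "ink", "sales": 4},
    {"region": "north", "product": "pen", "sales": 3},
    {"region": "west", "product": "ink", "sales": 9},
]

def build_inverted_index(rows, dimension):
    """Map each distinct value of `dimension` to the ids of rows containing it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[dimension]].append(row_id)
    return index

def analyze(rows, index, wanted_values, group_by, measure):
    # Filter: look up row ids in the index instead of scanning every row.
    row_ids = sorted({rid for v in wanted_values for rid in index.get(v, [])})
    final_dataset = [rows[rid] for rid in row_ids]
    # Group and aggregate: sum the measure per group key.
    totals = defaultdict(int)
    for row in final_dataset:
        totals[row[group_by]] += row[measure]
    # Sort: largest aggregate first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

region_index = build_inverted_index(MASTER, "region")
result = analyze(MASTER, region_index, {"east", "west"}, "product", "sales")
print(result)  # [('pen', 17), ('ink', 13)]
```

The index is built once per dimension and reused across queries, so each new set of analysis parameters costs only lookups plus work proportional to the size of the final dataset.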
9. At least one computer-readable storage medium having computer-executable instructions, which when executed by a computer perform steps, the steps comprising:
- accessing a master dataset, the master dataset comprising a plurality of dimensions and a plurality of rows, each dimension comprising a dimension value identifying a category or type of data, each row comprised of values corresponding to the dimensions, the rows forming columns of values that correspond to the dimensions;
- the master dataset further comprising a dimension table for a given dimension of the master dataset, comprised of rows, each row comprising a definition of a range of values corresponding to the category or type of data of the dimension table's dimension;
- an inverted index comprising, for each row of the dimension table, one or more identifiers of rows of the master dataset, wherein each such identified master dataset row has a value for the given dimension that falls within the range of values defined by a corresponding row of the dimension table;
- receiving analysis parameters interactively inputted via a user interface;
- performing a filtering operation on the master dataset by using the inverted index to find a set of identifiers of rows in the master dataset and a plurality of associated dimension values for each identifier;
- grouping the dimension values via a hash operation into grouped dimension values, in which identifiers having identical dimension values correspond to the same dimension group, and using the grouped dimension values to output a result set.
Dependent claims: 10, 11, 12, 13, 14, 15
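Claim 9 differs from claim 1 in two respects: the dimension table rows define ranges, and grouping is done by hashing. A rough Python sketch of both follows. The data, the half-open range convention, and the names `filter_rows` and `group_by_values` are hypothetical. The sketch also assumes the ranges are contiguous and cover every value in the dataset, so upper bounds are not checked.

```python
import bisect
from collections import defaultdict

# Hypothetical master dataset rows: (age, country, plan).
MASTER = [
    (23, "US", "free"),
    (37, "DE", "pro"),
    (41, "US", "free"),
    (19, "US", "free"),
    (65, "DE", "pro"),
]

# Hypothetical dimension table for "age": each row defines a half-open range [low, high).
AGE_TABLE = [(0, 20), (20, 40), (40, 60), (60, 200)]
LOWS = [low for low, _ in AGE_TABLE]

# Inverted index: dimension-table row number -> ids of master rows whose age is in that range.
inverted = defaultdict(list)
for row_id, (age, _, _) in enumerate(MASTER):
    # Assumes contiguous ranges that cover every age, so only lower bounds are searched.
    bucket = bisect.bisect_right(LOWS, age) - 1
    inverted[bucket].append(row_id)

def filter_rows(buckets):
    """Return {row id: (country, plan)} for rows in the selected age ranges."""
    return {rid: MASTER[rid][1:] for b in buckets for rid in inverted.get(b, [])}

def group_by_values(selected):
    """Hash-group row ids; identical dimension-value tuples land in the same group."""
    groups = defaultdict(list)
    for rid, dims in selected.items():
        groups[dims].append(rid)
    return dict(groups)

selected = filter_rows([0, 1, 2])  # the three ranges covering ages 0 to 59
groups = group_by_values(selected)
```

Using the tuple of dimension values as a dictionary key lets the hash table do the equality test, so grouping takes one pass over the filtered identifiers.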
16. At least one computer-readable storage medium having computer-executable instructions, which when executed by a computer perform steps, the steps comprising:
- accessing a master dataset, the master dataset comprising a plurality of dimensions and a plurality of rows, each dimension comprising a category or type of data, each row comprised of values corresponding to the dimensions, the rows forming columns of values that correspond to the dimensions, the master dataset further comprising a dimension table for a given dimension and an inverted index for the dimension table, the dimension table comprising rows, each row comprising a value or range of values corresponding to the category or type of data of the given dimension, the inverted index identifying, for each row of the dimension table, a corresponding row in the master dataset that contains a value of the dimension that matches the value or range of values of the row of the dimension table;
- obtaining a final dataset from the master dataset based on analysis parameters, the analysis parameters including at least one parameter that identifies the given dimension and defines a set of values for the given dimension, wherein the final dataset is comprised of rows of master dataset;
- finding the rows of the final dataset by using the inverted index to find rows of the master dataset that have values for the given dimension that match the set of values defined by the analysis parameters;
- performing at least one of: a grouping operation, an aggregating operation or a sorting operation, or any combination of a grouping operation, an aggregating operation or a sorting operation to configure a result set from the final dataset; and
- outputting the result set.
Dependent claims: 17
Specification