Guided data exploration
Abstract
A system for exploring data receives the data from a database and indexes the data in a server. The system displays one or more selectable datasets from the indexed data, where the selectable datasets include a plurality of attributes. The system receives a selection of one of the plurality of attributes. The system then sorts the one or more attributes by level of interestingness relative to the selected attribute, and displays the sorted attributes.
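The abstract does not say how "interestingness" is scored. The claims require only a bivariate entropy metric that is a function of joint probabilities and of the two attributes' marginal probabilities. One well-known quantity with exactly those ingredients is mutual information, shown here as an illustration under that assumption rather than as the metric the patent necessarily uses. The symbols S (selected attribute), G (given attribute), and p are introduced for this example only.

```latex
I(S; G) = \sum_{s \in S} \sum_{g \in G} p(s, g) \, \log_2 \frac{p(s, g)}{p(s) \, p(g)}
```

Here p(s, g) is the fraction of data records in the selected dataset where the selected attribute takes outcome s and the given attribute takes outcome g, and p(s) and p(g) are the corresponding single-attribute fractions. The value is zero when the two attributes are statistically independent and grows as one attribute becomes more predictable from the other.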
20 Claims
1. A method of exploring data, the method comprising:
receiving the data from a database;
indexing the data in a server to generate a plurality of database datasets;
displaying one or more selectable database datasets of the plurality of database datasets from the indexed data, each of the selectable database datasets comprising a plurality of database data records and a plurality of attributes that correspond to columns of the database dataset, wherein data records of a given database dataset comprise values for one or more attributes of the given database dataset;
receiving a selection of one of the selectable database datasets that corresponds to a first plurality of attributes;
receiving a selection of one of the first plurality of attributes;
in response to a selection of the attribute, determining bivariate entropy metrics for a remaining set of the first plurality of attributes relative to the selected attribute, wherein the bivariate entropy metric for a given one of the remaining set of the first plurality of attributes is a function of joint probabilities calculated over different outcomes for the selected attribute and different outcomes for the given attribute within the data records of the selected data set, given attribute probabilities calculated over different outcomes of the given attribute within the data records of the selected data set, and selected attribute probabilities calculated over different outcomes of the selected attribute within the data records of the selected data set;
sorting the remaining set of the first plurality of attributes that correspond to the selected dataset based on the determined bivariate entropy metrics; and
displaying the sorted attributes of the selected dataset as selectable attributes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
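The determining and sorting steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes mutual information as the bivariate entropy metric, assumes records are plain dictionaries keyed by attribute name, and assumes the sort is descending. The names `mutual_information` and `rank_attributes` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(records, selected, given):
    """Estimate I(selected; given) in bits from empirical frequencies.

    Assumption: mutual information stands in for the claimed
    "bivariate entropy metric"; the claim does not name a formula.
    """
    # Use only records that have a value for both attributes.
    pairs = [
        (r.get(selected), r.get(given))
        for r in records
        if r.get(selected) is not None and r.get(given) is not None
    ]
    n = len(pairs)
    if n == 0:
        return 0.0

    joint = Counter(pairs)                       # joint outcome counts
    sel_counts = Counter(s for s, _ in pairs)    # selected-attribute counts
    giv_counts = Counter(g for _, g in pairs)    # given-attribute counts

    score = 0.0
    for (s, g), count in joint.items():
        p_joint = count / n
        p_sel = sel_counts[s] / n
        p_giv = giv_counts[g] / n
        score += p_joint * log2(p_joint / (p_sel * p_giv))
    return score


def rank_attributes(records, attributes, selected):
    """Sort the remaining attributes by their metric against `selected`.

    Assumption: highest score first; the claim says only "based on".
    """
    remaining = [a for a in attributes if a != selected]
    scores = {a: mutual_information(records, selected, a) for a in remaining}
    return sorted(remaining, key=lambda a: scores[a], reverse=True)


# Hypothetical dataset: "channel" tracks "region" exactly, "size" does not.
records = [
    {"region": "east", "channel": "web", "size": "S"},
    {"region": "east", "channel": "web", "size": "L"},
    {"region": "west", "channel": "store", "size": "S"},
    {"region": "west", "channel": "store", "size": "L"},
]
ranked = rank_attributes(records, ["region", "channel", "size"], "region")
```

With this toy data, `channel` ranks ahead of `size` because it is fully determined by `region`, while `size` is independent of it.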
11. A non-transitory computer readable medium having instructions stored thereon that, when executed by a processor, cause the processor to provide guided data exploration, the providing comprising:
receiving the data from a database;
indexing the data in a server to generate a plurality of database datasets;
displaying one or more selectable database datasets of the plurality of database datasets from the indexed data, each of the selectable database datasets comprising a plurality of database data records and a plurality of attributes that correspond to columns of the database dataset, wherein data records of a given database dataset comprise values for one or more attributes of the given database dataset;
receiving a selection of one of the selectable database datasets that corresponds to a first plurality of attributes;
receiving a selection of one of the first plurality of attributes;
in response to a selection of the attribute, determining bivariate entropy metrics for the first plurality of attributes relative to the selected attribute, wherein the bivariate entropy metric for a given one of the remaining set of the first plurality of attributes is a function of joint probabilities calculated over different outcomes for the selected attribute and different outcomes for the given attribute within the data records of the selected data set, given attribute probabilities calculated over different outcomes of the given attribute within the data records of the selected data set, and selected attribute probabilities calculated over different outcomes of the selected attribute within the data records of the selected data set;
sorting the remaining set of the first plurality of attributes that correspond to the selected dataset based on the determined bivariate entropy metrics; and
displaying the sorted attributes of the selected dataset as selectable attributes.
Dependent claims: 12, 13, 14, 15, 16, 17
18. A guided data exploration system comprising:
a processor coupled to a storage device that stores instructions, the processor executing instructions to generate modules comprising:
an indexing module that receives data from a database and indexes the data in a server to generate a plurality of database datasets;
a display module that displays one or more selectable database datasets of the plurality of database datasets from the indexed data, each of the selectable database datasets comprising a plurality of database data records and a plurality of attributes that correspond to columns of the database dataset, that receives a selection of one of the selectable database datasets that corresponds to a first plurality of attributes and that receives a selection of one of the first plurality of attributes, wherein data records of a given database dataset comprise values for one or more attributes of the given database dataset; and
a determining module that, in response to a selection of the attribute, determines bivariate entropy metrics for a remaining set of the first plurality of attributes relative to the selected attribute, wherein the bivariate entropy metric for a given one of the remaining set of the first plurality of attributes is a function of joint probabilities calculated over different outcomes for the selected attribute and different outcomes for the given attribute within the data records of the selected data set, given attribute probabilities calculated over different outcomes of the given attribute within the data records of the selected data set, and selected attribute probabilities calculated over different outcomes of the selected attribute within the data records of the selected data set;
a sorting module that sorts the remaining set of the first plurality of attributes that correspond to the selected dataset based on the determined bivariate entropy metrics;
wherein the display module further displays the sorted attributes of the selected dataset as selectable attributes.
Dependent claims: 19, 20
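The indexing and display modules recited in claim 18 might be organized along the following lines. This is a sketch under stated assumptions only: the claim does not specify data structures, so the input shape (pairs of dataset name and record dictionary) and the class and method names `IndexingModule`, `DisplayModule`, `index`, `selectable_datasets`, and `attributes_of` are all hypothetical. The determining and sorting modules are not covered here.

```python
from collections import defaultdict


class IndexingModule:
    """Groups incoming rows into named datasets (hypothetical design)."""

    def index(self, rows):
        # Assumption: each row arrives as (dataset_name, record_dict).
        datasets = defaultdict(list)
        for name, record in rows:
            datasets[name].append(record)
        return dict(datasets)


class DisplayModule:
    """Exposes what a user could select; rendering is out of scope."""

    def selectable_datasets(self, datasets):
        # Dataset names, sorted alphabetically for a stable listing.
        return sorted(datasets)

    def attributes_of(self, datasets, name):
        # Attributes correspond to columns. A record may carry values
        # for only some of them, so collect keys across all records,
        # preserving first-seen order.
        seen = {}
        for record in datasets[name]:
            for key in record:
                seen.setdefault(key, None)
        return list(seen)


# Hypothetical rows standing in for data received from a database.
rows = [
    ("orders", {"region": "east", "channel": "web"}),
    ("orders", {"region": "west"}),
    ("customers", {"tier": "gold"}),
]
datasets = IndexingModule().index(rows)
display = DisplayModule()
names = display.selectable_datasets(datasets)
order_attributes = display.attributes_of(datasets, "orders")
```

Keeping indexing separate from display mirrors the module split in the claim: selecting a dataset returns its attribute list, from which the user then picks the attribute to rank against.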
Specification