Multi-pass data organization and automatic naming
First Claim
1. A system comprising:
- a server at least in selective communication with a client machine, the server configured to:
receive a query from the client machine;
retrieve a data set based on the query, and
organize the data set into subsets with at least a first pass and a second pass, wherein the first pass is statistic driven and the second pass is attribute driven, wherein the statistic driven first pass is selected from a set consisting essentially of organizational clustering and hierarchical clustering, and wherein the second pass is to partition a subset of the data set that results from the first pass, and
name each of the subsets, based at least in part on a property shared by at least a majority of the data units of the subset.
Abstract
A method and a system to organize a data set into groups of data subsets in multiple passes using different parameters, and to automatically name the groups, are disclosed. For example, a data set is retrieved in accordance with a search query submitted by a user. The data set is organized into clusters based on a statistic(s) of the data set. The data set is then organized into groups of data subsets based on an attribute(s) indicated by the data set. Each of the groups is automatically named based on a property shared by data units of the group. The name(s) of a group may be mined from the data units of the group, retrieved from a structure that maps to attribute values indicated by the data units of the group, etc.
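The two organizing passes described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the dictionary representation of a data unit, the `max_gap` threshold, and the choice of single-linkage clustering over one numeric statistic are all assumptions made for illustration.

```python
from collections import defaultdict


def hierarchical_clusters(units, stat, max_gap):
    """First pass (statistic driven), hypothetical helper.

    Single-linkage agglomerative clustering on one numeric statistic,
    cut at distance max_gap. In one dimension this is equivalent to
    sorting the units and splitting wherever two neighbours differ by
    more than max_gap.
    """
    ordered = sorted(units, key=lambda u: u[stat])
    clusters = []
    for unit in ordered:
        if clusters and unit[stat] - clusters[-1][-1][stat] <= max_gap:
            clusters[-1].append(unit)   # close enough: join the current cluster
        else:
            clusters.append([unit])     # gap too large: start a new cluster
    return clusters


def partition_by_attribute(cluster, attribute):
    """Second pass (attribute driven), hypothetical helper.

    Partitions one first-pass cluster by the value of an attribute.
    """
    groups = defaultdict(list)
    for unit in cluster:
        groups[unit.get(attribute)].append(unit)
    return dict(groups)


def organize(units, stat, attribute, max_gap):
    """Run the first pass, then partition each resulting cluster."""
    return [
        partition_by_attribute(cluster, attribute)
        for cluster in hierarchical_clusters(units, stat, max_gap)
    ]


# Example data units; the field names are invented for this sketch.
units = [
    {"id": 1, "price": 10, "brand": "A"},
    {"id": 2, "price": 12, "brand": "B"},
    {"id": 3, "price": 100, "brand": "A"},
]
result = organize(units, stat="price", attribute="brand", max_gap=5)
```

Here the two low-priced units land in one first-pass cluster, which the second pass then splits by brand, while the high-priced unit forms its own cluster.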
22 Claims
1. A system comprising:
a server at least in selective communication with a client machine, the server configured to:
receive a query from the client machine;
retrieve a data set based on the query, and
organize the data set into subsets with at least a first pass and a second pass, wherein the first pass is statistic driven and the second pass is attribute driven, wherein the statistic driven first pass is selected from a set consisting essentially of organizational clustering and hierarchical clustering, and wherein the second pass is to partition a subset of the data set that results from the first pass, and
name each of the subsets, based at least in part on a property shared by at least a majority of the data units of the subset.
Dependent claims: 2, 3
4. A method comprising the acts of:
receiving, at a server, a query from a client machine;
retrieving a data set based on the query, the data set containing a plurality of data units;
organizing the plurality of data units into clusters, based at least in part on one or more statistics of the plurality of data units;
organizing the organized plurality of data units into at least a first group and a second group based on at least one attribute indicated by the plurality of data units, wherein the data units of the first group share a first similarity with respect to the at least one attribute and the data units of the second group share a second similarity with respect to the at least one attribute; and
automatically naming the first group based, at least in part, on a first property shared by at least a majority of the data units of the first group and automatically naming the second group based, at least in part, on a second property shared by at least a majority of the data units of the second group.
Dependent claims: 5, 6, 7, 8, 9, 10, 11
12. A method of organizing data, the method comprising the acts of:
receiving, at a server, a query from a client machine;
retrieving a data set based on the query;
organizing the data set into groups of data subsets with at least a first pass and a second pass over the data set, wherein the first pass employs statistic driven clustering and the second pass employs attribute driven clustering;
wherein the statistic driven first pass is selected from a set consisting essentially of organizational clustering and hierarchical clustering, and wherein the second pass is to partition a subset of the data set that results from the first pass; and
automatically naming each of the groups of data subsets based, at least in part, on similarity of the data subsets in each group.
Dependent claims: 13
14. A set of instructions encoded on one or more machine-readable storage media, the set of instructions comprising:
a first sequence of instructions executable to employ statistical clustering to organize a plurality of data units; and
a second sequence of instructions executable to employ structural clustering to organize the plurality of data units organized by the first sequence of instructions into groups with respect to an attribute indicated by the data set; and
a third sequence of instructions executable to indicate one or more names for each of the groups, and to access a structure for the one or more names; and
a fourth sequence of instructions executable to generate the one or more names for each group based, at least in part, on at least one shared across at least a majority of data units within each group.
Dependent claims: 15, 16
17. An apparatus comprising:
a memory operable to host a set of data;
means for grouping the set of data into plural groups based on one or more statistics for the set of data and based on similarities among the set of data with respect to an attribute indicated by the set of data;
wherein the grouping includes at least a first pass and a second pass, wherein the first pass is selected from a set consisting essentially of organizational clustering and hierarchical clustering, and wherein the second pass is to partition a subset of the data set that results from the first pass; and
means for automatically naming the plural groups.
Dependent claims: 18, 19
20. An apparatus comprising:
a memory operable to host a plurality of data units;
a navigation module operable to retrieve a plurality of data units responsive to a query;
an organizing module coupled with the navigation module, the organizing module operable to organize the plurality of retrieved data units in accordance with a set of one or more statistics for the plurality of data units and to then organize the plurality of data units into groups in accordance with at least one attribute indicated by the plurality of data units; and
a naming module coupled with the organizing module, the naming module operable to name each of the groups, based at least in part on a property shared by at least a majority of the data units within the group.
Dependent claims: 21, 22
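The naming step recited in several of the claims above, in which a group is named from a property shared by at least a majority of its data units, can be sketched as follows. This is an illustrative reading only: the function `name_group`, the optional `name_map` lookup structure, and the interpretation of "majority" as strictly more than half of the units are assumptions, not details taken from the claims.

```python
from collections import Counter


def name_group(units, attribute, name_map=None):
    """Hypothetical helper: name a group from its majority attribute value.

    Returns None when no single value is shared by more than half of the
    units. If name_map is given, it plays the role of a structure that
    maps attribute values to display names; otherwise the value itself
    is used as the name.
    """
    counts = Counter(
        u.get(attribute) for u in units if u.get(attribute) is not None
    )
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    if count * 2 <= len(units):
        return None  # most common value is not shared by a majority
    if name_map is not None:
        return name_map.get(value, str(value))
    return str(value)


# Example group; field names and the lookup table are invented.
group = [{"brand": "acme"}, {"brand": "acme"}, {"brand": "zen"}]
mapped_name = name_group(group, "brand", {"acme": "Acme Corp"})
mined_name = name_group(group, "brand")
no_name = name_group([{"brand": "a"}, {"brand": "b"}], "brand")
```

With the lookup table the group is named from the mapped display name; without it the name is mined directly from the data units, and a group with no majority value is left unnamed.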
Specification