Method and apparatus for data clustering
Abstract
A method and apparatus are provided for clustering data inputs into groups. The first data input is initially designated as center of a first group. Each other data input is successively analyzed to identify a group whose center is sufficiently close to that data input. If such a group is identified, the input is assigned to the identified group. If no such group is identified, a new group is created and the data input is designated as the center of the new group. The analysis of data inputs is repeated until all data inputs have been assigned to groups. Optionally, thereafter for optimal performance, for each data input, the closest group center to that input is determined, and the data input is assigned to the group having that center.
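The optional final pass described in the abstract, in which every data input is reassigned to the group with the closest center, might look roughly like the sketch below. It is an illustration only, not the patented implementation. The name `reassign_to_closest`, the choice of plain numbers as data inputs, and the helper `neg_abs_diff` (a proximity where larger means closer) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def reassign_to_closest(
    inputs: Sequence[T],
    centers: Sequence[T],
    proximity: Callable[[T, T], float],
) -> Dict[int, List[T]]:
    """Assign every input to the group whose center is closest to it.

    `proximity` is assumed to return larger values for closer pairs.
    Returns a mapping from center index to the inputs in that group.
    """
    groups: Dict[int, List[T]] = {i: [] for i in range(len(centers))}
    for item in inputs:
        # Pick the center with the highest proximity; ties go to the earliest center.
        best = max(range(len(centers)), key=lambda i: proximity(item, centers[i]))
        groups[best].append(item)
    return groups


def neg_abs_diff(a: float, b: float) -> float:
    """Hypothetical proximity for numbers: negated absolute difference."""
    return -abs(a - b)
```

For example, with centers `[1.0, 10.0]`, the input `5.0` lands in the first group because it is 4 away from `1.0` and 5 away from `10.0`.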
38 Claims
1. A method for clustering a plurality of data inputs into groups, comprising:
(a) defining a match threshold;
(b) designating a first data input as center of a group;
(c) analyzing another data input to identify a group whose center has a proximity to the input that is above the match threshold, and if such a group is identified, assigning the data input to that group;
(d) if the data input has a proximity to the center of no group above the match threshold, creating a new group and designating said data input as center of the new group; and
(e) repeating steps (c) and (d) until all data inputs have been assigned to groups.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
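Steps (a) through (e) of claim 1 can be read as a single pass over the data. The sketch below is one possible reading, not the claimed implementation: it assumes data inputs are numeric vectors, uses cosine similarity as the proximity measure (the claim does not name one), and assigns an input to the first group whose center exceeds the threshold. The names `cluster`, `cosine_similarity`, and `match_threshold` are chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Assumed proximity measure: 1.0 for parallel vectors, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster(inputs: Sequence[Vector], match_threshold: float) -> Tuple[List[Vector], List[int]]:
    """Single-pass clustering following steps (a)-(e).

    Returns the group centers and, for each input, the index of its group.
    """
    centers: List[Vector] = []
    labels: List[int] = []
    for item in inputs:
        # Step (c): look for a group whose center is close enough to this input.
        match = next(
            (g for g, c in enumerate(centers) if cosine_similarity(item, c) > match_threshold),
            None,
        )
        if match is None:
            # Steps (b) and (d): the input becomes the center of a new group.
            centers.append(item)
            match = len(centers) - 1
        labels.append(match)
    return centers, labels
```

Because no centers exist when the first input arrives, it falls through to the new-group branch, which covers step (b) without a special case.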
10. A computer program product in computer-readable media for clustering a plurality of data inputs into groups, the computer program product comprising:
means for designating a first data input as center of a group; and
means for successively analyzing each of the other data inputs to identify a group having a center whose proximity to the data input is above a predetermined match threshold, assigning said data input to the identified group; and
if no group is identified, creating a new group and designating the data input as center of the new group; and
repeating data input analysis until all data inputs have been assigned to groups.
Dependent claims: 11, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28, 29, 30, 31, 32
17. A computer, comprising:
at least one processor;
memory associated with the at least one processor;
a display; and
a program supported in the memory for clustering a plurality of data inputs into groups, the program comprising:
means for designating a first data input as center of a group; and
means for successively analyzing each of the other data inputs to identify a group center closest to each data input, and if the proximity between the data input and the closest group center is above a predetermined match threshold, assigning said data input to the group having said group center; and
if the proximity between the data input and the closest group center is not above the match threshold, creating a new group and designating the data input as center of the new group; and
repeating data input analysis until all data inputs have been assigned to groups.
means for successively analyzing each of the other data inputs to identify a group having a center whose proximity to the data input is above a predetermined match threshold, assigning said data input to the identified group; and
if no group is identified, creating a new group and designating the data input as center of the new group; and
repeating data input analysis until all data inputs have been assigned to groups.
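The closest-center language in claim 17 differs from a first-match scan: the input is compared against every existing center, and only the best match is tested against the threshold. A minimal sketch of that reading follows. The function names, the use of plain numbers as data inputs, and the `closeness` proximity are assumptions for illustration and do not come from the claims.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def cluster_closest(
    inputs: Sequence[T],
    proximity: Callable[[T, T], float],
    match_threshold: float,
) -> Tuple[List[T], List[int]]:
    """Assign each input to its closest center if that center is close enough."""
    centers: List[T] = []
    labels: List[int] = []
    for item in inputs:
        # Find the closest existing center (highest proximity).
        best, best_score = -1, float("-inf")
        for g, center in enumerate(centers):
            score = proximity(item, center)
            if score > best_score:
                best, best_score = g, score
        # No centers yet, or the closest one is not above the threshold:
        # the input becomes the center of a new group.
        if best < 0 or best_score <= match_threshold:
            centers.append(item)
            best = len(centers) - 1
        labels.append(best)
    return centers, labels


def closeness(a: float, b: float) -> float:
    """Hypothetical proximity for numbers, in (0, 1]: 1 / (1 + |a - b|)."""
    return 1.0 / (1.0 + abs(a - b))
```

With inputs `[0.0, 3.0, 2.0]` and a threshold of `0.3`, the value `2.0` clears the threshold for both centers but joins the group centered on `3.0`, which is closer. A first-match scan would have placed it with `0.0`.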
24. A method of suggesting a Web site to a Web user, comprising:
identifying a group of Web users having similar profiles;
recording Web sites visited by Web users in the group;
for a Web user in the group, determining which of the sites visited by other users in the group have not been visited by the user; and
suggesting to the user the sites not visited by said user.
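The suggestion step of claim 24 amounts to a set difference between the sites seen by the rest of the group and the sites the user has already seen. The sketch below assumes the group has already been identified and that visits are recorded as a mapping from user name to a set of site names. The function name `suggest_sites` and this data layout are hypothetical.

```python
from typing import Dict, Iterable, Set


def suggest_sites(user: str, group: Iterable[str], visited: Dict[str, Set[str]]) -> Set[str]:
    """Return sites visited by other group members but not by `user`."""
    seen_by_others: Set[str] = set()
    for member in group:
        if member != user:
            # Collect everything the other members of the group have visited.
            seen_by_others |= visited.get(member, set())
    # Keep only the sites the user has not visited.
    return seen_by_others - visited.get(user, set())
```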
33. A method of organizing search engine results, comprising:
identifying a group of Web users having similar profiles;
recording search queries made by the users in the group and Web sites visited by users resulting from said search queries; and
for a Web user in the group making a search query, determining if the query was previously made by other users in the group and, if so, identifying to the user the Web sites visited by other users resulting from said search query.
Dependent claims: 34, 35, 36, 37
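One way to picture the recording and lookup steps of claim 33 is a per-group log keyed by query text. The sketch below is an assumption-laden illustration: the class `GroupSearchHistory` and its methods are hypothetical, queries are matched by exact string equality (the claim does not say how queries are compared), and the group itself is taken as given.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class GroupSearchHistory:
    """Records, for one group of similar users, which sites followed each query."""

    def __init__(self) -> None:
        # query text -> user -> sites that user visited from the results
        self._log: DefaultDict[str, DefaultDict[str, Set[str]]] = defaultdict(
            lambda: defaultdict(set)
        )

    def record(self, user: str, query: str, site: str) -> None:
        """Note that `user` visited `site` after making `query`."""
        self._log[query][user].add(site)

    def sites_for(self, user: str, query: str) -> Set[str]:
        """Sites other group members visited after making the same query."""
        by_user = self._log.get(query, {})  # .get avoids creating empty entries
        sites: Set[str] = set()
        for other, visited in by_user.items():
            if other != user:
                sites |= visited
        return sites
```

A query nobody else in the group has made returns an empty set, which corresponds to the "if so" condition in the claim not being met.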
38. A method for clustering a plurality of data inputs into groups, comprising:
(a) designating a first data input as center of a group;
(b) analyzing another data input to determine if it is sufficiently close to a center of a group and, if so, assigning the data input to the group;
(c) if no group is found to be sufficiently close to the data input, defining a new group and assigning the data input to the new group; and
(d) repeating steps (b) and (c) until all data inputs have been assigned to groups.
Specification