System and method for clustering large lists into optimal segments
Abstract
A system and method for clustering data that is sent from a server computer to a client computer. The server computer obtains a requested cluster size for the client computer. The requested cluster size includes the optimal cluster size for the client computer and the largest cluster size the client can manage. Fuzzy logic computations are performed on the data to determine an optimal cluster size and an optimal point at which to split the data for the particular client. The cluster computations are based in part upon the affinity of individual data items to adjacent data items in the clustered list. The server computer also checks the affinity between the item with the largest score and the first item in the next cluster. If this affinity is higher than other affinity scores within the cluster, the cluster split is moved accordingly. Once an optimal cluster is determined, the data is transmitted from the server computer to the client computer.
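The split check described in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the patented implementation. The function name `choose_split`, the parameter names, and the rule for where the split moves are assumptions: the abstract says only that the split is "moved accordingly". The sketch assumes `next_affinity[i]` is the similarity between item `i` and item `i + 1`, with 0 for the final item in the list.

```python
from typing import Sequence


def choose_split(total: Sequence[float], next_affinity: Sequence[float]) -> int:
    """Return the index of the last item in the first cluster.

    total[i]         -- combined score for ending the cluster at item i
    next_affinity[i] -- assumed similarity between item i and item i + 1
    """
    if not total or len(total) != len(next_affinity):
        raise ValueError("total and next_affinity must be non-empty and of equal length")

    # Start with the item that has the largest total score.
    best = max(range(len(total)), key=lambda i: total[i])

    # Items that would sit inside the cluster, before the candidate last item.
    inside = range(best)

    # If the candidate is tied more strongly to the first item of the next
    # cluster than any item inside the cluster is tied to its successor,
    # splitting here would separate closely related items.
    if best > 0 and all(next_affinity[best] > next_affinity[i] for i in inside):
        # Assumption: move the split to the earlier item with the highest
        # total score. The source does not specify the new position.
        best = max(inside, key=lambda i: total[i])

    return best
```

With totals `[0.1, 0.4, 0.9, 0.3]` and affinities `[0.2, 0.3, 0.8, 0.1]`, the candidate at index 2 has the strongest tie to its successor, so this sketch moves the split back to index 1.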
32 Citations
24 Claims
-
1. A method for clustering data wherein the data includes a plurality of items, said method comprising:
-
obtaining requested cluster size data;
calculating a score for each item; and
determining a cluster size based upon the calculated score.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
calculating an affinity score for each item, the affinity score relating to the similarity of each item to an adjacent item in the data.
-
-
4. The method of claim 3, wherein the calculating a score for each item further comprises:
-
calculating a fuzzy optimal score for each item, the fuzzy optimal score relating to an optimal cluster size usable by a client computer; and
calculating a fuzzy maximum score for each item, the fuzzy maximum score relating to a largest manageable cluster size usable by a client computer.
-
-
5. The method of claim 4, wherein the calculating a score further comprises:
-
calculating a total score for each item, the total score determined as the product of the affinity score for the item, the fuzzy optimal score for the item, and the fuzzy maximum score for the item.
-
-
6. The method of claim 5, wherein the determining a cluster size includes selecting a last cluster item from the plurality of items wherein the last cluster item includes a largest total score.
-
7. The method of claim 6, wherein the determining a cluster size further comprises:
-
comparing the affinity score for the last cluster item with the affinity score for each item; and
selecting a new last cluster item, the last cluster item having a greater affinity score than the affinity score for the last cluster item.
-
-
8. The method of claim 1, wherein the obtaining further comprises:
-
connecting a server computer to a computer network; and
receiving at the server computer the requested cluster size data from a client computer.
-
-
9. An information handling system for clustering data wherein the data includes a plurality of items, said system comprising:
-
a computer, the computer including;
one or more processing units;
a memory operatively coupled to the one or more processing units; and
a nonvolatile storage area where the data is stored;
a program executable by the one or more processing units, the program including;
software code programmed to obtain requested cluster size data;
software code programmed to calculate a score for each item; and
software code programmed to determine a cluster size based upon the score for each item.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
software code to calculate an affinity score for each item, the affinity score relating to the similarity of each item to an adjacent item in the data.
-
-
12. The information handling system of claim 11, wherein the software code programmed to calculate a score for each item further comprises:
-
software code to calculate a fuzzy optimal score for each item, the fuzzy optimal score relating to an optimal cluster size usable by a client computer; and
software code to calculate a fuzzy maximum score for each item, the fuzzy maximum score relating to a largest manageable cluster size usable by a client computer.
-
-
13. The information handling system of claim 12, wherein the software code to calculate a score further comprises:
software code to calculate a total score for each item, the total score determined as the product of the affinity score for the item, the fuzzy optimal score for the item, and the fuzzy maximum score for the item.
-
14. The information handling system of claim 13, wherein the software code to determine a cluster size includes software code to select a last cluster item from the plurality of items wherein the last cluster item includes a largest total score.
-
15. The information handling system of claim 14, wherein the software code to determine a cluster size further comprises:
-
software code to compare the affinity score for the last cluster item with the affinity score for each item; and
software code to select a new last cluster item, the last cluster item having a greater affinity score than the affinity score for the last cluster item.
-
-
16. The information handling system of claim 9, wherein the software code to obtain further comprises:
-
software code to connect the computer to a computer network; and
software code to receive at the server computer the requested cluster size data from a client computer.
-
-
17. A computer operable medium for clustering data wherein the data includes a plurality of items, said medium comprising:
-
means for obtaining requested cluster size data;
means for calculating a score for each item; and
means for determining a cluster size based upon the calculating.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
means for calculating an affinity score for each item, the affinity score relating to the similarity of each item to an adjacent item in the data.
-
-
20. The computer operable medium of claim 19, wherein the means for calculating a score for each item further comprises:
-
means for calculating a fuzzy optimal score for each item, the fuzzy optimal score relating to an optimal cluster size usable by a client computer; and
means for calculating a fuzzy maximum score for each item, the fuzzy maximum score relating to a largest manageable cluster size usable by a client computer.
-
-
21. The computer operable medium of claim 20, wherein the means for calculating a score further comprises:
means for calculating a total score for each item, the total score determined as the product of the affinity score for the item, the fuzzy optimal score for the item, and the fuzzy maximum score for the item.
-
22. The computer operable medium of claim 21, wherein the means for determining a cluster size includes means for selecting a last cluster item from the plurality of items wherein the last cluster item includes a largest total score.
-
23. The computer operable medium of claim 22, wherein the means for determining a cluster size further comprises:
-
means for comparing the affinity score for the last cluster item with the affinity score for each item; and
means for selecting a new last cluster item, the last cluster item having a greater affinity score than the affinity score for the last cluster item.
-
-
24. The computer operable medium of claim 17, wherein the means for obtaining further comprises:
-
means for connecting a server computer to a computer network; and
means for receiving at the server computer the requested cluster size data from a client computer.
-
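The total score recited in claims 5, 13, and 21 (the product of the affinity score, the fuzzy optimal score, and the fuzzy maximum score) and the selection in claims 6, 14, and 22 can be sketched as follows. The claims do not define the fuzzy membership functions, so the triangular and ramp shapes below are hypothetical, as are all function and parameter names. The sketch assumes `0 < optimal < maximum` and that the affinity scores are supplied by the caller.

```python
from typing import List, Sequence


def fuzzy_optimal(size: int, optimal: int) -> float:
    """Hypothetical triangular membership: 1.0 at the optimal size,
    falling linearly to 0.0 at size 0 and at twice the optimal size."""
    return max(0.0, 1.0 - abs(size - optimal) / optimal)


def fuzzy_maximum(size: int, optimal: int, maximum: int) -> float:
    """Hypothetical ramp: 1.0 up to the optimal size, falling linearly
    to 0.0 at the largest manageable size."""
    if size <= optimal:
        return 1.0
    if size >= maximum:
        return 0.0
    return (maximum - size) / (maximum - optimal)


def total_scores(affinity: Sequence[float], optimal: int, maximum: int) -> List[float]:
    """Total score per item: affinity * fuzzy optimal * fuzzy maximum."""
    scores = []
    for index, item_affinity in enumerate(affinity):
        size = index + 1  # cluster size if the cluster ends at this item
        scores.append(
            item_affinity
            * fuzzy_optimal(size, optimal)
            * fuzzy_maximum(size, optimal, maximum)
        )
    return scores


def last_cluster_item(scores: Sequence[float]) -> int:
    """Index of the item with the largest total score."""
    return max(range(len(scores)), key=lambda i: scores[i])
```

For six items with equal affinity, an optimal size of 3, and a largest manageable size of 5, the third item (index 2) receives the largest total score in this sketch, so the cluster would end there.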
Specification