CLUSTERING AGGREGATOR FOR RSS FEEDS
Abstract
A method for merging really simple syndication (RSS) feeds. Stories containing one or more terms may be merged into one or more clusters based on one or more links between the stories. A cluster frequency with which the terms occur in each cluster may be determined. A diameter for each cluster may be determined. A cluster that is most similar to one of the clusters may be determined based on the cluster frequency. The most similar cluster may be merged with the one of the clusters based on each diameter and each cluster frequency.
20 Claims
1. A method for merging really simple syndication (RSS) feeds, comprising:
(a) merging stories containing one or more terms into one or more clusters based on one or more links between the stories;
(b) determining a cluster frequency with which the terms occur in each cluster;
(c) determining a diameter for each cluster; and
(d) determining a cluster that is most similar to one of the clusters based on the cluster frequency; and
(e) merging the most similar cluster with the one of the clusters based on the diameter and the cluster frequency.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10

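As an illustration of steps (a) and (b) of claim 1, the following Python sketch groups stories that are connected by links and counts how often each term occurs in each cluster. The claim does not say how links are represented or how the frequency is computed; the union-find grouping, the raw term counts, the sample data, and the names `cluster_by_links` and `cluster_frequency` are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def cluster_by_links(story_ids, links):
    """Group stories so that any two stories joined by a chain of links share a cluster.

    ASSUMPTION: a link is an unordered pair of story ids.
    """
    parent = {s: s for s in story_ids}

    def find(s):
        # Follow parent pointers to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for s in story_ids:
        clusters[find(s)].append(s)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


def cluster_frequency(cluster, story_terms):
    """Count how often each term occurs across all stories of one cluster.

    ASSUMPTION: "cluster frequency" is modelled here as a raw term count.
    """
    counts = Counter()
    for story_id in cluster:
        counts.update(story_terms[story_id])
    return counts


# Hypothetical stories, already reduced to lists of terms.
story_terms = {
    "s1": ["rss", "feed", "merge"],
    "s2": ["rss", "cluster"],
    "s3": ["weather", "storm"],
}
links = [("s1", "s2")]  # s1 and s2 link to each other; s3 is unlinked

clusters = cluster_by_links(list(story_terms), links)
freq = cluster_frequency(clusters[0], story_terms)
```

With this sample data, `s1` and `s2` end up in one cluster and `s3` forms a cluster of its own.
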
11. A computer-readable medium having stored thereon computer-executable instructions which, when executed by a computer, cause the computer to:
(a) determine a story frequency with which one or more terms occur in one or more stories of one or more really simple syndication (RSS) feeds;
(b) merge the stories into one or more clusters based on one or more links between the stories;
(c) determine a similarity between two linked stories based on the story frequency;
(d) split the two linked stories into two different clusters based on the similarity;
(e) determine a cluster frequency with which the terms occur in each cluster;
(f) determine a diameter for each cluster; and
(g) determine a cluster that is most similar to one of the clusters based on the cluster frequency; and
(h) merge the most similar cluster with the one of the clusters based on the diameter and the cluster frequency.
Dependent claims: 12, 13, 14, 15, 16, 17, 18

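Steps (c) and (d) of claim 11 can be pictured as a filter over the links: a link between two stories that are not similar enough is ignored, so the stories are no longer forced into the same cluster. The sketch below is one possible reading, not the claimed implementation. It assumes that the story frequency is a raw term count, that similarity is the cosine of two count vectors, and that a fixed cutoff decides the split; `keep_similar_links` and the `threshold` value are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} mappings."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keep_similar_links(links, story_freq, threshold):
    """Keep only links whose two stories are at least `threshold` similar.

    ASSUMPTION: dropping a link is how two linked stories get "split";
    clusters would then be rebuilt from the links that remain.
    """
    return [
        (a, b)
        for a, b in links
        if cosine(story_freq[a], story_freq[b]) >= threshold
    ]


# Hypothetical story frequencies (term counts per story).
story_freq = {
    "s1": Counter(["rss", "feed", "rss"]),
    "s2": Counter(["rss", "feed"]),
    "s3": Counter(["storm", "warning"]),
}
links = [("s1", "s2"), ("s2", "s3")]

kept = keep_similar_links(links, story_freq, threshold=0.5)
```

Here `s2` and `s3` share no terms, so their link is dropped, while the link between `s1` and `s2` survives.
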
19. A computer system, comprising:
a processor; and
a memory comprising program instructions executable by the processor to:
(a) determine a term vector for each of one or more stories of one or more really simple syndication (RSS) feeds, the term vector comprising a weight for each term in the stories, and the weight being based on a term frequency and inverse document frequency algorithm;
(b) merge the stories into one or more clusters based on one or more links between the stories;
(c) determine a story cosine similarity between two linked stories based on each term vector of the two linked stories;
(d) split the two linked stories into two different clusters based on the story cosine similarity;
(e) determine a centroid vector for each cluster that is an average of each term vector for all stories within each cluster;
(f) determine a diameter for each cluster; and
(g) determine a cluster that is most similar to one of the clusters based on a cluster cosine similarity of a centroid vector of the cluster that is most similar to the one of the clusters and a centroid vector of the one of the clusters; and
(h) merge the most similar cluster with the one of the clusters based on the diameter and the cluster cosine similarity.
Dependent claims: 20

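The vector steps of claim 19 (TF-IDF term vectors, centroid vectors, cluster cosine similarity, and the merge) can be sketched in Python as follows. Several details are assumptions, because the claim leaves them open: the smoothed IDF formula, the definition of diameter as the largest cosine distance between two stories of a cluster, and the two thresholds `min_similarity` and `max_diameter`. All function names and the sample stories are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(stories):
    """Map each story id to a {term: weight} vector.

    ASSUMPTION: weight = (count / story length) * (ln((1 + n) / (1 + df)) + 1),
    a smoothed TF-IDF variant chosen only for illustration.
    Each story is assumed to contain at least one term.
    """
    n = len(stories)
    df = Counter()
    for terms in stories.values():
        df.update(set(terms))
    return {
        sid: {
            t: (c / len(terms)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in Counter(terms).items()
        }
        for sid, terms in stories.items()
    }


def cosine(a, b):
    """Cosine similarity of two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(cluster, vectors):
    """Average of the term vectors of all stories in the cluster."""
    total = Counter()
    for sid in cluster:
        for t, w in vectors[sid].items():
            total[t] += w
    return {t: w / len(cluster) for t, w in total.items()}


def diameter(cluster, vectors):
    """ASSUMPTION: largest cosine distance (1 - similarity) between two stories."""
    return max(
        (1 - cosine(vectors[a], vectors[b])
         for i, a in enumerate(cluster) for b in cluster[i + 1:]),
        default=0.0,
    )


def merge_most_similar(clusters, vectors, target, min_similarity, max_diameter):
    """Merge clusters[target] with its most similar cluster if both limits hold."""
    others = [i for i in range(len(clusters)) if i != target]
    if not others:
        return clusters
    base = centroid(clusters[target], vectors)
    sims = {i: cosine(base, centroid(clusters[i], vectors)) for i in others}
    best = max(others, key=sims.get)
    merged = clusters[target] + clusters[best]
    if sims[best] >= min_similarity and diameter(merged, vectors) <= max_diameter:
        rest = [c for i, c in enumerate(clusters) if i not in (target, best)]
        return rest + [merged]
    return clusters


# Hypothetical stories, already reduced to lists of terms.
stories = {
    "a": ["rss", "feed", "merge"],
    "b": ["rss", "feed", "cluster"],
    "c": ["storm", "rain", "wind"],
}
vectors = tfidf_vectors(stories)
clusters = [["a"], ["b"], ["c"]]

result = merge_most_similar(clusters, vectors, target=0,
                            min_similarity=0.3, max_diameter=0.8)
unchanged = merge_most_similar(clusters, vectors, target=0,
                               min_similarity=0.9, max_diameter=0.8)
```

In this sketch the diameter check is applied to the would-be merged cluster, so a merge is refused when it would produce a cluster that is too spread out. That is only one way to read "based on the diameter"; a per-cluster check before merging would fit the claim language equally well.
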
Specification