News group clustering based on cross-post graph
Abstract
A system and/or method that facilitates analyzing newsgroup clusters. A data reception component receives data relating to a plurality of newsgroups and relays the data to an engine that constructs a weighted graph. The weighted graph represents a subset of the newsgroups as vertices of the graph. The vertices are connected by edges, which represent cross-postings relating to the subset of newsgroups.
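As a rough illustration of the graph described in the abstract, the Python sketch below counts cross-posted messages to obtain edge weights. The input format (each message given as a list of newsgroup names), the function name `build_crosspost_graph`, and the use of raw counts as weights are assumptions made for illustration; the text above does not specify a data format or a weighting scheme.

```python
from collections import Counter
from itertools import combinations


def build_crosspost_graph(messages):
    """Return {(a, b): weight} with a < b, where weight is the number of
    messages posted to both newsgroups a and b (an assumed weighting)."""
    weights = Counter()
    for groups in messages:
        # Every unordered pair of distinct newsgroups on one message
        # counts as one cross-post between those two newsgroups.
        for a, b in combinations(sorted(set(groups)), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical sample data: each inner list is one message's newsgroups.
messages = [
    ["comp.lang.c", "comp.lang.c++"],
    ["comp.lang.c", "comp.lang.c++", "comp.programming"],
    ["rec.music.jazz"],  # posted to a single group, so it adds no edge
]
graph = build_crosspost_graph(messages)
```

Storing each edge under a sorted pair keeps the graph undirected, so a cross-post between two groups is counted once regardless of the order in which the groups are listed.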
21 Claims
1. A computer implemented system that facilitates analyzing newsgroup similarity, comprising:
one or more hardware processors;
memory coupled to the one or more hardware processors;
a data reception component, stored in the memory and executed by the one or more processors, that receives data relating to a plurality of newsgroups and cross-postings between the plurality of newsgroups;
a graphing engine, stored in the memory and executed by the one or more processors, that constructs a weighted graph with a subset of the newsgroups represented as vertices of the graph and cross-postings between two newsgroups of the subset of newsgroups represented as edges between vertices corresponding to the two newsgroups;
a filtering component, stored in the memory and executed by the one or more processors, that excludes particular newsgroups from being represented in the weighted graph so as to facilitate reducing a size of the weighted graph;
a paring component, stored in the memory and executed by the one or more processors, that removes edges of the graph with a weight less than a threshold weight so as to facilitate reducing the size of the graph;
a segmenting component, stored in the memory and executed by the one or more processors, that segments the weighted graph; and
a post-processing component, stored in the memory and executed by the one or more processors, that merges a first cluster of vertices and edges of the weighted graph into a second cluster of vertices and edges of the weighted graph if a sum of weights between the clusters is greater than a threshold.
Dependent claims: 2, 3, 4, 5, 6, 7
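The filtering and paring components recited in claim 1 can be pictured with a minimal Python sketch. It assumes the graph is held as a dictionary mapping newsgroup pairs to weights; the function name `filter_and_pare`, the parameter names, and the sample values are hypothetical, and the claim does not say how the excluded newsgroups or the threshold weight are chosen.

```python
def filter_and_pare(edges, excluded, min_weight):
    """edges: {(a, b): weight}. Return a reduced copy of the graph."""
    kept = {}
    for (a, b), weight in edges.items():
        # Filtering: an excluded newsgroup is not represented in the
        # graph, so any edge touching it is dropped as well.
        if a in excluded or b in excluded:
            continue
        # Paring: remove edges with a weight less than the threshold.
        if weight < min_weight:
            continue
        kept[(a, b)] = weight
    return kept


# Hypothetical sample graph and settings.
edges = {
    ("alt.test", "comp.lang.c"): 40,
    ("comp.lang.c", "comp.lang.c++"): 12,
    ("comp.lang.c", "rec.music.jazz"): 1,
}
reduced = filter_and_pare(edges, excluded={"alt.test"}, min_weight=2)
```

In this edge-dictionary representation a newsgroup with no remaining edges simply disappears from the graph; a representation that tracks vertices separately would need to remove excluded vertices explicitly.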
8. A computer-implemented method for creating a cluster graph comprising the following computer executable steps:
receiving data relating to a plurality of newsgroups and cross-postings between the plurality of newsgroups;
constructing, by a hardware processor, a weighted graph with a subset of the newsgroups represented as vertices of the graph and cross-postings between two newsgroups of the subset of newsgroups represented as edges between vertices corresponding to the two newsgroups;
excluding particular newsgroups from being represented in the weighted graph so as to facilitate reducing a size of the weighted graph;
removing edges of the graph with a weight less than a threshold weight so as to facilitate reducing the size of the graph;
segmenting the weighted graph; and
merging a first cluster of vertices and edges of the weighted graph into a second cluster of vertices and edges of the weighted graph if a sum of weights between the clusters is greater than a threshold.
Dependent claims: 9, 10, 11, 12, 13, 14
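The final merging step of claim 8 can be sketched in Python as follows. The sketch takes an initial segmentation as given, since the claim does not say how segmenting is performed. The function names, the single-letter vertex labels, and the choice to repeat merging until no pair of clusters exceeds the threshold are all assumptions made for illustration.

```python
def inter_cluster_weight(edges, c1, c2):
    """Sum the weights of edges with one endpoint in each cluster."""
    return sum(
        w for (a, b), w in edges.items()
        if (a in c1 and b in c2) or (a in c2 and b in c1)
    )


def merge_clusters(edges, clusters, threshold):
    """Merge pairs of clusters while the sum of weights between them
    is greater than threshold (repetition is an assumption)."""
    clusters = [set(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if inter_cluster_weight(edges, clusters[i], clusters[j]) > threshold:
                    clusters[j] |= clusters[i]  # first cluster merged into second
                    del clusters[i]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return clusters


# Hypothetical graph and initial segmentation.
edges = {("a", "b"): 5, ("b", "c"): 4, ("c", "d"): 6, ("d", "e"): 1}
segments = [{"a", "b"}, {"c", "d"}, {"e"}]
result = merge_clusters(edges, segments, threshold=3)
```

Here the first two clusters are joined because the only edge between them has weight 4, which exceeds the threshold of 3, while the cluster containing `e` stays separate because its single connecting edge has weight 1.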
15. Computer storage media storing instructions that, when executed by a computing device, cause the computing device to perform acts comprising:
receiving data relating to a plurality of newsgroups and cross-postings between the plurality of newsgroups;
constructing a weighted graph with a subset of the newsgroups represented as vertices of the graph and cross-postings between two newsgroups of the subset of newsgroups represented as edges between vertices corresponding to the two newsgroups;
excluding particular newsgroups from being represented in the weighted graph so as to facilitate reducing a size of the weighted graph;
removing edges of the graph with a weight less than a threshold weight so as to facilitate reducing the size of the graph;
segmenting the weighted graph; and
merging a first cluster of vertices and edges of the weighted graph into a second cluster of vertices and edges of the weighted graph if a sum of weights between the clusters is greater than a threshold.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification