Overlapping Community Detection in Weighted Graphs
Abstract
The disclosure includes a system and method for detecting communities in a weighted graph. A community detection module includes a tagset data aggregator, a counts statistics engine, a weighted graph generator, a coherence engine, a community detector and a tag recommendation engine. The tagset data aggregator receives tagset data. The counts statistics engine determines counts statistics for the tagset data. The weighted graph generator generates and denoises a weighted tag co-occurrence graph based on the counts statistics. The coherence engine determines an importance score for each tag and a coherence score for each tagset in the tagset data. The community detector determines maximally coherent communities in the weighted tag co-occurrence graph. The tag recommendation engine recommends tags in real time using the maximally coherent communities.
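The abstract does not say how counts become edge weights or what the denoising step removes. As a rough illustration only, the Python sketch below assumes pointwise mutual information (PMI) as the edge weight and a simple count-and-weight threshold as the denoising rule. Both choices, along with the function names, the `min_count` and `min_weight` parameters, and the sample tagsets, are assumptions made for this example and are not taken from the disclosure. Pairs that never co-occur are left out of the dictionary and can be read as implicit zero-count edges.

```python
from collections import Counter
from itertools import combinations
import math


def count_statistics(tagsets):
    """Count how many tagsets contain each tag and each unordered tag pair."""
    tag_counts = Counter()
    pair_counts = Counter()
    for tagset in tagsets:
        unique = sorted(set(tagset))  # sorting gives each pair one canonical key
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return tag_counts, pair_counts, len(tagsets)


def weighted_graph(tag_counts, pair_counts, n):
    """Weight each co-occurring pair by PMI (an assumed weighting scheme)."""
    graph = {}
    for (a, b), n_ab in pair_counts.items():
        graph[(a, b)] = math.log((n_ab * n) / (tag_counts[a] * tag_counts[b]))
    return graph


def denoise(graph, pair_counts, min_count=2, min_weight=0.0):
    """Drop rarely observed or weakly associated edges (an assumed rule)."""
    return {
        pair: weight
        for pair, weight in graph.items()
        if pair_counts[pair] >= min_count and weight > min_weight
    }


# Hypothetical tagsets for one context.
tagsets = [
    ["python", "pandas", "numpy"],
    ["python", "numpy"],
    ["python", "pandas"],
    ["travel", "beach"],
    ["travel", "beach", "python"],
]

tag_counts, pair_counts, n = count_statistics(tagsets)
graph = weighted_graph(tag_counts, pair_counts, n)
denoised = denoise(graph, pair_counts)
```

With these sample tagsets, the pairs seen only once and the pairs with negative PMI are filtered out, leaving three edges for the community detector to work on.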
20 Claims
1. A computer-implemented method comprising:
identifying, using one or more computing devices, a context defining a tagset;
determining, using the one or more computing devices, a plurality of tagsets each including one or more of the tags and a vocabulary of unique tags defined by the identified context;
generating, using the one or more computing devices, counts statistics using the plurality of tagsets and the vocabulary of unique tags;
generating, using the one or more computing devices, a weighted tag co-occurrence graph including each pair of tags in the vocabulary of unique tags based on the count statistics;
denoising, using the one or more computing devices, the weighted tag co-occurrence graph; and
responsive to removing the noise, identifying, using the one or more computing devices, at least one community in the weighted tag co-occurrence graph.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A system comprising:
one or more processors, the one or more processors being configured to:
identify a context defining a tagset;
determine a plurality of tagsets each including one or more of the tags and a vocabulary of unique tags of the tagset defined by the identified context;
generate counts statistics using the plurality of tagsets and the vocabulary of all tags;
generate a weighted tag co-occurrence graph including each pair of tags in the vocabulary of unique tags based on the count statistics;
denoise the weighted tag co-occurrence graph; and
responsive to removal of the noise, identify at least one community in the weighted tag co-occurrence graph.
Dependent claims: 12, 13, 14, 15.
16. A computer program product comprising a non-transitory computer usable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to perform steps comprising:
identifying, using one or more computing devices, a context defining a tagset;
determining, using the one or more computing devices, a plurality of tagsets each including one or more of the tags and a vocabulary of unique tags of the tagset defined by the identified context;
generating, using the one or more computing devices, counts statistics using the plurality of tagsets and the vocabulary of unique tags;
generating, using the one or more computing devices, a weighted tag co-occurrence graph including each pair of tags in the vocabulary of unique tags based on the count statistics;
denoising, using the one or more computing devices, the weighted tag co-occurrence graph; and
responsive to removing the noise, identifying, using the one or more computing devices, at least one community in the weighted tag co-occurrence graph.
Dependent claims: 17, 18, 19, 20.
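The independent claims leave the community identification step open, and the abstract refers to maximally coherent communities without defining coherence at this point. The Python sketch below therefore substitutes a different, well-known technique: it enumerates maximal cliques with the basic Bron-Kerbosch recursion and treats each clique as a community. Because one tag can belong to several cliques, the resulting communities may overlap. The names `maximal_cliques` and `recommend`, the ranking rule, and the edge list are all hypothetical and stand in for whatever the specification actually describes.

```python
from collections import Counter


def maximal_cliques(edges):
    """Enumerate maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    cliques = []

    def expand(current, candidates, excluded):
        # No candidates and no excluded vertices left: current is maximal.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for vertex in list(candidates):
            neighbours = adjacency[vertex]
            expand(current | {vertex}, candidates & neighbours, excluded & neighbours)
            candidates = candidates - {vertex}
            excluded = excluded | {vertex}

    expand(set(), set(adjacency), set())
    return cliques


def recommend(partial_tagset, communities):
    """Suggest tags that share a community with any tag already chosen.

    Ranking by the number of shared communities is an assumption.
    """
    chosen = set(partial_tagset)
    scores = Counter()
    for community in communities:
        if chosen & community:
            for tag in community - chosen:
                scores[tag] += 1
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked]


# Hypothetical edges surviving a denoising step.
edges = [("numpy", "python"), ("pandas", "python"), ("beach", "travel")]
communities = maximal_cliques(edges)
suggestions = recommend(["python"], communities)
```

Here the tag `python` sits in two communities at once, which is the overlapping behaviour the title refers to. A coherence-based detector would presumably score and grow communities differently, so this should be read as a placeholder for the claimed step rather than a rendering of it.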
Specification