Overlapping community detection in weighted graphs
First Claim
1. A computer-implemented method comprising:
identifying, using one or more computing devices, a context;
determining, using the one or more computing devices, a plurality of tagsets each including one or more tags describing an entity and a vocabulary of unique tags defined by the identified context;
generating, using the one or more computing devices, counts statistics using the plurality of tagsets and the vocabulary of unique tags;
determining a measure of co-occurrence consistency for a pair of tags in the vocabulary of unique tags based on the counts statistics, the measure of co-occurrence consistency indicating a likelihood of the pair of tags co-occurring in a tagset from the plurality of tagsets relative to random;
generating, using the one or more computing devices, a weighted tag co-occurrence graph including the pair of tags in the vocabulary of unique tags based on the measure of co-occurrence consistency;
denoising, using the one or more computing devices, the weighted tag co-occurrence graph; and
responsive to removing the noise, identifying, using the one or more computing devices, at least one community in the weighted tag co-occurrence graph.
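The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes pointwise mutual information (PMI) as one possible measure of co-occurrence relative to random, treats denoising as dropping edges at or below a weight threshold, and uses connected components as a deliberately simple stand-in for community identification (which, unlike the overlapping communities named in the title, cannot overlap). The function names, the `min_weight` parameter, and the sample tagsets are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_graph(tagsets):
    """Return {(a, b): weight} with a < b, using PMI as an assumed measure."""
    n = len(tagsets)
    tag_counts = Counter()   # counts statistics: tagsets containing each tag
    pair_counts = Counter()  # counts statistics: tagsets containing each pair
    for tagset in tagsets:
        tags = sorted(set(tagset))
        tag_counts.update(tags)
        pair_counts.update(combinations(tags, 2))
    graph = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI = log(P(a, b) / (P(a) * P(b))); positive means the pair
        # co-occurs more often than independence would predict.
        graph[(a, b)] = math.log(n_ab * n / (tag_counts[a] * tag_counts[b]))
    return graph


def denoise(graph, min_weight=0.0):
    """Assumed denoising rule: keep only edges heavier than min_weight."""
    return {edge: w for edge, w in graph.items() if w > min_weight}


def communities(graph):
    """Connected components of the graph (a simple, non-overlapping stand-in)."""
    adjacency = {}
    for a, b in graph:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, result = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical tagsets for a single context.
tagsets = [
    {"beach", "sun", "sand"},
    {"beach", "sun"},
    {"sun", "sand"},
    {"snow", "ski"},
    {"snow", "ski", "mountain"},
    {"ski", "mountain"},
    {"beach", "snow"},
]
graph = denoise(cooccurrence_graph(tagsets))
found = communities(graph)
```

In this sample the pair ("beach", "snow") co-occurs once but less often than chance would predict, so its PMI is negative and the edge is dropped during denoising, leaving two separate groups of tags.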
Abstract
The disclosure includes a system and method for detecting communities in a weighted graph. A community detection module includes a tagset data aggregator, a counts statistics engine, a weighted graph generator, a coherence engine, a community detector, and a tag recommendation engine. The tagset data aggregator receives tagset data. The counts statistics engine determines counts statistics for the tagset data. The weighted graph generator generates and denoises a weighted tag co-occurrence graph based on the counts statistics. The coherence engine determines an importance score for each tag and a coherence score for each tagset in the tagset data. The community detector determines maximally coherent communities in the weighted tag co-occurrence graph. The tag recommendation engine recommends tags in real time using the maximally coherent communities.
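To make the roles of the coherence engine and the tag recommendation engine more concrete, here is a small sketch under stated assumptions. The abstract does not define the coherence score, so the code assumes the mean pairwise edge weight of a tagset, and it assumes a recommender that suggests the tag whose addition yields the most coherent tagset. The `WEIGHTS` table, `coherence`, and `recommend` are hypothetical names and values chosen for illustration only.

```python
from itertools import combinations

# Hypothetical edge weights of an already denoised tag co-occurrence graph.
WEIGHTS = {
    frozenset({"beach", "sun"}): 0.9,
    frozenset({"beach", "sand"}): 0.8,
    frozenset({"sand", "sun"}): 0.7,
    frozenset({"ski", "snow"}): 0.9,
    frozenset({"sun", "ski"}): 0.2,
}


def weight(a, b):
    """Edge weight between two tags; missing edges count as 0."""
    return WEIGHTS.get(frozenset({a, b}), 0.0)


def coherence(tags):
    """Assumed coherence score: mean pairwise edge weight of the tagset."""
    pairs = list(combinations(sorted(tags), 2))
    if not pairs:
        return 0.0
    return sum(weight(a, b) for a, b in pairs) / len(pairs)


def recommend(tags, vocabulary):
    """Suggest the unused tag that gives the most coherent extended tagset."""
    candidates = sorted(set(vocabulary) - set(tags))
    if not candidates:
        return None
    return max(candidates, key=lambda t: coherence(set(tags) | {t}))


vocabulary = {"beach", "sun", "sand", "ski", "snow"}
suggestion = recommend({"beach", "sun"}, vocabulary)
```

For the partial tagset {"beach", "sun"}, adding "sand" gives a mean pairwise weight of 0.8, higher than adding "ski" or "snow", so "sand" is the suggestion in this toy setup.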
17 Claims
1. A computer-implemented method comprising the steps recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
one or more processors, the one or more processors being configured to:
identify a context;
determine a plurality of tagsets each including one or more tags describing an entity and a vocabulary of unique tags defined by the identified context;
generate counts statistics using the plurality of tagsets and the vocabulary of unique tags;
determine a measure of co-occurrence consistency for a pair of tags in the vocabulary of unique tags based on the counts statistics, the measure of co-occurrence consistency indicating a likelihood of the pair of tags co-occurring in a tagset from the plurality of tagsets relative to random;
generate a weighted tag co-occurrence graph including the pair of tags in the vocabulary of unique tags based on the measure of co-occurrence consistency;
denoise the weighted tag co-occurrence graph; and
responsive to removal of the noise, identify at least one community in the weighted tag co-occurrence graph.
Dependent claims: 11, 12, 13
14. A computer program product comprising a non-transitory computer usable medium including a computer readable program, wherein the computer readable program when executed on a computer causes the computer to perform steps comprising:
identifying, using one or more computing devices, a context;
determining, using the one or more computing devices, a plurality of tagsets each including one or more tags describing an entity and a vocabulary of unique tags defined by the identified context;
generating, using the one or more computing devices, counts statistics using the plurality of tagsets and the vocabulary of unique tags;
determining a measure of co-occurrence consistency for a pair of tags in the vocabulary of unique tags based on the counts statistics, the measure of co-occurrence consistency indicating a likelihood of the pair of tags co-occurring in a tagset from the plurality of tagsets relative to random;
generating, using the one or more computing devices, a weighted tag co-occurrence graph including the pair of tags in the vocabulary of unique tags based on the measure of co-occurrence consistency;
denoising, using the one or more computing devices, the weighted tag co-occurrence graph; and
responsive to removing the noise, identifying, using the one or more computing devices, at least one community in the weighted tag co-occurrence graph.
Dependent claims: 15, 16, 17
Specification