SYSTEM AND METHOD FOR PROBABILISTIC RELATIONAL CLUSTERING
Abstract
Relational clustering has attracted increasing attention because of its impact on applications that involve multi-type interrelated data objects, such as Web mining, search marketing, bioinformatics, citation analysis, and epidemiology. A probabilistic model is presented for relational clustering, which also provides a principled framework that unifies various important clustering tasks, including traditional attribute-based clustering, semi-supervised clustering, co-clustering, and graph clustering. The model seeks to identify cluster structures for each type of data object and interaction patterns between different types of objects. Under this model, parametric hard and soft relational clustering algorithms are provided for a large number of exponential family distributions. The algorithms are applicable to relational data of various structures and at the same time unify a number of state-of-the-art clustering algorithms: co-clustering algorithms, k-partite graph clustering, and semi-supervised clustering based on hidden Markov random fields.
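As an illustration only, the hard-clustering idea mentioned in the abstract can be sketched for the simplest case: a single heterogeneous relation (for example, documents by terms) with a squared-error loss, which corresponds to assuming a Gaussian member of the exponential family. This is not the claimed method. The function names `block_means` and `hard_cocluster`, the caller-supplied initial labels, and the toy relation matrix are all assumptions introduced for this sketch.

```python
def block_means(R, rows, cols, k, l):
    """Mean relation value for each (row cluster, column cluster) pair."""
    total = [[0.0] * l for _ in range(k)]
    count = [[0] * l for _ in range(k)]
    for i, g in enumerate(rows):
        for j, h in enumerate(cols):
            total[g][h] += R[i][j]
            count[g][h] += 1
    # Empty blocks get a mean of 0.0 (an arbitrary choice for this sketch).
    return [[total[g][h] / count[g][h] if count[g][h] else 0.0
             for h in range(l)] for g in range(k)]


def hard_cocluster(R, rows, cols, k, l, max_iter=20):
    """Alternate hard assignments of rows and columns to clusters.

    R is a relation matrix (list of lists); rows and cols are initial
    cluster labels supplied by the caller (hypothetical interface).
    """
    rows, cols = list(rows), list(cols)
    n, m = len(R), len(R[0])
    for _ in range(max_iter):
        # Reassign each row to the cluster whose block means fit it best.
        B = block_means(R, rows, cols, k, l)
        new_rows = [
            min(range(k),
                key=lambda g: sum((R[i][j] - B[g][cols[j]]) ** 2
                                  for j in range(m)))
            for i in range(n)
        ]
        # Recompute block means, then reassign each column the same way.
        B = block_means(R, new_rows, cols, k, l)
        new_cols = [
            min(range(l),
                key=lambda h: sum((R[i][j] - B[new_rows[i]][h]) ** 2
                                  for i in range(n)))
            for j in range(m)
        ]
        if new_rows == rows and new_cols == cols:
            break  # assignments are stable
        rows, cols = new_rows, new_cols
    return rows, cols


# Toy relation: two groups of objects, each related only to its own group.
R = [[1, 1, 1, 0, 0, 0]] * 3 + [[0, 0, 0, 1, 1, 1]] * 3
# Start from labels with one row and one column deliberately misassigned.
rows, cols = hard_cocluster(R, [0, 0, 0, 0, 1, 1], [0, 0, 1, 1, 1, 1], 2, 2)
print(rows, cols)  # [0, 0, 0, 1, 1, 1] [0, 0, 0, 1, 1, 1]
```

The sketch recovers the block structure because each reassignment step cannot increase the total squared error. A soft (mixed membership) variant would replace the `min` assignments with posterior membership probabilities, and other exponential family distributions would replace the squared-error term with the matching divergence.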
20 Claims
1. A method of detection of a community in a network, comprising:
automatically optimizing an unsupervised mixed membership relational clustering model based on at least respective relationships between a plurality of interrelated data objects, dependent on different latent classes having respective latent class membership parameters, by maximizing a likelihood function to estimate unknown parameters of a joint probability distribution over latent indicators of the plurality of interrelated data objects having at least one type of data associated with different latent classes, having at least one of respective data object attributes, homogeneous relations between the respective data object and data objects having the same type, and heterogeneous relations between the respective data object and data objects having different types, and observations of the plurality of data object attributes;
clustering the interrelated plurality of data objects according to the optimized unsupervised mixed membership relational clustering model;
wherein the plurality of interrelated data objects comprise a set of web documents, wherein the respective data object attributes comprise a web document text and the relations between respective data objects comprise link information; and
responding to a web search query based on the clustering.
Dependent claims: 2, 3, 4, 5, 6
7. A method of detection of a community in a network, comprising:
automatically optimizing an unsupervised mixed membership relational clustering model based on at least respective relationships between a plurality of interrelated data objects, dependent on different latent classes having respective latent class membership parameters, by maximizing a likelihood function to estimate unknown parameters of a joint probability distribution over latent indicators of the plurality of interrelated data objects having at least one type of data associated with different latent classes, having at least one of respective data object attributes, homogeneous relations between the respective data object and data objects having the same type, and heterogeneous relations between the respective data object and data objects having different types, and observations of the plurality of data object attributes;
clustering the interrelated plurality of data objects according to the optimized unsupervised mixed membership relational clustering model;
wherein the plurality of interrelated data objects comprise a set of media objects; and
providing a media recommendation based on the clustering.
Dependent claims: 8, 9, 10, 11, 12, 13, 20
14. A method of detection of a community in a network, comprising:
automatically optimizing an unsupervised mixed membership relational clustering model based on at least respective relationships between a plurality of interrelated data objects, dependent on different latent classes having respective latent class membership parameters, by maximizing a likelihood function to estimate unknown parameters of a joint probability distribution over latent indicators of the plurality of interrelated data objects having at least one type of data associated with different latent classes, having at least one of respective data object attributes, homogeneous relations between the respective data object and data objects having the same type, and heterogeneous relations between the respective data object and data objects having different types, and observations of the plurality of data object attributes;
clustering the interrelated plurality of data objects according to the optimized unsupervised mixed membership relational clustering model;
wherein the plurality of interrelated data objects comprise a set of social network data objects, and relations comprise social links; and
detecting a social community within the social network data objects, based on the clustering.
Dependent claims: 15, 16, 17, 18, 19
Specification