QUERY LOG MINING FOR DETECTING SPAM HOSTS
Abstract
Disclosed are methods and apparatus for detecting spam hosts. In one embodiment, one or more graphs are generated using data obtained from a query log, where the one or more graphs include at least one of an anticlick graph or a view graph. Values of one or more syntactic features of the one or more graphs are ascertained. Values of one or more semantic features of the one or more graphs are determined by propagating categories from a web directory among nodes in each of the one or more graphs. Spam hosts are then detected based upon the values of the syntactic features and the semantic features.
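As an illustration of the first two steps in the abstract, the following minimal sketch builds a view graph and an anticlick graph from a toy query log and computes one simple syntactic feature. The text above does not define either graph or name any feature, so the definitions here are assumptions: a view edge links a host to every query whose results displayed it, and an anticlick edge links a host to a query when the host was shown but skipped while a lower-ranked result was clicked. The record layout, the function names `build_graphs` and `degree_features`, and the choice of node degree as the feature are hypothetical.

```python
from collections import defaultdict


def build_graphs(log):
    """Build host -> {queries} adjacency maps for two bipartite graphs.

    Each log record is assumed to be (query, hosts_shown_in_rank_order,
    set_of_clicked_hosts). This layout is hypothetical.
    """
    view = defaultdict(set)       # host was displayed for the query
    anticlick = defaultdict(set)  # host was skipped in favor of a lower result
    for query, shown, clicked in log:
        clicked_ranks = [i for i, host in enumerate(shown) if host in clicked]
        last_click = max(clicked_ranks, default=-1)
        for rank, host in enumerate(shown):
            view[host].add(query)
            # Assumed anticlick rule: not clicked, but something below it was.
            if host not in clicked and rank < last_click:
                anticlick[host].add(query)
    return view, anticlick


def degree_features(graph):
    """One possible syntactic feature: number of distinct queries per host."""
    return {host: len(queries) for host, queries in graph.items()}


# Toy log with made-up queries and hosts.
log = [
    ("cheap flights", ["spam.example", "air.example", "trip.example"], {"air.example"}),
    ("hotel deals", ["spam.example", "trip.example"], {"trip.example"}),
    ("weather", ["met.example"], {"met.example"}),
]
view, anticlick = build_graphs(log)
view_degree = degree_features(view)
anticlick_degree = degree_features(anticlick)
```

In this toy log only `spam.example` is repeatedly shown above a clicked result, so it is the only host that receives anticlick edges.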
20 Claims
1. A method, comprising:
generating one or more graphs using data obtained from a query log, the one or more graphs including at least one of an anticlick graph or a view graph;
ascertaining values of one or more syntactic features of the one or more graphs;
determining values of one or more semantic features of the one or more graphs by propagating categories from a web directory among nodes in each of the one or more graphs; and
detecting spam hosts based upon the values of the syntactic features and the semantic features.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
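The category propagation step recited in claim 1 could be sketched as below. The claim does not specify a propagation rule or which semantic features are computed, so everything here is an assumption made for illustration: hosts listed in a web directory act as fixed seeds, each remaining node repeatedly takes the normalized sum of its neighbors' category weights, and the semantic feature is the entropy of a host's resulting category distribution. The names `propagate_categories` and `entropy`, the `rounds` parameter, and the sample hosts and categories are hypothetical.

```python
import math
from collections import defaultdict


def propagate_categories(edges, seeds, rounds=2):
    """Spread directory categories over a bipartite query-host graph.

    edges: iterable of (query, host) pairs.
    seeds: {host: {category: weight}} for hosts found in the directory.
    Returns {node: {category: weight}}, where a node is ("q", query)
    or ("h", host).
    """
    neighbors = defaultdict(set)
    for query, host in edges:
        neighbors[("q", query)].add(("h", host))
        neighbors[("h", host)].add(("q", query))

    labels = {("h", host): dict(dist) for host, dist in seeds.items()}
    fixed = set(labels)  # seed hosts keep their directory categories

    for _ in range(rounds):
        updated = dict(labels)
        for node, nbrs in neighbors.items():
            if node in fixed:
                continue
            total = defaultdict(float)
            for nbr in nbrs:
                for cat, weight in labels.get(nbr, {}).items():
                    total[cat] += weight
            mass = sum(total.values())
            if mass > 0:
                updated[node] = {cat: w / mass for cat, w in total.items()}
        labels = updated  # synchronous update: one hop per round
    return labels


def entropy(dist):
    """Shannon entropy in bits of a category distribution."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)


# Toy data with made-up queries, hosts, and categories.
edges = [
    ("cheap flights", "air.example"),
    ("cheap flights", "spam.example"),
    ("mortgage rates", "bank.example"),
    ("mortgage rates", "spam.example"),
]
seeds = {"air.example": {"Travel": 1.0}, "bank.example": {"Finance": 1.0}}
labels = propagate_categories(edges, seeds)
spam_entropy = entropy(labels[("h", "spam.example")])
```

Under these assumptions, a host reached by queries from unrelated categories ends up with a spread-out distribution and a higher entropy value than the seed hosts. How such values are combined with the syntactic features to detect spam hosts is not stated in the claim, so it is left out of the sketch.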
17. A computer-readable medium storing thereon computer-readable instructions, comprising:
instructions for generating one or more graphs using data obtained from a query log, the one or more graphs including at least one of an anticlick graph or a view graph;
instructions for propagating categories from a web directory among nodes in each of the one or more graphs;
instructions for determining values of one or more semantic features of the one or more graphs after propagating categories among the nodes; and
instructions for detecting spam hosts based upon the values of the semantic features.
18. An apparatus, comprising:
a processor; and
a memory, at least one of the processor or the memory being adapted for:
generating one or more graphs using data obtained from a query log, the one or more graphs including at least one of an anticlick graph or a view graph;
ascertaining values of one or more syntactic features of the one or more graphs;
propagating categories from a web directory among nodes in each of the one or more graphs;
determining values of one or more semantic features of the one or more graphs; and
detecting spam hosts using the values of the syntactic features and the semantic features.
Dependent claims: 19, 20
Specification