QUERY LOG MINING FOR DETECTING SPAM-ATTRACTING QUERIES
Abstract
Disclosed are methods and apparatus for detecting spam-attracting queries. In one embodiment, one or more graphs are generated using data obtained from a query log, where the one or more graphs include at least one of an anticlick graph or a view graph. Values of one or more syntactic features of the one or more graphs are ascertained. Values of one or more semantic features of the one or more graphs are determined by propagating categories from a web directory among nodes in each of the one or more graphs. Spam-attracting queries are then detected based upon the values of the syntactic features and the semantic features.
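The pipeline in the abstract (build a graph from the query log, read off structural features, propagate directory categories, then flag queries) can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the log records, the `DIRECTORY` labels, the choice of node degree as the syntactic feature, category entropy as the semantic feature, the single propagation step, and the thresholds in `detect` are all assumptions made for the example. Treating a query-to-shown-URL graph as a stand-in for the view graph is likewise an assumption.

```python
from collections import Counter, defaultdict
from math import log2

# Hypothetical query-log records: (query, URL shown in the results).
LOG = [
    ("cheap pills", "pharma.example/a"),
    ("cheap pills", "casino.example/b"),
    ("cheap pills", "news.example/c"),
    ("python tutorial", "docs.example/py"),
    ("python tutorial", "docs.example/py2"),
]

# Hypothetical web-directory categories for known URLs.
DIRECTORY = {
    "pharma.example/a": "Health",
    "casino.example/b": "Games",
    "news.example/c": "News",
    "docs.example/py": "Computers",
    "docs.example/py2": "Computers",
}


def build_query_url_graph(log):
    """Bipartite graph: each query maps to a Counter of URLs shown for it."""
    graph = defaultdict(Counter)
    for query, url in log:
        graph[query][url] += 1
    return graph


def propagate_categories(graph, directory):
    """One propagation step: push each labeled URL's category to its queries."""
    labels = {}
    for query, urls in graph.items():
        dist = Counter()
        for url, weight in urls.items():
            if url in directory:
                dist[directory[url]] += weight
        labels[query] = dist
    return labels


def entropy(dist):
    """Shannon entropy (bits) of a category count distribution."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in dist.values())


def compute_features(graph, labels):
    """Assumed features: degree (syntactic), category entropy (semantic)."""
    return {
        query: {
            "degree": len(graph[query]),
            "category_entropy": entropy(labels[query]),
        }
        for query in graph
    }


def detect(feats, min_degree=3, min_entropy=1.0):
    """Flag queries whose features exceed the (arbitrary) thresholds."""
    return sorted(
        q for q, f in feats.items()
        if f["degree"] >= min_degree and f["category_entropy"] >= min_entropy
    )


graph = build_query_url_graph(LOG)
labels = propagate_categories(graph, DIRECTORY)
feats = compute_features(graph, labels)
print(detect(feats))
```

A fixed threshold is used here only to keep the sketch short; the abstract says only that detection is based upon the feature values, so a trained classifier over the same features would fit the description equally well.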
20 Claims
1. A method, comprising:
generating one or more graphs using data obtained from a query log;
ascertaining values of one or more syntactic features of the one or more graphs;
determining values of one or more semantic features of the one or more graphs by propagating categories from a web directory among nodes in each of the one or more graphs; and
detecting spam-attracting queries based upon the values of the syntactic features and the semantic features.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A computer-readable medium storing thereon computer-readable instructions, comprising:
instructions for generating one or more graphs using data obtained from a query log;
instructions for propagating categories from a web directory among nodes in each of the one or more graphs;
instructions for determining values of one or more semantic features of the one or more graphs after propagating categories among the nodes; and
instructions for detecting spam-attracting queries based upon the values of the semantic features.
18. An apparatus, comprising:
a processor; and
a memory, at least one of the processor or the memory being adapted for:
generating one or more graphs using data obtained from a query log;
ascertaining values of one or more features with respect to one or more query nodes of the one or more graphs; and
detecting spam-attracting queries based upon the values of the features.
Dependent claims: 19, 20
Specification