Content evaluation
Abstract
Evaluating content is described, including generating a data set using an attribute associated with the content, evaluating the data set using a statistical distribution to identify a class of statistical outliers, and analyzing a web page to determine whether it is part of the class of statistical outliers. A system includes a memory configured to store data, and a processor configured to generate a data set using an attribute associated with the content, evaluate the data set using a statistical distribution to identify a class of statistical outliers, and analyze a web page to determine whether it is part of the class of statistical outliers. Another technique includes crawling a set of web pages, evaluating the set of web pages to compute a statistical distribution, flagging an outlier page in the statistical distribution as web spam, and creating an index of the web pages and the outlier page for answering a query.
29 Claims
1. A method for evaluating content, comprising:
generating a data set using an attribute associated with the content;
evaluating the data set using a statistical distribution to identify a class of statistical outliers; and
analyzing a web page to determine whether it is part of the class of statistical outliers.
Dependent claims: 2-26.
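The three steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the attribute (word count), the normal-distribution model, the z-score cutoff of 2.0, and every function name below are assumptions chosen only to make the steps concrete.

```python
from statistics import mean, pstdev


def word_count(text):
    """Hypothetical attribute: number of whitespace-separated words on a page."""
    return len(text.split())


def build_data_set(pages, attribute=word_count):
    """Step 1: generate a data set, one attribute value per page URL."""
    return {url: attribute(text) for url, text in pages.items()}


def outlier_class(data_set, z_threshold=2.0):
    """Step 2: model the values with a mean and standard deviation
    (an assumed normal distribution) and return the URLs whose
    z-score magnitude exceeds the threshold."""
    values = list(data_set.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return set()  # all values identical, so nothing stands out
    return {url for url, v in data_set.items() if abs(v - mu) / sigma > z_threshold}


def is_outlier(url, outliers):
    """Step 3: determine whether a given page belongs to the outlier class."""
    return url in outliers
```

A caller would pass a mapping of URL to page text into `build_data_set`, feed the result to `outlier_class`, and then test individual pages with `is_outlier`. With very small samples a z-score cutoff can never be exceeded, so the threshold would need tuning for the data at hand.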
27. A method for evaluating content, comprising:
crawling a set of web pages;
evaluating the set of web pages to compute a statistical distribution;
flagging an outlier page in the statistical distribution as web spam; and
creating an index of the web pages and the outlier page for answering a query.
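The crawl, evaluate, flag, and index sequence of claim 27 can be sketched as follows. This is an illustration under stated assumptions, not the method the patent discloses: `fetch` is a hypothetical callable standing in for a real HTTP client, word count is an assumed attribute, the z-score cutoff is arbitrary, and ranking flagged pages last at query time is one possible use of the flag.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


def crawl(seed, fetch, limit=100):
    """Breadth-first crawl from a seed URL.
    `fetch(url)` is a hypothetical callable returning (text, links)."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        text, links = fetch(url)
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def flag_spam(pages, z_threshold=2.0):
    """Compute the distribution of word counts over the crawled set and
    flag pages whose z-score magnitude exceeds the (assumed) threshold."""
    counts = {url: len(text.split()) for url, text in pages.items()}
    values = list(counts.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return set()
    return {url for url, c in counts.items() if abs(c - mu) / sigma > z_threshold}


def build_index(pages):
    """Inverted index over all crawled pages, flagged ones included."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in set(text.lower().split()):
            index[term].add(url)
    return index


def answer(query, index, spam):
    """Answer a single-term query, listing flagged pages after unflagged ones."""
    hits = index.get(query.lower(), set())
    return sorted(hits, key=lambda url: (url in spam, url))
```

Keeping flagged pages in the index, rather than discarding them, matches the claim language ("an index of the web pages and the outlier page") and leaves the decision to demote or filter them to query time.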
28. A system for evaluating content, comprising:
a memory configured to store data; and
a processor configured to generate a data set using an attribute associated with the content, evaluate the data set using a statistical distribution to identify a class of statistical outliers, and analyze a web page to determine whether it is part of the class of statistical outliers.
29. A computer program product for evaluating content, the computer program product being embodied in a computer readable medium and comprising computer instructions for:
generating a data set using an attribute associated with the content;
evaluating the data set using a statistical distribution to identify a class of statistical outliers; and
analyzing a web page to determine whether it is part of the class of statistical outliers.
Specification