System and method for characterizing a web page using multiple anchor sets of web pages
Abstract
An improved system and method are provided for characterizing a web page using multiple anchor sets of web pages. To do so, web pages in a collection of unknown web pages may be characterized using known anchor sets of web pages that have different characterizations and that may be linked to the collection of unknown web pages. A direction and method may be selected for propagating a probability distribution between vertices of a graph representing the collection of web pages and vertices representing the anchor sets of web pages. Methods for propagating the probability distribution in a forward, backward or bidirectional direction are provided. Various quality measures of the characterization of the vertices are provided using the propagated probability distribution. These quality measures may be paired and combined in different ways to provide a characterization of the vertices representing the unknown web pages.
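The propagation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the propagation method or the quality measure, so everything below is an assumption made for illustration: a random walk with restart to the anchor set stands in for the propagation, the difference between the distributions propagated from two anchor sets (hypothetically labeled good and bad) stands in for a quality measure, and the names `propagate`, `quality`, `damping` and `iterations` are invented here.

```python
from collections import defaultdict


def propagate(edges, anchors, direction="forward", damping=0.85, iterations=50):
    """Return a probability distribution over vertices, seeded at the anchor set.

    Assumed method: random walk with restart (not taken from the patent text).
    edges is a list of (source, target) hyperlinks; anchors is a set of vertices.
    """
    anchors = set(anchors)
    if direction == "backward":
        # Follow links in reverse: mass flows from a page to the pages linking to it.
        edges = [(v, u) for u, v in edges]
    elif direction == "bidirectional":
        edges = list(edges) + [(v, u) for u, v in edges]

    vertices = sorted({x for edge in edges for x in edge} | anchors)
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)

    # Restart distribution: uniform over the anchor set, zero elsewhere.
    restart = {v: (1.0 / len(anchors) if v in anchors else 0.0) for v in vertices}
    p = dict(restart)
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) * restart[v] for v in vertices}
        for u in vertices:
            targets = out[u]
            if targets:
                share = damping * p[u] / len(targets)
                for v in targets:
                    nxt[v] += share
            else:
                # Dangling vertex: return its mass to the anchor set.
                for v in vertices:
                    nxt[v] += damping * p[u] * restart[v]
        p = nxt
    return p


def quality(edges, good_anchors, bad_anchors, direction="forward"):
    """Assumed quality measure: good-anchor mass minus bad-anchor mass per vertex."""
    g = propagate(edges, good_anchors, direction)
    b = propagate(edges, bad_anchors, direction)
    return {v: g.get(v, 0.0) - b.get(v, 0.0) for v in set(g) | set(b)}


# Toy graph: the good anchor "g" links to "a"; the bad anchor "s" links to "x".
toy_edges = [("g", "a"), ("s", "x")]
scores = quality(toy_edges, {"g"}, {"s"})
```

In this toy graph, page "a" receives mass only from the good anchor and page "x" only from the bad anchor, so their scores have opposite signs. Passing `direction="backward"` instead scores pages by what they link to rather than by what links to them.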
20 Claims
1. A computer system for characterizing a web page, comprising:
a characterization engine for characterizing a web page;
a probability distribution engine operably coupled to the characterization engine for generating a probability distribution over the vertices of a graph representing a collection of web pages; and
a characterization measure analyzer operably coupled to the characterization engine for determining a quality measure for the web page using the probability distribution.
(Dependent claims: 2, 3, 4)
5. A computer-implemented method for characterizing a web page, comprising:
receiving a plurality of anchor sets of web pages;
determining a quality measure of a characterization of a web page using the plurality of anchor sets of web pages; and
outputting an indication of the characterization of the web page.
(Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
17. A computer system for characterizing a web page, comprising:
means for receiving an anchor set of web pages with a known characterization;
means for determining a quality measure of a characterization of a web page using the anchor set of web pages; and
means for outputting an indication of the characterization of the web page.
(Dependent claims: 18, 19, 20)
Specification