Computer method and apparatus for determining content types of web pages
Abstract
Computer method and apparatus determines content type of contents of a subject Web page. A predefined set of potential content types is first provided. For each potential content type, there are one or more tests having test results that enable quantitative evaluation of the contents of the subject Web page. A respective probability of each potential content type being detected in some contents of the subject Web page is determined. A Bayesian network combines the test results to provide indications of the types of contents detected on the subject Web page. A confidence level per detected content type is also provided. A database stores the determined probabilities and confidence levels, and thus provides a cross reference between Web pages and respective content types of contents found on the Web pages.
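The combining step described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible Bayesian network, in which every test depends only on whether the content type is present (a naive Bayes model); the abstract does not specify the network structure. The content types, test functions, priors, and likelihood values below are all hypothetical and chosen purely for illustration. Confidence levels and database storage are not covered.

```python
# Hypothetical prior probability that each content type appears on a page.
PRIORS = {"press_release": 0.10, "job_listing": 0.05}

# Hypothetical tests per content type. Each test inspects the page text
# and returns True (fired) or False (did not fire).
TESTS = {
    "press_release": {
        "has_release_phrase": lambda text: "for immediate release" in text.lower(),
        "mentions_contact": lambda text: "contact:" in text.lower(),
    },
    "job_listing": {
        "mentions_salary": lambda text: "salary" in text.lower(),
        "has_apply_phrase": lambda text: "apply now" in text.lower(),
    },
}

# Hypothetical likelihoods per test:
# (P(test fires | type present), P(test fires | type absent)).
LIKELIHOODS = {
    "press_release": {
        "has_release_phrase": (0.8, 0.1),
        "mentions_contact": (0.7, 0.3),
    },
    "job_listing": {
        "mentions_salary": (0.6, 0.05),
        "has_apply_phrase": (0.9, 0.2),
    },
}


def posterior(prior, likelihoods, results):
    """Combine boolean test results with Bayes' rule.

    Assumes the tests are conditionally independent given whether the
    content type is present (the naive Bayes assumption).
    """
    p_present = prior
    p_absent = 1.0 - prior
    for name, fired in results.items():
        fire_if_present, fire_if_absent = likelihoods[name]
        p_present *= fire_if_present if fired else 1.0 - fire_if_present
        p_absent *= fire_if_absent if fired else 1.0 - fire_if_absent
    return p_present / (p_present + p_absent)


def classify(text):
    """Return a probability per content type for the given page text."""
    probabilities = {}
    for content_type, tests in TESTS.items():
        results = {name: test(text) for name, test in tests.items()}
        probabilities[content_type] = posterior(
            PRIORS[content_type], LIKELIHOODS[content_type], results
        )
    return probabilities


page = "FOR IMMEDIATE RELEASE. Acme announces a new product. Contact: press office."
probs = classify(page)
```

With these made-up numbers, both press-release tests fire on the sample page and neither job-listing test does, so the press-release probability rises well above its prior while the job-listing probability falls below its prior.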
18 Claims
1. A method of determining content type of contents of a subject Web page, comprising the steps of:
providing a predefined set of potential content types;
for each potential content type, running tests having test results which enable quantitative evaluation of at least some contents of the subject Web page being of the potential content type;
mathematically combining the test results; and
based on the combined test results, assigning a respective probability, for each potential content type, that some contents of that type exists on the subject Web page.
10. Apparatus for determining content type of contents of a subject Web page, comprising:
a predefined set of potential content types; and
a test module utilizing the predefined set, the test module employing a plurality of processor-executed tests having test results which enable, for each potential content type, quantitative evaluation of at least some contents of the subject Web page being of the potential content type, for each potential content type, the test module (i) running at least a subset of the tests, (ii) combining the test results and (iii) for each potential content type, assigning a respective probability that at least some contents of that type exists on the subject Web page being of the potential content type.
Specification