Probabilistic models and methods for combining multiple content classifiers
Abstract
The invention applies a probabilistic approach to combining evidence regarding the correct classification of items. Training data and machine learning techniques are used to construct probabilistic dependency models that effectively utilize evidence. The evidence includes the outputs of one or more classifiers and optionally one or more reliability indicators. The reliability indicators are, in a broad sense, attributes of the items being classified. These attributes can include characteristics of an item, source of an item, and meta-level outputs of classifiers applied to the item. The resulting models include meta-classifiers, which combine evidence from two or more classifiers, and tuned classifiers, which use reliability indicators to inform the interpretation of classical classifier outputs. The invention also provides systems and methods for identifying new reliability indicators.
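As a rough illustration of the meta-classifier idea in the abstract, the sketch below learns a model from training examples whose evidence is the outputs of two base classifiers plus one reliability indicator. It is only a sketch under stated assumptions: logistic regression stands in for the probabilistic dependency models (the abstract does not name a model family), and the feature layout `(score_a, score_b, is_short)`, the toy data, and all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_classifier(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights over evidence vectors by gradient descent.

    Assumption: logistic regression is a stand-in for the probabilistic
    dependency models described in the abstract.
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)  # last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + weights[-1])
            err = p - y
            for i, v in enumerate(x):
                grad[i] += err * v
            grad[-1] += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict_proba(weights, x):
    """Probability that the item belongs to the category, given its evidence."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + weights[-1])


# Hypothetical evidence per item: (score from classifier A, score from
# classifier B, reliability indicator: 1.0 if the item is short, else 0.0).
rows = [
    (0.9, 0.8, 0.0),
    (0.8, 0.7, 1.0),
    (0.7, 0.9, 0.0),
    (0.2, 0.3, 1.0),
    (0.1, 0.2, 0.0),
    (0.3, 0.1, 1.0),
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = item belongs to the category

weights = train_meta_classifier(rows, labels)
p_in = predict_proba(weights, (0.9, 0.8, 0.0))
p_out = predict_proba(weights, (0.1, 0.2, 0.0))
```

To cover several categories, one such model could be trained per category, each free to weight the classifier outputs differently.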
30 Claims
1. A computer system for classifying items, comprising:
a plurality of classifiers;
a computer system component comprising probabilistic dependency models, one for each of a plurality of categories, the computer system component applies the probabilistic dependency models to an item to provide with respect to each of the plurality of categories an indication of whether the item belongs;
wherein the probabilistic dependency models collectively employ outputs from the plurality of classifiers; and
the outputs employed by the probabilistic dependency models vary among the probabilistic dependency models. - View Dependent Claims (2, 3, 4)
5. A computer system for classifying items, comprising:
a plurality of classifiers; and
a computer system component that applies a probabilistic dependency model to classify an item, wherein the probabilistic dependency model contains dependencies on one or more classical outputs from the plurality of classifiers and dependencies on one or more reliability indicators. - View Dependent Claims (6, 7, 8)
9. A computer system, comprising:
a plurality of classifiers; and
a first computer system component that learns, from training examples, probabilistic dependency models for classifying items according to one or more reliability indicators together with classical outputs from the plurality of classifiers. - View Dependent Claims (10, 11, 12, 13, 29)
14. A computer readable medium having computer executable instructions for performing steps comprising:
implementing a plurality of classifiers adapted to receive and classify at least one item, the plurality of classifiers each generating a score related to classification of the at least one item; and
for each of one or more categories, facilitating classification, selection, and/or utilization of the at least one item with a probabilistic dependency model that employs one or more of the scores and, in addition, one or more reliability indicators. - View Dependent Claims (15)
16. A system for classifying items, comprising:
means for determining a model that classifies the items based on a probabilistic approach that combines information about the items including one or more classical outputs of classifiers and one or more reliability indicators; and
means for applying the model to classify the items.
17. A computer-readable medium having stored thereon a data structure useful in classifying items, comprising:
first data fields containing data representing an attribute to test, wherein the attributes represented include both classical classifier outputs and reliability indicators;
second data fields corresponding to the first data fields and containing data representing values against which to compare the attributes;
third data fields containing data representing classifier outcomes;
fourth data fields facilitating determination of relationships among instances of the first, second, and third data fields, the relationships having a decision tree structure with the first and second data fields corresponding to decision nodes and the third data fields corresponding to leaf nodes. - View Dependent Claims (18)
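The data structure recited in claim 17 can be pictured as an ordinary decision tree. The sketch below is one possible in-memory layout, not the patented encoding: the class names, the attribute names (`doc_length`, `naive_bayes_score`, `svm_score`), the thresholds, and the leaf probabilities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    # "Third data field": the classifier outcome stored at a leaf node.
    probability: float


@dataclass
class DecisionNode:
    # "First data field": the attribute to test (a classical classifier
    # output or a reliability indicator).
    attribute: str
    # "Second data field": the value the attribute is compared against.
    threshold: float
    # "Fourth data fields": links that give the fields a tree structure.
    if_below: "Node"
    if_at_or_above: "Node"


Node = Union[Leaf, DecisionNode]


def evaluate(node: Node, evidence: Dict[str, float]) -> float:
    """Walk from the root to a leaf, testing one attribute per decision node."""
    while isinstance(node, DecisionNode):
        value = evidence[node.attribute]
        node = node.if_below if value < node.threshold else node.if_at_or_above
    return node.probability


# Hypothetical tree: the reliability indicator (document length) decides
# which classifier's output is consulted next.
tree = DecisionNode(
    "doc_length", 100.0,
    if_below=DecisionNode("naive_bayes_score", 0.5, Leaf(0.1), Leaf(0.8)),
    if_at_or_above=DecisionNode("svm_score", 0.0, Leaf(0.05), Leaf(0.9)),
)

short_doc = {"doc_length": 40.0, "naive_bayes_score": 0.7, "svm_score": -1.2}
long_doc = {"doc_length": 500.0, "naive_bayes_score": 0.7, "svm_score": -1.2}
result_short = evaluate(tree, short_doc)
result_long = evaluate(tree, long_doc)
```

Both example items carry the same classifier scores, yet they reach different leaves because the reliability indicator routes them to different tests.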
19. A method of generating a classifier, comprising:
obtaining a set of training examples;
applying a probabilistic approach that uses the training examples to develop a model that combines evidence to provide an output relating to whether an item belongs in a category; and
storing the model in a computer-readable media for use as a classifier;
wherein the evidence comprises one or more classical outputs of other classifiers and one or more attributes of the item other than classical outputs of classifiers. - View Dependent Claims (20, 21, 22, 23, 30)
24. A method of classifying items, comprising:
applying probabilistic dependency models, one for each of a plurality of categories, to an item stored in computer readable format to provide an output relating to whether the item belongs in the category with respect to each of the plurality of categories;
wherein the probabilistic dependency models collectively contain dependencies on outputs from a plurality of classifiers; and
the outputs considered by the probabilistic dependency models vary among the probabilistic dependency models. - View Dependent Claims (25, 26)
27. A method of combining a plurality of classifiers to classify items, comprising:
sequentially applying tests to the items to obtain test results; and
classifying the items based on the test results, wherein the sequence of tests applied varies among the items in that the outcome of one or more tests affects whether another test is applied, whereby the classifiers utilized vary depending on the items. - View Dependent Claims (28)
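The sequential testing in claim 27 might look like the following in practice: each test's outcome decides whether a further classifier is run at all, so different items exercise different classifiers. This is a minimal sketch with assumptions throughout; the two toy classifiers, the word list, and the thresholds are hypothetical and chosen only to make the control flow visible.

```python
KEYWORDS = {"invoice", "payment", "due"}  # hypothetical vocabulary


def keyword_score(text):
    """Toy classifier A: 1.0 if the word 'invoice' appears, else 0.0."""
    return 1.0 if "invoice" in text.lower() else 0.0


def ratio_score(text):
    """Toy classifier B: fraction of words that are billing keywords."""
    words = text.lower().split()
    return sum(w in KEYWORDS for w in words) / len(words)


def classify_sequentially(text):
    """Return (label, classifiers_used); later tests depend on earlier outcomes."""
    used = []
    # Test 1: a cheap check on the item itself decides which classifier runs.
    if len(text.split()) < 5:
        used.append("keyword")
        return keyword_score(text) >= 0.5, used
    # Test 2: the ratio classifier; a confident score ends the sequence.
    used.append("ratio")
    score = ratio_score(text)
    if score >= 0.25 or score <= 0.05:
        return score >= 0.25, used
    # Test 3: only reached when test 2 was inconclusive.
    used.append("keyword")
    return keyword_score(text) >= 0.5, used


r_short = classify_sequentially("Invoice attached")
r_clear = classify_sequentially(
    "please find the payment due for the invoice this week")
r_unsure = classify_sequentially(
    "the payment for lunch was sorted out by Sam yesterday")
```

The three example items invoke one, one, and two classifiers respectively, which is the sense in which the classifiers utilized vary depending on the item.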
Specification