Training apparatus and method

US 5,671,333 A
Filed: 04/07/1994
Issued: 09/23/1997
Est. Priority Date: 04/07/1994
Status: Expired due to Fees

6 Assignments

0 Petitions

Abstract

Apparatus and methods for training classifiers. The apparatus includes a degree of certainty classifier which classifies examples in categories and indicates a degree of certainty regarding the classification and an annotating classifier which receives classified examples with a low degree of certainty from the degree of certainty classifier and annotates them to indicate whether their classifications are correct. The annotated examples are then used to train another classifier. In one version of the invention, the other classifier is a new version of the degree of certainty classifier, and training continues until the degree of certainty classifier has satisfactory performance. The degree of certainty classifier of the embodiment is a probabilistic binary classifier which is trained using relevance feedback. The annotating classifier may include an interactive interface which permits a human user of the system to examine an example and indicate whether it was properly classified.
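
The training loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy one-dimensional classifier, the `oracle` function standing in for the annotating classifier or human judge, and every name and parameter below are assumptions made for illustration only.

```python
import math
import random

def train(labeled):
    """Fit a toy 1-D probabilistic binary classifier from (x, label) pairs.

    Assumes at least one example of each class is present.
    """
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    # Toy assumption: the decision boundary is the midpoint of the class means.
    boundary = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: 1.0 / (1.0 + math.exp(-(x - boundary)))

def uncertainty(p):
    """Return 1.0 when p is exactly 0.5 and 0.0 when p is 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)

def active_learning(pool, oracle, seed_labeled, rounds=5, batch=2):
    """Repeatedly annotate the least certain examples and retrain."""
    labeled = list(seed_labeled)
    unlabeled = list(pool)
    classifier = train(labeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Chooser: put the examples the current classifier is least sure of first.
        unlabeled.sort(key=lambda x: uncertainty(classifier(x)), reverse=True)
        chosen, unlabeled = unlabeled[:batch], unlabeled[batch:]
        # Annotating step: the oracle supplies the preferred interpretation.
        labeled.extend((x, oracle(x)) for x in chosen)
        # Produce the next version of the classifier from the annotated examples.
        classifier = train(labeled)
    return classifier, labeled

# Usage with synthetic data; the true boundary of 3.0 is an arbitrary choice.
rng = random.Random(0)
pool = [rng.uniform(0.0, 6.0) for _ in range(20)]
oracle = lambda x: 1 if x > 3.0 else 0
classifier, labeled = active_learning(pool, oracle, [(0.5, 0), (5.5, 1)])
```

Only the examples nearest the current decision boundary are sent for annotation, so the annotator's effort goes to the cases where the classifier is least reliable.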

130 Citations

15 Claims

1. An apparatus for interpreting data, comprising:
- a current first classifier operative to interpret a plurality of actual examples of the data and to output an interpretation of each interpreted example and a certainty value associated with each interpretation wherein the current first classifier comprises a chooser operative to discriminate between certain ones of the outputted interpretations having respective high certainty values and uncertain ones of the outputted interpretations having respective low certainty values and to select and output each of the actual examples associated with a respective uncertain one of the interpretations;
  
  a second classifier operative to annotate each of the interpreted examples associated with the selected uncertain ones of the interpretations and to output a preferred interpretation for each interpreted example associated with the selected uncertain ones of the interpretations; and
  
  an uncertainty measuring device generator operative to produce a next first classifier by utilizing at least one annotated example and its associated preferred interpretation, the next first classifier capable of interpreting subsequent actual examples of the data more accurately than the current first classifier.
- Dependent claims (2, 3, 8, 9, 10, 11, 12, 15)
  - 2. The apparatus set forth in claim 1 further comprising an uncertainty measuring device that computes an estimate of the entropy of the classification to each unannotated data portion.
  - 3. The apparatus set forth in claim 1, wherein:
    - said examples are divided into data portions by a segmenter.
  - 8. The apparatus set forth in claim 1, claim 4 or claim 7, wherein the data is text.
  - 9. The apparatus set forth in claim 1, claim 4, or claim 7 wherein:
    - the data is text.
  - 10. The apparatus set forth in claim 1, claim 4, or claim 7 wherein:
    - the second classifier includes interactive means for providing the example to a human judge and receiving the annotation from the judge.
  - 11. The apparatus set forth in claim 1, claim 4 or claim 7 wherein:
    - the uncertainty measuring device generator includes means for using relevance feedback to produce the next first classifier.
  - 12. The apparatus set forth in claim 1, claim 4 or claim 7 wherein:
    - the classifier classifies data into classes;
      
      the uncertainty measuring device generator includes:
      
      means for producing a class membership score for each annotated uncertain example, and
      
      means for using logistic regression to produce a parameter for each annotated uncertain example which modifies the class membership score to indicate a probability that the annotated uncertain example belongs to the class.
  - 15. The apparatus set forth in claim 1, claim 4 or claim 7, wherein each of the respective low certainty values is less than a lowest one of the respective high certainty values.
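
Claim 12 recites using logistic regression to turn a class membership score into a probability. One way this could look is sketched below, assuming a single raw score per example and a plain gradient-descent fit; the function names, the learning rate, and the sample scores are all hypothetical and do not come from the patent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(scores, labels, lr=0.1, epochs=2000):
    """Fit p = sigmoid(a * score + b) by batch gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # gradient of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical class membership scores with the annotations they received.
scores = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(scores, labels)

def probability(score):
    """Estimated probability that an example with this score is in the class."""
    return sigmoid(a * score + b)
```

The fitted parameters `a` and `b` rescale the raw score so that the output can be read as a probability, which is what a chooser needs in order to compare certainty values across examples.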

4. An apparatus for interpreting data, comprising:
- a first classifier operative to interpret a plurality of actual examples of the data and to output an interpretation of each interpreted example and a certainty value associated with each interpretation wherein the first classifier comprises a chooser operative to discriminate between certain ones of the outputted interpretations having respective high certainty values and uncertain ones of the outputted interpretations having respective low certainty values and to select and output each of the interpreted examples associated with a respective uncertain one of the interpretations;
  
  a second classifier operative to annotate each of the interpreted examples associated with the selected uncertain ones of the interpretations and to output a preferred interpretation for each interpreted example associated with the selected uncertain ones of the interpretations; and
  
  an uncertainty measuring device generator operative to produce a next first classifier by using at least one annotated example and its associated preferred interpretation, the next first classifier capable of interpreting subsequent actual examples of the data more accurately than the first classifier.
- Dependent claims (5, 6)
  - 5. The apparatus of claim 4 further comprising an uncertainty measuring device that computes an estimate of the entropy of the classification to each unannotated data portion.
  - 6. The apparatus of claim 4, wherein:
    - said examples are divided into data portions by a segmenter.
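
The entropy estimate recited in claims 2, 5 and 13 can be sketched for the binary case, which is the case the abstract describes for the embodiment. The sketch assumes the classifier already outputs a probability for each unannotated data portion; the helper names are hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-class distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain classification carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def most_uncertain(probabilities, k):
    """Indices of the k data portions with the highest classification entropy."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: binary_entropy(probabilities[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical classifier outputs for four unannotated data portions.
selected = most_uncertain([0.9, 0.55, 0.1, 0.48], k=2)
```

Entropy peaks at a probability of 0.5 and falls to zero at 0 and 1, so ranking by entropy selects the same examples as ranking by distance from 0.5 in the two-class case.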

7. An apparatus for interpreting data, comprising:
- a first classifier operative to interpret a plurality of actual examples of the data according to a first principle and to output an interpretation of each interpreted example and a certainty value associated with each interpretation wherein the first classifier comprises a chooser operative to discriminate between certain ones of the outputted interpretations having respective high certainty values and uncertain ones of the outputted interpretations having low certainty values and to select and output each of the interpreted examples associated with a respective uncertain one of the interpretations;
  
  a second classifier operative to annotate each of the interpreted examples associated with the selected uncertain ones of the interpretations and to output a preferred interpretation for each interpreted example associated with the selected uncertain ones of the interpretations; and
  
  an uncertainty measuring device generator operative to produce a third classifier by utilizing at least one annotated example and its associated preferred interpretation, the third classifier operative to interpret subsequent actual examples of the data according to a second principle different from the first principle.
- Dependent claims (13, 14)
  - 13. The apparatus of claim 7 further comprising an uncertainty measuring device that computes an estimate of the entropy of the classification to each unannotated data portion.
  - 14. The apparatus of claim 7, wherein:
    - said examples are divided into data portions by a segmenter.
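
Claim 7 differs from claims 1 and 4 in that the annotated examples train a third classifier that works on a different principle from the first. A minimal sketch of that arrangement follows, assuming a logistic-score first classifier and a 1-nearest-neighbour third classifier; both choices, the certainty threshold, and the `annotator` stand-in for the second classifier are illustrative assumptions, not details taken from the patent.

```python
import math

def first_classifier(x, boundary=3.0):
    """Hypothetical first classifier: logistic score on a 1-D feature.

    Returns (interpretation, certainty), with certainty in [0, 1].
    """
    p = 1.0 / (1.0 + math.exp(-(x - boundary)))
    label = 1 if p >= 0.5 else 0
    certainty = abs(2.0 * p - 1.0)
    return label, certainty

def choose_uncertain(examples, threshold=0.5):
    """Chooser: keep examples whose certainty falls below the threshold."""
    return [x for x in examples if first_classifier(x)[1] < threshold]

def train_nearest_neighbor(annotated):
    """Third classifier on a different principle: 1-nearest-neighbour lookup."""
    def classify(x):
        _, nearest_label = min(annotated, key=lambda pair: abs(pair[0] - x))
        return nearest_label
    return classify

examples = [0.5, 2.4, 2.9, 3.3, 3.8, 5.5]
uncertain = choose_uncertain(examples)

# Hypothetical annotator whose true class boundary is 3.5, not 3.0.
annotator = lambda x: 1 if x > 3.5 else 0
annotated = [(x, annotator(x)) for x in uncertain]

third_classifier = train_nearest_neighbor(annotated)
```

Because a single threshold separates the two groups, every certainty value treated as low is below every value treated as high, which matches the condition in claim 15. In this toy data the first classifier assigns 3.4 to class 1, while the third classifier, trained only on the annotated uncertain examples, assigns it to class 0 in agreement with the annotator.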

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Catlett, Jason A.; Lewis, David Dolan; Gale, William Arthur
Primary Examiner(s): Hafiz, Tariq R.
Application Number: US08/224,599
Time in Patent Office: 1,265 Days
Field of Search: 395/11, 395/20-23, 395/10, 395/25, 395/27, 382/155-163
US Class Current: 706/12
CPC Class Codes: G06N 20/00 Machine learning
