Method for training a statistical classifier with reduced tendency for overfitting

US 5,903,884 A
Filed: 08/08/1995
Issued: 05/11/1999
Est. Priority Date: 08/08/1995
Status: Expired due to Term

Abstract

To prevent overfitting a neural network to a finite set of training samples, random distortions are dynamically applied to the samples each time they are applied to the network during a training session. A plurality of different types of distortions can be applied, which are randomly selected each time a sample is applied to the network. Alternatively, a combination of two or more types of distortion can be applied each time, with the amount of distortion being randomly varied for each type.
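The two variants named in the abstract (randomly picking one distortion type per presentation, or combining several types with a randomly varied amount each) can be sketched roughly as follows. This is only an illustration: the distortion names, the numeric ranges, and the function names `pick_one_type` and `combine_all_types` are assumptions, not values or identifiers taken from the patent.

```python
import random

# Hypothetical distortion types and ranges; the abstract names no specific values.
DISTORTION_RANGES = {
    "rotation_deg": (-10.0, 10.0),
    "aspect_ratio": (0.8, 1.25),
    "skew": (-0.2, 0.2),
}
# Parameter values that leave a sample unchanged.
IDENTITY = {"rotation_deg": 0.0, "aspect_ratio": 1.0, "skew": 0.0}


def pick_one_type(rng):
    """First variant: randomly select a single distortion type and its amount."""
    params = dict(IDENTITY)
    name = rng.choice(sorted(DISTORTION_RANGES))
    low, high = DISTORTION_RANGES[name]
    params[name] = rng.uniform(low, high)
    return params


def combine_all_types(rng):
    """Second variant: apply every type, each with a randomly varied amount."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in DISTORTION_RANGES.items()}


rng = random.Random(0)
for presentation in range(3):
    # Fresh parameters are drawn every time a sample is presented to the network.
    print(presentation, pick_one_type(rng), combine_all_types(rng))
```

Because parameters are drawn at presentation time, no distorted copies of the training set need to be stored in advance.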

24 Claims

1. A method for training a statistical classifier to recognize input patterns that belong to respective predetermined classes, utilizing a set of training samples which are respectively associated with said classes, comprising the following steps which are repeated over a large number of iterations:
- selecting a training sample from said set of training samples;
  
  producing a set of distortion parameters;
  
  selectively distorting said training sample in accordance with said distortion parameters to compute a classifier input pattern; and
  
  training the classifier using said classifier input pattern.
- Dependent claims (2-14):
  - 2. The training method of claim 1 wherein said distortion parameters are produced according to an effectively random process.
  - 3. The training method of claim 1 wherein said distortion parameters are produced according to a deterministic pattern.
  - 4. The training method of claim 1 wherein said distorting step further comprises the steps of:
    - computing said classifier input pattern from said training sample, substantially independent of said distortion parameters; and
      
      distorting said classifier input pattern in accordance with said distortion parameters.
  - 5. The method of claim 4 wherein the step of distorting said classifier input pattern comprises distorting images of said patterns.
  - 6. The training method of claim 1 wherein said distorting step further comprises the steps of:
    - distorting a first data set representative of said training sample in accordance with said distortion parameters; and
      
      converting said first data set to a second data set representative of said classifier input pattern.
  - 7. The method of claim 6 wherein said first data set comprises x and y coordinates representative of handwritten characters and the step of distorting said first data set comprises distorting said x and y coordinate values.
  - 8. The method of claim 1 wherein said set of distortion parameters comprise plural types of distortion.
  - 9. The method of claim 8 wherein said step of producing said set of distortion parameters includes the step of selecting one or more types of distortion.
  - 10. The method of claim 9 further including the step of randomly selecting an amount of distortion for each type of distortion to be applied to said images during each iteration.
  - 11. The method of claim 9 wherein one of said types of distortion comprises varying the aspect ratio of an image.
  - 12. The method of claim 9 wherein one of said types of distortion comprises rotating an image.
  - 13. The method of claim 1 wherein said training sample is repeatedly used to train the classifier over multiple iterations, and a new set of distortion parameters is produced for each iteration to dynamically compute different classifier input patterns for each iteration.
  - 14. The method of claim 1 wherein a plurality of training samples are successively selected from said set of training samples, and a new set of distortion parameters is produced for each selected sample.
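Claims 6, 7, 11 and 12 together describe distorting a first data set (x and y coordinates of a handwritten character) and then converting it to the classifier input pattern. A minimal sketch of that order of operations, assuming a rotation about the centroid, a horizontal stretch for the aspect ratio, and a small binary grid as the second data set; the function names and the grid size are hypothetical and do not come from the patent:

```python
import math
import random


def distort_points(points, rotation_deg, aspect_ratio):
    """Rotate (x, y) points about their centroid, then stretch x by aspect_ratio."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    theta = math.radians(rotation_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rx = dx * cos_t - dy * sin_t
        ry = dx * sin_t + dy * cos_t
        out.append((cx + rx * aspect_ratio, cy + ry))
    return out


def rasterize(points, size=8):
    """Convert coordinates (first data set) into a flat binary grid (second data set)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Use one common scale for both axes so the distorted shape is preserved.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    grid = [0] * (size * size)
    for x, y in points:
        col = min(size - 1, int((x - min_x) / span * size))
        row = min(size - 1, int((y - min_y) / span * size))
        grid[row * size + col] = 1
    return grid


rng = random.Random(0)
stroke = [(0.0, float(i)) for i in range(10)]  # a vertical pen stroke
# Assumed ranges, chosen only for illustration.
angle = rng.uniform(-15.0, 15.0)
ratio = rng.uniform(0.8, 1.25)
pattern = rasterize(distort_points(stroke, angle, ratio))
print(len(pattern), sum(pattern))
```

Claims 4 and 5 describe the opposite order, in which the input pattern (for example an image) is computed first and the distortion is applied to it afterwards.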

15. A method for training a statistical classifier to recognize input patterns that belong to respective predetermined classes, comprising the steps of:
- (i) generating a set of training sample patterns which are respectively associated with said classes;
  
  (ii) processing sample patterns from said set in the classifier to generate output values;
  
  (iii) determining error values based on differences between said output values and target values associated with said sample patterns;
  
  (iv) adjusting operating parameters of the classifier in accordance with said error values;
  
  (v) iteratively repeating steps (ii)-(iv) with the sample patterns from said set; and
  
  (vi) modifying said patterns prior to processing them in the classifier, during successive iterations of steps (ii)-(iv).
- Dependent claims (16-22):
  - 16. The method of claim 15 wherein the step of modifying said patterns comprises distorting images of said patterns.
  - 17. The method of claim 16 wherein plural types of distortion are applied to said images.
  - 18. The method of claim 17 further including the step of randomly selecting one or more types of distortions to be applied to said images during each iteration.
  - 19. The method of claim 17 further including the step of randomly selecting an amount of distortion for each type of distortion to be applied to said images during each iteration.
  - 20. The method of claim 17 wherein one of said types of distortion comprises varying the aspect ratio of an image.
  - 21. The method of claim 17 wherein one of said types of distortion comprises rotating an image.
  - 22. The method of claim 17 wherein one of said types of distortion comprises varying the angular orientation of an image.
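Steps (ii) through (vi) of claim 15 map onto an ordinary error-driven training loop in which each pattern is modified just before it is processed. The sketch below assumes a single-layer linear classifier trained with the delta rule and uses small additive jitter as the modification. Both choices are stand-ins picked for brevity; the claim itself does not prescribe a classifier architecture, an update rule, or a modification type.

```python
import random


def modify(pattern, rng, noise=0.1):
    """Step (vi): hypothetical modification, here small additive jitter."""
    return [v + rng.uniform(-noise, noise) for v in pattern]


def train(samples, num_classes, epochs=50, rate=0.1, seed=0):
    """samples is a list of (pattern, class_index) pairs, as in step (i)."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    # One weight row per class; the last entry of each row is a bias term.
    weights = [[0.0] * (dim + 1) for _ in range(num_classes)]
    for _ in range(epochs):  # step (v): iterate over the sample set
        for pattern, label in samples:
            # Step (vi): modify the pattern before it is processed.
            x = modify(pattern, rng) + [1.0]
            # Step (ii): process the pattern to generate output values.
            outputs = [sum(w * v for w, v in zip(row, x)) for row in weights]
            # Step (iii): error values from target/output differences.
            targets = [1.0 if c == label else 0.0 for c in range(num_classes)]
            errors = [t - o for t, o in zip(targets, outputs)]
            # Step (iv): adjust operating parameters according to the errors.
            for row, err in zip(weights, errors):
                for i, v in enumerate(x):
                    row[i] += rate * err * v
    return weights


def classify(weights, pattern):
    """Return the index of the class with the highest output value."""
    x = list(pattern) + [1.0]
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return scores.index(max(scores))


toy_samples = [([0.0, 0.0], 0), ([1.0, 1.0], 1)]
toy_weights = train(toy_samples, num_classes=2, epochs=200)
print(classify(toy_weights, [0.0, 0.0]), classify(toy_weights, [1.0, 1.0]))
```

Since `modify` is called inside the loop, the classifier sees a slightly different version of each stored pattern on every iteration.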

23. A method for training a statistical classifier to recognize input patterns that belong to respective predetermined classes, utilizing a set of training samples which are respectively associated with said classes, comprising the following steps:
- selecting a first training sample from said set of training samples;
  
  producing a first set of distortion parameters;
  
  selectively distorting said first training sample in accordance with said first set of distortion parameters to dynamically compute a first classifier input pattern;
  
  training the classifier using said first classifier input pattern;
  
  selecting a second training sample from said set of training samples;
  
  producing a second set of distortion parameters;
  
  selectively distorting said second training sample in accordance with said second set of distortion parameters to dynamically compute a second classifier input pattern; and
  
  training the classifier using said second classifier input pattern.

24. A method for training a statistical classifier to recognize input patterns that belong to respective predetermined classes, utilizing a set of training samples which are respectively associated with said classes, comprising the following steps:
- selecting a training sample from said set of training samples;
  
  producing a first set of distortion parameters;
  
  selectively distorting said training sample in accordance with said first set of distortion parameters to compute a first classifier input pattern;
  
  training the classifier using said first classifier input pattern;
  
  subsequently selecting said training sample from said set of training samples;
  
  producing a second set of distortion parameters;
  
  selectively distorting said training sample in accordance with said second set of distortion parameters, to compute a second classifier input pattern; and
  
  training the classifier using said second classifier input pattern.
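Claim 24 covers the case where one stored training sample is selected twice and distorted with two different parameter sets. A small sketch of that sequence, assuming an independent x/y scaling as the distortion; `produce_params`, `scale_sample`, and the scale range are hypothetical names and values used only for illustration:

```python
import random


def produce_params(rng):
    """Produce one set of distortion parameters (assumed range)."""
    return {"scale_x": rng.uniform(0.8, 1.2), "scale_y": rng.uniform(0.8, 1.2)}


def scale_sample(sample, params):
    """Hypothetical distortion: scale x and y by separate factors."""
    sx, sy = params["scale_x"], params["scale_y"]
    return [(x * sx, y * sy) for x, y in sample]


rng = random.Random(42)
sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]  # one stored training sample

first_params = produce_params(rng)
first_input = scale_sample(sample, first_params)
# ... train the classifier on first_input ...

second_params = produce_params(rng)  # later presentation of the same sample
second_input = scale_sample(sample, second_params)
# ... train the classifier on second_input ...

print(first_params, second_params)
```

The stored sample is never overwritten; each classifier input pattern is computed from it on demand and can be discarded after the training step.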

Current Assignee
Apple Inc.
Original Assignee
Apple Computer Incorporated (Apple Inc.)
Inventors
Stafford, William; Lyon, Richard F.
Primary Examiner(s)
Downs, Robert W.

Application Number
US08/512,361
Time in Patent Office
1,372 Days
Field of Search
395/23, 395/20, 382/155, 382/156, 382/157, 382/159, 382/160, 382/170, 382/190, 706/25, 706/20
US Class Current
706/25
CPC Class Codes
G06F 18/28 Determining representative ...
