High-order entropy error functions for neural classifiers
Abstract
An automatic speech recognition system comprising a speech decoder to resolve phone- and word-level information, and a vector generator to generate information vectors on which a confidence measure is based by an artificial neural network (ANN) classifier. An error signal is designed which is not subject to false saturation or over-specialization. The error signal is integrated into an error function which is back-propagated through the ANN.
12 Claims
1. An artificial neural network classifier comprising:

an input layer of one or more nodes to receive training data;

an output layer L of one or more nodes to provide an actual output indicating a level of training of the artificial neural network classifier based on the training data; and

at least a hidden layer of one or more nodes intermediate said input layer and said output layer, each node to receive input values via the input layer, each node to perform a transformation of the received input values based on a set of weights, each transformation to determine in part the actual output of output layer L, the set of weights to be updated based at least in part on an error function having an operator of the form where y_j^L is an actual output at node j of output layer L, t_j is a target output at the node j, and n is greater than or equal to two, wherein the updated set of weights is to be used in determining confidence level information for a feature vector.

Dependent claims: 2, 3.
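The operator itself appears only as an image in the source and is not reproduced in the claim text above. As a minimal sketch, assuming the operator is the high-order difference (t_j − y_j^L)^n suggested by the title and the named variables, the error function and the error signal derived from it might look like:

```python
import numpy as np

def high_order_error(y, t, n=2):
    """Assumed error function E = sum_j (t_j - y_j^L)^n / n, n >= 2.

    y: actual outputs y_j^L of output layer L; t: target outputs t_j.
    The exact operator is an image in the source patent, so this exact
    form is an assumption, not the claimed formula.
    """
    return float(np.sum((t - y) ** n) / n)

def error_signal(y, t, n=2):
    """dE/dy for the assumed operator: -(t - y)^(n - 1).

    With n = 2 this reduces to the familiar squared-error delta."""
    return -((t - y) ** (n - 1))
```

With n = 2 the function is half the sum-of-squares error; even values of n greater than two weight large residuals more heavily relative to small ones.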
4. An automatic speech recognition system comprising:

a speech decoder to decode acoustic data;

a vector generator to calculate a feature vector based on the decoded acoustic data; and

an adaptive neural network classifier to determine a confidence in the calculated feature vector, the determining based on a training of the adaptive neural network classifier according to an error function having an operator of the form where y_j^L is an actual output at node j of an output layer L of the adaptive neural network classifier, t_j is a target output at the node j, and n is greater than or equal to two.

Dependent claims: 5, 6, 7, 8.
where “min” and “max” are the minimum and maximum values of feature x, respectively, in a set of training data.
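The normalization formula itself was an image in the source and did not survive extraction. Assuming standard min-max scaling consistent with the quoted definitions of “min” and “max”, a sketch of the feature normalization could be:

```python
import numpy as np

def min_max_normalize(x, x_min, x_max):
    """Scale feature x into [0, 1] using the training-set minimum and
    maximum. This is an assumed form; the actual operator in the claim
    is an image in the source document."""
    return (x - x_min) / (x_max - x_min)

# Per-feature statistics are taken over the training data only,
# as the claim specifies ("in a set of training data").
train = np.array([[2.0, 10.0],
                  [4.0, 30.0],
                  [6.0, 50.0]])
mins, maxs = train.min(axis=0), train.max(axis=0)
normalized = min_max_normalize(train, mins, maxs)
```

At recognition time the same training-set min and max would be reused, so unseen feature values can fall slightly outside [0, 1].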
6. The automatic speech recognition system according to claim 4 wherein said vector generator comprises calculation circuits to produce an 8-dimensional feature vector.
7. The automatic speech recognition system according to claim 6 wherein said 8 dimensions comprise A-stabil, LogAWE-end, NscoreQ, N-active-leafs, NScore, Score-per-frame, Duration.
8. The automatic speech recognition system according to claim 4 wherein said classifier comprises a multi-layer perceptron (MLP).
9. A method for training a neural network classifier to produce an output indicative of a confidence level for a decoded word, the method comprising:

providing a first training pattern to the neural network classifier having an output layer L of one or more nodes, the output layer L to provide an output indicating a level of training of the neural network classifier;

forward propagating the training pattern through the neural network classifier, the propagating based in part on a set of weights;

determining an error signal based on the propagated training pattern, the error signal based on an error function having an operator of the form where y_j^L is an actual output at node j of output layer L of the neural network classifier, t_j is the value of a target output at the node j, and n is greater than or equal to two; and

updating the set of weights based on the determined error signal, the updated set of weights to be used in producing the output indicative of the confidence level for the decoded word.

Dependent claims: 10.
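The training steps recited in claim 9 (forward propagate, determine an error signal, update the weights) can be sketched as a single back-propagation step on a tiny two-layer network. The layer sizes, learning rate, and sigmoid activation are illustrative assumptions, as is the (t − y)^n error form, since the claimed operator is an image in the source:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Illustrative shapes: 8-dim training pattern -> 4 hidden -> 1 output node.
W1 = rng.normal(scale=0.1, size=(4, 8))
W2 = rng.normal(scale=0.1, size=(1, 4))

def train_step(x, t, n=2, lr=0.1):
    """One claim-9 iteration under the assumed error E = sum((t - y)^n)/n."""
    global W1, W2
    h = sigmoid(W1 @ x)                  # forward propagate: hidden layer
    y = sigmoid(W2 @ h)                  # actual output y_j^L of layer L
    # Error signal: -dE/dy times the sigmoid derivative y(1 - y).
    delta_out = (t - y) ** (n - 1) * y * (1 - y)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)
    W2 += lr * np.outer(delta_out, h)    # update the set of weights
    W1 += lr * np.outer(delta_hid, x)
    return float(np.sum((t - y) ** n) / n)  # error before the update
```

With n = 2 this is ordinary gradient descent on half the squared error; the patent's point is that the chosen error signal avoids the false-saturation behavior of the standard formulation.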
11. A computer-readable medium storing instructions thereon which, when executed by one or more processors, cause the one or more processors to perform the method of:

implementing a multiple layer perceptron having an output layer L of one or more nodes, the output layer L to provide an output indicating a level of training of the multiple layer perceptron;

performing, via the implemented multiple layer perceptron, a series of linear transformations of a set of training data, the linear transformations based on a set of weights;

determining an error signal for an output of the linear transformations, the error signal based on an error function having an operator of the form where y_j^L is an actual output at node j of output layer L, t_j is a target output at a node j, and n is greater than or equal to two;

updating the set of weights based at least in part on the error signal;

receiving a multiple-dimension feature vector based on decoded acoustic data; and

determining, via the multiple layer perceptron, a confidence level of the feature vector, the determining based on the updated set of weights.

Dependent claims: 12.
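The final step of claim 11, determining a confidence level for a feature vector via the trained perceptron, is just a forward pass through the updated weights. A minimal sketch, with layer sizes chosen for illustration (the patent does not fix them):

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def confidence(feature_vec, W1, W2):
    """Forward pass of a two-layer perceptron; the scalar output of
    output layer L is read as the confidence level of the decoded
    word's feature vector."""
    h = sigmoid(W1 @ feature_vec)    # linear transformation + squashing
    return float(sigmoid(W2 @ h))    # output layer L, single node j

# Hypothetical 8-dimensional feature vector from the speech decoder,
# matching the 8-dimensional vector of claim 6.
x = np.zeros(8)
# Untrained all-zero weights; every sigmoid then outputs exactly 0.5,
# i.e. a maximally uncertain confidence before training.
W1 = np.zeros((4, 8))
W2 = np.zeros((1, 4))
```

In a deployed recognizer the confidence would be thresholded to accept or reject the decoded word; the threshold value is an application choice, not part of the claim.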