SHARP DISCREPANCY LEARNING

US 20160180214A1
Filed: 12/19/2014
Published: 06/23/2016
Est. Priority Date: 12/19/2014
Status: Abandoned Application

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes training a neural network using sharp discrepancy learning by providing training data to the neural network, calculating a gradient using a sharp discrepancy output layer objective function to classify the neural network parameters for correct and incorrect network model states, and training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases.
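The three steps in the abstract (provide training data, calculate a gradient from an output-layer objective, update the parameters) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the sharp discrepancy objective, so ordinary softmax cross-entropy stands in for it, and the network size, the `tanh` hidden layer, the learning rate, and the toy data in `DATA` are all hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def forward(W1, W2, x):
    """Return hidden-layer activations and output-layer probabilities."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    p = softmax([sum(w * hi for w, hi in zip(row, h)) for row in W2])
    return h, p

def gradient(W1, W2, x, y):
    """Backpropagated gradient of the cross-entropy loss for one example.

    Cross-entropy is a stand-in objective (assumption), not the patent's own.
    """
    h, p = forward(W1, W2, x)
    d_out = [pk - yk for pk, yk in zip(p, y)]  # dLoss/dlogits at the output layer
    g2 = [[d * hi for hi in h] for d in d_out]
    d_hid = [(1.0 - h[j] ** 2) * sum(d_out[k] * W2[k][j] for k in range(len(W2)))
             for j in range(len(h))]
    g1 = [[d * xi for xi in x] for d in d_hid]
    return g1, g2

def train(data, n_hidden=4, lr=0.2, epochs=300, seed=0):
    """Use the gradient to update the parameters, one example at a time."""
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, y in data:
            g1, g2 = gradient(W1, W2, x, y)
            W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, g1)]
            W2 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W2, g2)]
    return W1, W2

# Hypothetical toy data: (feature vector with a trailing bias term, label vector).
# Label [1, 0] means "keyword", [0, 1] means "not a keyword".
DATA = [
    ([1.0, 0.9, 1.0], [1.0, 0.0]),
    ([0.8, 1.0, 1.0], [1.0, 0.0]),
    ([0.1, 0.0, 1.0], [0.0, 1.0]),
    ([0.0, 0.2, 1.0], [0.0, 1.0]),
]

W1, W2 = train(DATA)
# Probability that a new feature vector resembles the keyword class.
keyword_prob = forward(W1, W2, [0.9, 0.9, 1.0])[1][0]
```

Here `keyword_prob` plays the role of the probability that received data has features similar to those of a keyword; a real system would compute it from acoustic feature vectors rather than from this toy input.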

20 Claims

1. A method comprising:
- providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters;
  
  calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters;
  
  training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17):
  - 2. The method of claim 1, comprising providing the trained neural network for use in a speech recognition system, wherein the speech recognition system uses sharp discrepancy learning on real data.
  - 3. The method of claim 1, wherein calculating the gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer comprises calculating the gradient of a cross-entropy function.
  - 4. The method of claim 1, wherein the sharp discrepancy output layer objective function comprises a class of sharp discrepancy objective functions with a fraction whose denominator is a product of shifted label scores over a set of labels that correspond to a set of states that are designated as incorrect states.
  - 5. The method of claim 4, wherein the label scores each comprise an exponential of a product of a label, parameter matrix and training data point.
  - 6. The method of claim 4, wherein the class of sharp discrepancy objective functions comprise functions with a fraction whose numerator is a non-negative label score associated with a state that is designated as a correct state.
  - 7. The method of claim 1, wherein calculating the gradient comprises calculating each component of the gradient separately.
  - 8. The method of claim 1, wherein calculating the gradient comprises calculating each component of the gradient in parallel.
  - 9. The method of claim 1, wherein the neural network comprises a deep neural network.
  - 10. The method of claim 1, wherein the neural network comprises a deep belief network.
  - 11. The method of claim 1, wherein the training data comprises a plurality of feature vectors and a plurality of label vectors that each indicate whether the corresponding feature vector corresponds to i) one of the keywords or key phrases, or ii) not.
  - 12. The method of claim 11, wherein each of the plurality of feature vectors represent a different portion of an audio waveform from a received digital representation of speech.
  - 13. The method of claim 12, wherein the digital representation of speech comprises recorded speech data.
  - 14. The method of claim 11, wherein each of the plurality of label vectors corresponds to one of the feature vectors, and specifies a probability distribution for whether the corresponding feature vector corresponds to i) one of the keywords or key phrases, or ii) not.
  - 15. The method of claim 14, wherein the probability distribution comprises a multinomial distribution.
  - 16. The method of claim 1, wherein training the neural network using the gradient comprises iterating the parameter updates until an end criteria is met.
  - 17. The method of claim 1, comprising calculating, using the hidden layers, an exponential of a product of a value of one of the parameters and a point from the training data.
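Claims 4 to 6 describe the objective only by its shape: a fraction whose numerator is the label score of the correct state and whose denominator is a product of shifted label scores over the incorrect states, with each score an exponential of a parameter row applied to a data point. A minimal sketch of one possible reading follows. The shift value `SHIFT`, the use of the logarithm of the fraction, the single linear output layer, the stopping rule, and the toy `DATA` are assumptions made for illustration and are not taken from the claims.

```python
import math

SHIFT = 1.0  # hypothetical shift added to each incorrect-label score (assumption)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def log_objective(W, x, correct):
    """log of score_correct / prod over incorrect j of (score_j + SHIFT),
    where score_j = exp(W[j] . x)."""
    total = dot(W[correct], x)
    for j in range(len(W)):
        if j != correct:
            total -= math.log(math.exp(dot(W[j], x)) + SHIFT)
    return total

def gradient(W, x, correct):
    """Gradient of log_objective, computed one label row at a time."""
    grad = []
    for j in range(len(W)):
        if j == correct:
            coeff = 1.0
        else:
            s = math.exp(dot(W[j], x))
            coeff = -s / (s + SHIFT)
        grad.append([coeff * xi for xi in x])
    return grad

def predict(W, x):
    scores = [dot(row, x) for row in W]
    return scores.index(max(scores))

def train(data, n_labels, lr=0.1, max_iters=500):
    """Gradient ascent; iterate until an end criterion is met.

    End criterion (assumption): every training example is classified
    correctly, or max_iters passes have been made.
    """
    W = [[0.0] * len(data[0][0]) for _ in range(n_labels)]
    for _ in range(max_iters):
        if all(predict(W, x) == c for x, c in data):
            break
        for x, c in data:
            g = gradient(W, x, c)
            W = [[w + lr * gi for w, gi in zip(wr, gr)] for wr, gr in zip(W, g)]
    return W

# Hypothetical toy data: (feature vector with a trailing bias term, correct label).
# Label 0 means "keyword", label 1 means "not a keyword".
DATA = [
    ([1.0, 0.9, 1.0], 0),
    ([0.8, 1.0, 1.0], 0),
    ([0.1, 0.0, 1.0], 1),
    ([0.0, 0.2, 1.0], 1),
]

W = train(DATA, n_labels=2)
```

Because each row of the gradient depends only on its own label score, the rows could be computed separately or in parallel, in the spirit of claims 7 and 8. Note that this particular objective is not bounded above on separable data, which is why the sketch stops on a classification check or an iteration cap rather than on convergence of the objective.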

18. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters;
  
  calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters;
  
  training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.
- Dependent claims (19):
  - 19. The system of claim 18, wherein the sharp discrepancy output layer objective function comprises a class of sharp discrepancy objective functions with a fraction whose denominator is a product of shifted label scores over a set of labels that correspond to a set of states that are designated as incorrect states.
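The fraction recited in claims 4 to 6 and 19 can be written out symbolically as below. This is one possible reading rather than the formula from the specification: the symbols $s_j$, $y_j$ (a one-hot label vector), $W$ (the output-layer parameter matrix), $c$ (the correct state), $I$ (the set of incorrect states), and the shift $\epsilon$ are notational assumptions, and the claims do not state the value or form of the shift.

```latex
s_j(x) = \exp\!\left(y_j^{\top} W x\right),
\qquad
F(W; x) = \frac{s_c(x)}{\prod_{j \in I} \bigl(s_j(x) + \epsilon\bigr)}

% Taking logarithms, with w_j = y_j^{\top} W denoting row j of W:
\log F(W; x) = w_c \cdot x \;-\; \sum_{j \in I} \log\bigl(s_j(x) + \epsilon\bigr)

% Gradient components, one per label row:
\frac{\partial \log F}{\partial w_c} = x,
\qquad
\frac{\partial \log F}{\partial w_j} = -\,\frac{s_j(x)}{s_j(x) + \epsilon}\, x
\quad (j \in I)
```

Under this reading, each incorrect state contributes its own gradient term, weighted by how large its score is relative to the shift.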

20. A computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters;
  
  calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters;
  
  training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Ulianov, Dmitrii Vladimirovich; Kanevsky, Dimitri; Moreno, Ignacio Lopez
Application Number: US14/577,301
Publication Number: US 20160180214A1
US Class Current: 1/1
CPC Class Codes:
- G06N 3/045: Combinations of networks
- G06N 3/08: Learning methods
- G10L 15/063: Training
- G10L 2015/088: Word spotting
