TRAINING MULTIPLE NEURAL NETWORKS WITH DIFFERENT ACCURACY

US 20150340032A1
Filed: 05/23/2014
Published: 11/26/2015
Est. Priority Date: 05/23/2014
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
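
The abstract's first step, turning a waveform into feature vectors that each cover a different period of time, can be illustrated with a minimal sketch. The frame length, hop size, and the two features used here (log energy and zero-crossing rate) are assumptions chosen for illustration; this listing does not state which features the patent's front end computes.

```python
import math


def frame_features(samples, frame_len=400, hop=160):
    """Split a waveform into overlapping frames, one feature vector per frame.

    frame_len and hop are hypothetical values (25 ms and 10 ms at 16 kHz).
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame; the small constant avoids log(0).
        energy = sum(x * x for x in frame) / frame_len
        log_energy = math.log(energy + 1e-10)
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zero_crossing_rate = crossings / (frame_len - 1)
        features.append([log_energy, zero_crossing_rate])
    return features


# One second of a 440 Hz tone sampled at 16 kHz stands in for recorded speech.
waveform = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
vectors = frame_features(waveform)
```

Each entry of `vectors` describes a different, partly overlapping slice of the waveform, which is the property the later steps rely on.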

20 Claims

1. A method comprising:
- receiving a digital representation of speech;
  
  generating a plurality of feature vectors that each model a different portion of an audio waveform from the digital representation of speech during a different period of time, the plurality of feature vectors including a first feature vector and subsequent feature vectors;
  
  generating a first posterior probability vector for the first feature vector using a first neural network, the first posterior probability vector comprising one score for each key word or key phrase which the first neural network is trained to identify;
  
  determining whether one of the scores in the first posterior probability vector satisfies a first threshold value using a first posterior handling module; and
  
  in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value and for each of the feature vectors:
  
  generating a second posterior probability vector for the respective feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases as the first neural network, and comprises more inner layer nodes than the first neural network, and the second posterior probability vector comprises one score for each key word or key phrase which the second neural network is trained to identify; and
  
  determining whether one of the scores in the second posterior probability vector satisfies a second threshold value using a second posterior handling module, the second threshold value being more restrictive than the first threshold value.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
- - 2. The method of claim 1, comprising:
    - storing the first feature vector in a memory; and
      
      providing the first feature vector from the memory to the second neural network in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value.
  - 3. The method of claim 1, comprising:
    - generating a third posterior probability vector for each of the subsequent feature vectors using the first neural network; and
      
      determining whether one of the scores in each of the third posterior probability vectors satisfies the first threshold value using the first posterior handling module until the first posterior handling module determines that none of the scores in a particular third posterior probability vector satisfies the first threshold value.
  - 4. The method of claim 3, wherein the second neural network generates the second posterior probability vector and the second posterior handling module determines whether one of the scores in the second posterior probability vector satisfies the second threshold value for each of the subsequent feature vectors until the first posterior handling module determines that none of the scores in the particular third posterior probability vector satisfies the first threshold value.
  - 5. The method of claim 1, wherein the second neural network receives each of the subsequent feature vectors from a front-end feature extraction module.
  - 6. The method of claim 1, comprising:
    - identifying a predetermined clock frequency for a processor to perform the generation of the first posterior probability vector for the first feature vector using the first neural network.
  - 7. The method of claim 6, wherein the processor is a digital signal processor.
  - 8. The method of claim 1, wherein the first neural network comprises a higher false positive rate than the second neural network.
  - 9. The method of claim 1, wherein the first posterior handling module and the second posterior handling module comprise the same posterior handling module.
  - 10. The method of claim 1, wherein the first threshold value and the second threshold value comprise decimal values between zero and one.
  - 11. The method of claim 1, wherein the second neural network is more accurate than the first neural network.
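
The two-stage flow in claims 1 through 4 can be sketched as a cascade in which a small network screens every feature vector and a larger network runs only on the vectors that pass. This is a simplified illustration, not the patented implementation: the networks are stub callables, "satisfies" is assumed to mean "greater than or equal to", and the threshold values 0.1 and 0.8 are hypothetical (claim 10 only requires values between zero and one, with the second more restrictive).

```python
from typing import Callable, List, Sequence, Tuple

# A "network" here is any callable mapping a feature vector to posterior scores,
# one score per key word or key phrase. Real trained models would replace the stubs.
Network = Callable[[Sequence[float]], List[float]]


def satisfies(posteriors: Sequence[float], threshold: float) -> bool:
    """Assumed reading of 'satisfies': at least one score reaches the threshold."""
    return any(score >= threshold for score in posteriors)


def cascade_detect(
    feature_vectors: Sequence[Sequence[float]],
    small_net: Network,
    large_net: Network,
    first_threshold: float = 0.1,    # hypothetical, permissive
    second_threshold: float = 0.8,   # hypothetical, more restrictive
) -> Tuple[List[int], int]:
    """Return (indices confirmed by the large network, number of large-network runs)."""
    detections = []
    large_runs = 0
    for index, vector in enumerate(feature_vectors):
        # Stage 1: the small network scores every feature vector.
        if not satisfies(small_net(vector), first_threshold):
            continue
        # Stage 2: the large network runs only while stage 1 keeps passing.
        large_runs += 1
        if satisfies(large_net(vector), second_threshold):
            detections.append(index)
    return detections, large_runs


# Toy stand-ins: each "network" just reads one element of the vector as its score.
def toy_small_net(vector):
    return [vector[0]]


def toy_large_net(vector):
    return [vector[1]]


frames = [[0.05, 0.9], [0.3, 0.5], [0.4, 0.95], [0.02, 0.99]]
result = cascade_detect(frames, toy_small_net, toy_large_net)
```

With these toy inputs the large network is skipped for the first and last frames, which is the point of the arrangement: the more expensive model runs only when the cheap one sees a plausible keyword.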

12. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  training a first neural network to identify a set of features using a first training set, the first neural network comprising a first quantity of nodes;
  
  training a second neural network to identify the set of features using a second training set, the second neural network comprising a second quantity of nodes, greater than the first quantity of nodes; and
  
  providing the first neural network and the second neural network to a user device that uses both the first neural network and the second neural network to analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features.
- Dependent claims (13, 14, 15, 16, 17, 18, 19):
- - 13. The system of claim 12, wherein:
    - the set of features comprises key words and key phrases; and
      
      the user device uses both the first neural network and the second neural network to analyze an audio waveform and determine whether a digital representation of one of the key words or the key phrases from the set of features is included in the audio waveform.
  - 14. The system of claim 13, the operations comprising:
    - providing a feature extraction module, a first posterior handling module, and a second posterior handling module to the user device with the first neural network and the second neural network, wherein the user device uses the feature extraction module, the first posterior handling module, and the second posterior handling module to perform the analysis of the data set.
  - 15. The system of claim 14, wherein the first posterior handling module and the second posterior handling module comprise the same posterior handling module.
  - 16. The system of claim 12, wherein the set of features comprises computer vision features, handwriting recognition features, text classification features, or authentication features.
  - 17. The system of claim 12, wherein the first training set and the second training set comprise the same training set.
  - 18. The system of claim 17, the operations comprising:
    - training the first neural network for a first quantity of iterations; and
      
      training the second neural network for a second quantity of iterations, greater than the first quantity of iterations.
  - 19. The system of claim 12, wherein a ratio between the first quantity of nodes and the second quantity of nodes identifies a performance cost savings of the user device when the user device analyzes a particular portion of the data set with the first neural network and not the second neural network.
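
Claim 19 ties the ratio of node counts to a cost saving on the user device. A rough way to see this is to compare two fully connected networks by node count and by multiply-accumulate operations per forward pass. The layer sizes below are hypothetical, and using multiply-accumulates as the cost measure is an assumption made for this sketch rather than something stated in the claims.

```python
def total_nodes(layer_sizes):
    """Number of nodes across all layers of a fully connected network."""
    return sum(layer_sizes)


def multiply_accumulates(layer_sizes):
    """Multiply-accumulate operations for one forward pass (biases ignored)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical architectures: input size, hidden layers, one output per key word.
small_layers = [40, 32, 32, 3]
large_layers = [40, 128, 128, 128, 3]

node_ratio = total_nodes(small_layers) / total_nodes(large_layers)
cost_ratio = multiply_accumulates(small_layers) / multiply_accumulates(large_layers)
```

Under these assumed sizes the small network has about a quarter of the nodes but needs well under a tenth of the arithmetic, so every portion of the data set handled by the small network alone avoids most of the large network's cost.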

20. A computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving a digital representation of speech;
  
  generating a plurality of feature vectors that each model a different portion of an audio waveform from the digital representation of speech during a different period of time, the plurality of feature vectors including a first feature vector and subsequent feature vectors;
  
  generating a first posterior probability vector for the first feature vector using a first neural network, the first posterior probability vector comprising one score for each key word or key phrase which the first neural network is trained to identify;
  
  determining whether one of the scores in the first posterior probability vector satisfies a first threshold value using a first posterior handling module; and
  
  in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value and for each of the feature vectors:
  
  generating a second posterior probability vector for the respective feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases as the first neural network, and comprises more inner layer nodes than the first neural network, and the second posterior probability vector comprises one score for each key word or key phrase which the second neural network is trained to identify; and
  
  determining whether one of the scores in the second posterior probability vector satisfies a second threshold value using a second posterior handling module, the second threshold value being more restrictive than the first threshold value.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Gruenstein, Alexander H.

Granted Patent: US 9,484,022 B2
US Class Current: 1/1
CPC Class Codes

G06F 1/324   by lowering clock frequency

G06F 1/3293   by switching to a less powe...

G06F 16/367   Ontology

G06N 3/045   Combinations of networks

G06N 3/047   Probabilistic or stochastic...

G06N 3/08   Learning methods

G10L 15/16   using artificial neural net...

Y02D 10/00   Energy efficient computing,...
