TRAINING MULTIPLE NEURAL NETWORKS WITH DIFFERENT ACCURACY
First Claim
1. A method comprising:
receiving a digital representation of speech;
generating a plurality of feature vectors that each model a different portion of an audio waveform from the digital representation of speech during a different period of time, the plurality of feature vectors including a first feature vector and subsequent feature vectors;
generating a first posterior probability vector for the first feature vector using a first neural network, the first posterior probability vector comprising one score for each key word or key phrase which the first neural network is trained to identify;
determining whether one of the scores in the first posterior probability vector satisfies a first threshold value using a first posterior handling module; and
in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value and for each of the feature vectors:
generating a second posterior probability vector for the respective feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases as the first neural network, and comprises more inner layer nodes than the first neural network, and the second posterior probability vector comprises one score for each key word or key phrase which the second neural network is trained to identify; and
determining whether one of the scores in the second posterior probability vector satisfies a second threshold value using a second posterior handling module, the second threshold value being more restrictive than the first threshold value.
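The two-stage check recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `cascade_detect` and `satisfies` are hypothetical, the networks are passed in as plain callables that return one score per key word or key phrase, and "more restrictive" is assumed here to mean a numerically higher threshold.

```python
from typing import Callable, List, Optional, Sequence

Posterior = List[float]
# Assumption: a "network" is any callable mapping a feature vector to one
# score per key word or key phrase.
Network = Callable[[Sequence[float]], Posterior]


def satisfies(posterior: Sequence[float], threshold: float) -> bool:
    """Return True if at least one score meets the threshold."""
    return any(score >= threshold for score in posterior)


def cascade_detect(
    feature_vectors: Sequence[Sequence[float]],
    small_net: Network,
    large_net: Network,
    first_threshold: float,
    second_threshold: float,
) -> Optional[int]:
    """Return the index of the first feature vector that the larger network
    confirms, or None if either stage rejects the input."""
    if second_threshold <= first_threshold:
        # Assumption: a higher value is the more restrictive threshold.
        raise ValueError("second threshold must be more restrictive")
    if not feature_vectors:
        return None
    # Stage 1: the smaller network scores only the first feature vector.
    if not satisfies(small_net(feature_vectors[0]), first_threshold):
        return None
    # Stage 2: the larger network scores each feature vector in turn.
    for index, vector in enumerate(feature_vectors):
        if satisfies(large_net(vector), second_threshold):
            return index
    return None


# Stand-in networks for illustration only: each simply echoes one input value.
def demo_small_net(vector: Sequence[float]) -> Posterior:
    return [vector[0]]


def demo_large_net(vector: Sequence[float]) -> Posterior:
    return [vector[1]]
```

In this arrangement the larger network runs only after the smaller one has flagged a candidate, so the more expensive model is skipped for most input.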
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
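As a rough illustration of feature vectors that each model a different portion of a waveform over a different period of time, the sketch below slices a list of samples into overlapping frames. The abstract does not say which features are computed, so the two used here (log energy and zero-crossing rate) are assumptions chosen for simplicity, and `frame_features`, `frame_len`, and `hop` are hypothetical names.

```python
import math
from typing import List, Sequence


def frame_features(
    samples: Sequence[float], frame_len: int, hop: int
) -> List[List[float]]:
    """Split samples into overlapping frames and return one feature vector
    per frame: [log energy, zero-crossing rate]."""
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    features: List[List[float]] = []
    # Each frame covers a different time span of the waveform.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        features.append([log_energy, crossings / (frame_len - 1)])
    return features
```

A production front end would more likely use filterbank or cepstral features; the framing loop is the part that corresponds to the text.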
104 Citations
20 Claims
1. A method comprising:
receiving a digital representation of speech;
generating a plurality of feature vectors that each model a different portion of an audio waveform from the digital representation of speech during a different period of time, the plurality of feature vectors including a first feature vector and subsequent feature vectors;
generating a first posterior probability vector for the first feature vector using a first neural network, the first posterior probability vector comprising one score for each key word or key phrase which the first neural network is trained to identify;
determining whether one of the scores in the first posterior probability vector satisfies a first threshold value using a first posterior handling module; and
in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value and for each of the feature vectors:
generating a second posterior probability vector for the respective feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases as the first neural network, and comprises more inner layer nodes than the first neural network, and the second posterior probability vector comprises one score for each key word or key phrase which the second neural network is trained to identify; and
determining whether one of the scores in the second posterior probability vector satisfies a second threshold value using a second posterior handling module, the second threshold value being more restrictive than the first threshold value.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
training a first neural network to identify a set of features using a first training set, the first neural network comprising a first quantity of nodes;
training a second neural network to identify the set of features using a second training set, the second neural network comprising a second quantity of nodes, greater than the first quantity of nodes; and
providing the first neural network, and the second neural network to a user device that uses both the first neural network and the second neural network to analyze a data set and determine whether the data set comprises a digital representation of a feature from the set of features.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
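The size relationship between the two networks in this claim can be sketched as follows. The code only builds untrained networks and checks their node counts before bundling them for a device; it does not perform the training the claim recites. `Mlp`, `build_mlp`, and `bundle_for_device` are hypothetical names, and the layer sizes are arbitrary illustrative values.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Mlp:
    # weights[k][j] holds the incoming weights of node j in layer k.
    weights: List[List[List[float]]]

    def hidden_nodes(self) -> int:
        """Count nodes in every layer except the output layer."""
        return sum(len(layer) for layer in self.weights[:-1])


def build_mlp(
    n_inputs: int, hidden_sizes: Sequence[int], n_outputs: int, seed: int
) -> Mlp:
    """Create an untrained network with small random weights."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    weights = [
        [[rng.uniform(-0.1, 0.1) for _ in range(fan_in)] for _ in range(fan_out)]
        for fan_in, fan_out in zip(sizes, sizes[1:])
    ]
    return Mlp(weights)


def bundle_for_device(first: Mlp, second: Mlp) -> Dict[str, Mlp]:
    """Package both networks, checking that the second has more nodes."""
    if second.hidden_nodes() <= first.hidden_nodes():
        raise ValueError("second network must have more nodes than the first")
    return {"first": first, "second": second}


# Illustrative sizes only; both networks share the same 3 output classes.
small = build_mlp(n_inputs=40, hidden_sizes=[32], n_outputs=3, seed=0)
large = build_mlp(n_inputs=40, hidden_sizes=[128, 128], n_outputs=3, seed=1)
bundle = bundle_for_device(small, large)
```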
20. A computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving a digital representation of speech;
generating a plurality of feature vectors that each model a different portion of an audio waveform from the digital representation of speech during a different period of time, the plurality of feature vectors including a first feature vector and subsequent feature vectors;
generating a first posterior probability vector for the first feature vector using a first neural network, the first posterior probability vector comprising one score for each key word or key phrase which the first neural network is trained to identify;
determining whether one of the scores in the first posterior probability vector satisfies a first threshold value using a first posterior handling module; and
in response to determining that one of the scores in the first posterior probability vector satisfies the first threshold value and for each of the feature vectors:
generating a second posterior probability vector for the respective feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases as the first neural network, and comprises more inner layer nodes than the first neural network, and the second posterior probability vector comprises one score for each key word or key phrase which the second neural network is trained to identify; and
determining whether one of the scores in the second posterior probability vector satisfies a second threshold value using a second posterior handling module, the second threshold value being more restrictive than the first threshold value.
Specification