Method and apparatus for training a voice recognition model database
Abstract
An electronic device digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input/noise sample combinations are used to train a voice recognition model database without the user having to repeat the voice input in each of the different environments. In one variation, the electronic device transmits the user's voice input to a server that maintains and trains the voice recognition model database.
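The combining step described in the abstract can be illustrated with a minimal sketch. It assumes that audio is held as lists of float samples and that each noise sample is scaled to a chosen signal-to-noise ratio before being added. The abstract does not specify any of this: the SNR scaling, the looping of short noise clips, the synthetic signals, and the names `rms`, `mix_at_snr`, and `noise_bank` are all illustrative assumptions rather than the patented method.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech at a target SNR in dB (an assumed mixing rule)."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise clip so that it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    noise_rms = rms(tiled)
    if noise_rms == 0.0:
        return list(speech)  # silent noise clip: nothing to add
    # Gain that makes rms(speech) / rms(gain * noise) equal 10^(snr_db / 20).
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, tiled)]


# One recorded utterance (a synthetic tone stands in for real speech).
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]

# Stored noise samples, one per audio environment (synthetic placeholders).
rng = random.Random(0)
noise_bank = {
    "street": [rng.gauss(0.0, 1.0) for _ in range(1000)],
    "babble": [rng.gauss(0.0, 0.5) for _ in range(1000)],
    "car": [rng.gauss(0.0, 0.2) for _ in range(1000)],
}

# The single utterance is reused for every environment, so the user
# never has to repeat it.
mixed = {env: mix_at_snr(speech, noise, 10.0) for env, noise in noise_bank.items()}
```

Each entry of `mixed` is one voice input/noise sample combination that could then be passed to whatever training procedure the recognizer uses.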
20 Claims
1. A computer-implemented method comprising:
  receiving speech data corresponding to an utterance spoken in a particular noise environment;
  for each of a plurality of noise environments that are different than the particular noise environment:
    combining the speech data with stored noise data that is associated with the noise environment of the plurality of noise environments, to generate noise-specific, training audio data, and
    training a noise-specific, speech recognition model based at least on the noise-specific, training audio data; and
  providing the respective, noise-specific, speech recognition models associated with each of the plurality of noise environments, for output.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
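The control flow of claim 1 (receive once, loop over the other noise environments, combine, train, output one model per environment) can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed implementation. The "model" is only a per-frame energy profile standing in for a real speech recognition model. The names `NoiseSpecificModel`, `frame_energies`, `combine`, and `train_models` are hypothetical, and skipping the capture environment is one reading of "different than the particular noise environment".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NoiseSpecificModel:
    environment: str
    template: List[float]  # toy stand-in for trained model parameters


def frame_energies(audio: List[float], frame_len: int = 160) -> List[float]:
    """Mean energy of each full, non-overlapping frame."""
    return [
        sum(x * x for x in audio[i:i + frame_len]) / frame_len
        for i in range(0, len(audio) - frame_len + 1, frame_len)
    ]


def combine(speech: List[float], noise: List[float]) -> List[float]:
    """Sample-wise sum of speech and looped noise (assumed combining rule)."""
    return [s + noise[i % len(noise)] for i, s in enumerate(speech)]


def train_models(
    speech: List[float],
    captured_env: str,
    noise_bank: Dict[str, List[float]],
) -> Dict[str, NoiseSpecificModel]:
    """Build one toy model per stored environment other than captured_env."""
    models: Dict[str, NoiseSpecificModel] = {}
    for env, noise in noise_bank.items():
        if env == captured_env:
            continue  # only environments different from the capture environment
        training_audio = combine(speech, noise)  # noise-specific training audio
        models[env] = NoiseSpecificModel(env, frame_energies(training_audio))
    return models  # the respective models, provided for output


# Usage: one utterance captured in a "quiet" room, two other stored environments.
utterance = [0.1] * 800
stored_noise = {
    "quiet": [0.0] * 100,
    "street": [0.5, -0.5] * 50,
    "car": [0.2] * 100,
}
models = train_models(utterance, "quiet", stored_noise)
```

Keying the result by environment name mirrors the claim's "respective" models: a recognizer could later select the model whose environment best matches the current conditions.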
9. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  receiving speech data corresponding to an utterance spoken in a particular noise environment;
  for each of a plurality of noise environments that are different than the particular noise environment:
    combining the speech data with stored noise data that is associated with the noise environment of the plurality of noise environments, to generate noise-specific, training audio data, and
    training a noise-specific, speech recognition model based at least on the noise-specific, training audio data; and
  providing the respective, noise-specific, speech recognition models associated with each of the plurality of noise environments, for output.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
  receiving speech data corresponding to an utterance spoken in a particular noise environment;
  for each of a plurality of noise environments that are different than the particular noise environment:
    combining the speech data with stored noise data that is associated with the noise environment of the plurality of noise environments, to generate noise-specific, training audio data, and
    training a noise-specific, speech recognition model based at least on the noise-specific, training audio data; and
  providing the respective, noise-specific, speech recognition models associated with each of the plurality of noise environments, for output.
Dependent claims: 18, 19, 20.
Specification