Automatic Grammar Augmentation For Robust Voice Command Recognition
First Claim
1. A method of improving voice command recognition, comprising:
applying an acoustic model to a general speech dataset to generate a statistical pronunciation dictionary;
generating an augmented grammar candidate set based on the statistical pronunciation dictionary and an original grammar set, wherein:
the original grammar set comprises voice commands to be recognized; and
each element of the augmented grammar candidate set comprises a variation of one of the voice commands to be recognized; and
generating an augmented grammar set by adding one or more elements of the augmented grammar candidate set to the original grammar set.
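The three steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the dictionary is built at the word level from hypothetical (reference word, decoded word) pairs, the function names and the `min_prob` threshold are invented, and the final selection simply takes the first candidate.

```python
from collections import Counter, defaultdict
from itertools import product


def build_pronunciation_dictionary(aligned_pairs):
    """Estimate P(decoded | reference) from (reference, decoded) word pairs.

    Assumption: the pairs come from aligning acoustic model output on a
    general speech dataset against its transcripts.
    """
    counts = defaultdict(Counter)
    for reference, decoded in aligned_pairs:
        counts[reference][decoded] += 1
    return {
        reference: {decoded: n / sum(c.values()) for decoded, n in c.items()}
        for reference, c in counts.items()
    }


def candidate_set(command, pron_dict, min_prob=0.15):
    """Return variations of `command` built from likely decoded variants."""
    per_word = []
    for word in command.split():
        variants = [v for v, p in pron_dict.get(word, {}).items()
                    if p >= min_prob and v != word]
        per_word.append([word] + variants)
    expressions = (" ".join(words) for words in product(*per_word))
    # The unmodified command is already in the original grammar set.
    return [e for e in expressions if e != command]


def augment(original_grammar, selected):
    """Add the selected candidates to the original grammar set."""
    return list(original_grammar) + [e for e in selected if e not in original_grammar]


# Hypothetical alignment data, invented for this example.
pairs = ([("play", "play")] * 6 + [("play", "pray")] * 3 + [("play", "clay")]
         + [("music", "music")] * 8 + [("music", "musik")] * 2)
pron_dict = build_pronunciation_dictionary(pairs)
candidates = candidate_set("play music", pron_dict)
augmented = augment(["play music"], candidates[:1])  # arbitrary selection
```

In this toy data "clay" falls below the threshold, so only "pray" and "musik" produce variations. How candidates are actually chosen for addition is left open by claim 1; the slice above is only a placeholder.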
Abstract
Various embodiments include methods and devices for implementing automatic grammar augmentation for improving voice command recognition accuracy in systems with a small footprint acoustic model. Alternative expressions that may capture acoustic model decoding variations may be added to a grammar set. An acoustic model-specific statistical pronunciation dictionary may be derived by running the acoustic model through a large general speech dataset and constructing a command-specific candidate set containing potential grammar expressions. Greedy based and cross-entropy-method (CEM) based algorithms may be utilized to search the candidate set for augmentations with improved recognition accuracy.
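A greedy search over the candidate set, as mentioned in the abstract, might look like the following Python sketch. The scoring function, the data layout (a grammar as a mapping from expression to command, and validation items as decoded text paired with the intended command or `None` for speech that should be rejected), and all names are assumptions for illustration, not the patented procedure.

```python
def accuracy(grammar, validation):
    """Fraction of validation items the grammar handles correctly.

    grammar: dict mapping expression -> command.
    validation: list of (decoded_text, true_command); true_command is None
    for out-of-grammar speech, which a correct grammar must not match.
    """
    correct = sum(1 for decoded, truth in validation
                  if grammar.get(decoded) == truth)
    return correct / len(validation)


def greedy_augment(original, candidates, validation):
    """Repeatedly add the single candidate that most improves accuracy."""
    grammar = dict(original)
    remaining = dict(candidates)
    best = accuracy(grammar, validation)
    while remaining:
        scores = {expr: accuracy({**grammar, expr: cmd}, validation)
                  for expr, cmd in remaining.items()}
        expr = max(scores, key=scores.get)
        if scores[expr] <= best:
            break  # no remaining candidate improves accuracy
        grammar[expr] = remaining.pop(expr)
        best = scores[expr]
    return grammar, best


# Hypothetical data, invented for this example.
original = {"play music": "PLAY"}
candidates = {"pray music": "PLAY", "play musik": "PLAY", "pay music": "PLAY"}
validation = [
    ("play music", "PLAY"),
    ("pray music", "PLAY"),
    ("pray music", "PLAY"),
    ("play musik", "PLAY"),
    ("pay music", None),  # unrelated speech that should be rejected
]
grammar, score = greedy_augment(original, candidates, validation)
```

With this data the search adds "pray music" and then "play musik", and stops before "pay music" because accepting it would cause a false acceptance of unrelated speech. A CEM-based search would instead sample whole subsets of the candidate set and refine the sampling distribution; it is not shown here.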
22 Claims
1. A method of improving voice command recognition, comprising:
applying an acoustic model to a general speech dataset to generate a statistical pronunciation dictionary;
generating an augmented grammar candidate set based on the statistical pronunciation dictionary and an original grammar set, wherein:
the original grammar set comprises voice commands to be recognized; and
each element of the augmented grammar candidate set comprises a variation of one of the voice commands to be recognized; and
generating an augmented grammar set by adding one or more elements of the augmented grammar candidate set to the original grammar set.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A computing device, comprising:
a memory; and
a processor coupled to the memory and configured with processor executable instructions to perform operations comprising:
applying an acoustic model to a general speech dataset to generate a statistical pronunciation dictionary;
generating an augmented grammar candidate set based on the statistical pronunciation dictionary and an original grammar set, wherein:
the original grammar set comprises voice commands to be recognized; and
each element of the augmented grammar candidate set comprises a variation of one of the voice commands to be recognized; and
generating an augmented grammar set by adding one or more elements of the augmented grammar candidate set to the original grammar set.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. A computing device, comprising:
a memory;
means for applying an acoustic model to a general speech dataset to generate a statistical pronunciation dictionary;
means for generating an augmented grammar candidate set based on the statistical pronunciation dictionary and an original grammar set, wherein:
the original grammar set comprises voice commands to be recognized; and
each element of the augmented grammar candidate set comprises a variation of one of the voice commands to be recognized; and
means for generating an augmented grammar set by adding one or more elements of the augmented grammar candidate set to the original grammar set.
16. A non-transitory processor-readable medium having stored thereon processor-executable instructions configured to cause a processor to perform operations comprising:
applying an acoustic model to a general speech dataset to generate a statistical pronunciation dictionary;
generating an augmented grammar candidate set based on the statistical pronunciation dictionary and an original grammar set, wherein:
the original grammar set comprises voice commands to be recognized; and
each element of the augmented grammar candidate set comprises a variation of one of the voice commands to be recognized; and
generating an augmented grammar set by adding one or more elements of the augmented grammar candidate set to the original grammar set.
Dependent claims: 17, 18, 19, 20, 21, 22.
Specification