SYSTEM AND METHOD OF WORD LATTICE AUGMENTATION USING A PRE/POST VOCALIC CONSONANT DISTINCTION
First Claim
1. The method for recognizing speech, the method comprising:
- receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant;
generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result;
distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech;
calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score;
determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score; and
refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.
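As an illustration of the distinguishing step, the sketch below labels each consonant in a syllable as pre-vocalic or post-vocalic according to its position relative to the vowel nucleus. The claim does not say how the distinction is made; the ARPAbet-style phone symbols, the VOWELS set, the assumption that the input is already split into syllables, and the name label_consonants are all hypothetical choices made for this example.

```python
# Hypothetical ARPAbet-style vowel inventory (an assumption, not from the claims).
VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def label_consonants(syllable):
    """Return (phone, position) pairs for the consonants of one syllable.

    position is "pre" for consonants before the first vowel (the nucleus),
    "post" for consonants after it, and "unknown" if the syllable has no vowel.
    """
    # Index of the first vowel in the syllable, or None if there is none.
    nucleus = next((i for i, p in enumerate(syllable) if p in VOWELS), None)
    if nucleus is None:
        return [(p, "unknown") for p in syllable]

    labels = []
    for i, phone in enumerate(syllable):
        if phone in VOWELS:
            continue  # only consonants are labeled
        labels.append((phone, "pre" if i < nucleus else "post"))
    return labels


# Example: the single-syllable word "cat" as K AE T.
cat_labels = label_consonants(["K", "AE", "T"])
```

Here the same consonant symbol can receive either label depending on where it falls in the syllable, which is the property the claim relies on when it treats pre-vocalic and post-vocalic consonants separately.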
Abstract
Disclosed are systems and methods for recognizing speech in a spoken dialogue system. The method includes (1) receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant, (2) generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result, (3) distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech, (4) calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score, (5) determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score, and (6) refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.
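Steps (4) through (6) can be sketched roughly as follows, under stated assumptions. The abstract does not specify how the second score is computed, how it maps to a match or mismatch category, or how the category changes the recognizer output. In this sketch the second scores are taken as given similarity values in [0, 1], a fixed threshold decides the category, and a match adds a bonus to the first (lattice) score while a mismatch subtracts a penalty. The Hypothesis class, the threshold, and the bonus and penalty values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: str
    first_score: float      # score from the output lattice (assumed log-domain)
    pre_similarity: float   # assumed second score for a pre-vocalic consonant, in [0, 1]
    post_similarity: float  # assumed second score for a post-vocalic consonant, in [0, 1]


def categorize(similarity, threshold=0.5):
    """Map a second score to a category; the threshold is an assumption."""
    return "match" if similarity >= threshold else "mismatch"


def refine(hypotheses, bonus=1.0, penalty=1.0):
    """Re-rank hypotheses by first score adjusted by the match/mismatch categories."""
    def adjusted(h):
        score = h.first_score
        for similarity in (h.pre_similarity, h.post_similarity):
            if categorize(similarity) == "match":
                score += bonus
            else:
                score -= penalty
        return score

    # Highest adjusted score first.
    return sorted(hypotheses, key=adjusted, reverse=True)


hypotheses = [
    Hypothesis("cat", first_score=-10.0, pre_similarity=0.2, post_similarity=0.9),
    Hypothesis("bat", first_score=-10.5, pre_similarity=0.9, post_similarity=0.8),
]
best = refine(hypotheses)[0]
```

With these made-up numbers, "cat" has the better lattice score but a pre-vocalic mismatch, so its adjusted score stays at -10.0, while "bat" matches in both positions and rises to -8.5, moving it to the top of the list.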
21 Claims
1. The method for recognizing speech, the method comprising:
receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant;
generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result;
distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech;
calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score;
determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score; and
refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.
8. A system for recognizing speech, the system comprising:
a module configured to receive an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant;
a module configured to generate at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result;
a module configured to distinguish between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech;
a module configured to calculate a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score;
a module configured to determine at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score; and
a module configured to refine the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.
15. A computer-readable medium storing instructions for controlling a computing device to process speech, the instructions comprising:
receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant;
generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result;
distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech;
calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score;
determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score; and
refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.
Specification