GRAPHEME-TO-PHONEME CONVERSION USING ACOUSTIC DATA
Abstract
Described is the use of acoustic data to improve grapheme-to-phoneme conversion for speech recognition, such as to more accurately recognize spoken names in a voice-dialing system. A joint model of acoustics and graphonemes (acoustic data, phoneme sequences, grapheme sequences and an alignment between phoneme sequences and grapheme sequences) is described, as is retraining by maximum likelihood training and discriminative training in adapting graphoneme model parameters using acoustic data. Also described is the unsupervised collection of grapheme labels for received acoustic data, thereby automatically obtaining a substantial number of actual samples that may be used in retraining. Speech input that does not meet a confidence threshold may be filtered out so as to not be used by the retrained model.
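The confidence filtering mentioned at the end of the abstract can be illustrated with a minimal sketch. The `Utterance` record, its field names, the `filter_for_retraining` helper and the 0.8 threshold are all hypothetical and chosen only for illustration; the abstract does not specify how confidence is scored or what threshold is used.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    audio_id: str      # hypothetical handle to the stored acoustic data
    graphemes: str     # grapheme label collected from the recognition system
    confidence: float  # assumed recognizer confidence score in [0, 1]


def filter_for_retraining(utterances, threshold=0.8):
    """Keep only utterances whose confidence meets the (assumed) threshold."""
    return [u for u in utterances if u.confidence >= threshold]


samples = [
    Utterance("utt-001", "john smith", 0.93),
    Utterance("utt-002", "jon smyth", 0.41),
    Utterance("utt-003", "anna lee", 0.80),
]

# Only the utterances at or above the threshold remain available for retraining.
kept = filter_for_retraining(samples, threshold=0.8)
```

In this sketch an utterance exactly at the threshold is treated as meeting it; a real system might draw that line differently.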
20 Claims
1. In a computing environment, a method comprising:
- modeling acoustic data, a phoneme sequence, a grapheme sequence and an alignment between the phoneme sequence and the grapheme sequence to provide a graphoneme model; and
- retraining a grapheme to phoneme model usable in speech recognition by optimizing the graphoneme model using acoustic data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. In a computing environment, a system comprising:
- a grapheme to phoneme model;
- a recognizer coupled to the grapheme to phoneme model to recognize input speech as a corresponding grapheme sequence; and
- a retraining mechanism coupled to the recognizer that retrains the grapheme to phoneme model into a retrained grapheme to phoneme model based upon acoustic data and associated graphemes collected by a recognition system.
Dependent claims: 12, 13, 14, 15, 16
17. A computer-readable medium having computer-executable instructions, which when executed perform steps, comprising:
- receiving acoustic data from a speaker;
- recognizing the acoustic data as a result and associated potential grapheme sequence;
- confirming with the speaker whether the result correctly applies to the acoustic data, and if so, associating the acoustic data with an actual grapheme sequence corresponding to the potential grapheme sequence, and if not, further interacting with the speaker until a result is confirmed as correctly applying to the acoustic data and associating the corresponding grapheme sequence as the actual grapheme sequence; and
- using the acoustic data and associated actual grapheme sequence for subsequent speech recognition.
Dependent claims: 18, 19, 20
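The confirm-and-collect steps recited in claim 17 can be sketched as a small loop. This is an illustration under stated assumptions, not the claimed implementation: `collect_labeled_sample`, `recognize` and `confirm` are hypothetical names, and the sketch assumes that "further interacting with the speaker" means offering the next recognition candidate in ranked order, which the claim itself does not specify.

```python
def collect_labeled_sample(acoustic_data, recognize, confirm):
    """Pair acoustic data with a speaker-confirmed grapheme sequence.

    recognize(acoustic_data) is assumed to return candidate
    (result, grapheme_sequence) pairs, best first.
    confirm(result) is assumed to ask the speaker and return True or False.
    """
    for result, graphemes in recognize(acoustic_data):
        if confirm(result):
            # The confirmed candidate becomes the actual grapheme sequence.
            return acoustic_data, graphemes
    # No candidate was confirmed, so nothing is collected for this input.
    return None


# Stubs standing in for a real recognizer and a real speaker dialog.
def fake_recognize(audio):
    return [("Call Jon Smyth", "jon smyth"), ("Call John Smith", "john smith")]


answers = iter([False, True])  # speaker rejects the first result, accepts the second


def fake_confirm(result):
    return next(answers)


sample = collect_labeled_sample(b"\x00\x01", fake_recognize, fake_confirm)
```

The returned pair of acoustic data and confirmed grapheme sequence is the kind of sample that could then be used for subsequent recognition or retraining.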
Specification