Using Utterance Classification in Telephony and Speech Recognition Applications
First Claim
1. In a computing environment, a method performed on at least one processor comprising:
inputting text into a classifier that was trained with speech-recognized acoustic data having associated semantic labels;
classifying the text into one or more of the semantic labels; and
outputting the one or more semantic labels from the classifier.
Abstract
Described is the use of utterance classification based methods and other machine learning techniques to provide a telephony application or other voice menu application (e.g., an automotive application) that need not use Context-Free-Grammars to determine a user's spoken intent. A classifier receives text from an information retrieval-based speech recognizer and outputs a semantic label corresponding to the likely intent of a user's speech. The semantic label is then output, such as for use by a voice menu program in branching between menus. Also described is training, including training the language model from acoustic data without transcriptions, and training the classifier from speech-recognized acoustic data having associated semantic labels.
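The classification step described in the abstract can be illustrated with a minimal sketch. It assumes a bag-of-words naive Bayes model, which the abstract does not specify; the class name `UtteranceClassifier`, the training utterances, and the labels `CHECK_BALANCE`, `PAY_BILL`, and `AGENT` are all hypothetical. The training pairs stand in for speech-recognized text with associated semantic labels.

```python
import math
from collections import Counter, defaultdict


class UtteranceClassifier:
    """Toy multinomial naive Bayes over bag-of-words (an assumed model choice)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of training utterances
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (recognized_text, semantic_label) pairs."""
        for text, label in examples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.label_counts[label] += 1
            self.vocab.update(words)

    def classify(self, text):
        """Return the semantic label with the highest log-probability."""
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.label_counts.items():
            score = math.log(count / total)  # log prior
            # Add-one smoothing so unseen words (e.g. recognizer errors) do not zero out a label.
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical recognizer output paired with semantic labels.
TRAINING = [
    ("i want to check my account balance", "CHECK_BALANCE"),
    ("how much money do i have", "CHECK_BALANCE"),
    ("what is my balance", "CHECK_BALANCE"),
    ("i need to pay my bill", "PAY_BILL"),
    ("make a payment on my bill", "PAY_BILL"),
    ("talk to a person please", "AGENT"),
    ("let me speak to an agent", "AGENT"),
]

classifier = UtteranceClassifier()
classifier.train(TRAINING)
label = classifier.classify("um what's my balance")
print(label)
```

Because the classifier scores whole utterances rather than parsing them, free-form input containing filler words or unseen tokens can still map to a label, which is the property that lets such an application avoid a hand-written grammar.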
20 Claims
- 1. In a computing environment, a method performed on at least one processor comprising: inputting text into a classifier that was trained with speech-recognized acoustic data having associated semantic labels; classifying the text into one or more of the semantic labels; and outputting the one or more semantic labels from the classifier. (Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
- 10. In a computing environment, a system comprising, a voice-menu program, the voice menu program coupled to a classifier trained at least in part via machine learning using data associated with semantic labels of a predetermined set of semantic labels, the classifier configured to input text received from a speech recognizer and search a classification model to match at least one semantic label to the text for providing to the voice menu program.
- 17. One or more computer-readable media having computer-executable instructions, which when executed perform steps, comprising, classifying text into a semantic label of a predetermined set of semantic labels, in which the text corresponds to recognized speech, selecting a menu of a voice menu program based upon the semantic label, and changing the voice menu program to the selected menu.
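The menu-selection step recited in claim 17 might be sketched as a small state machine that maps a semantic label to the next menu. The `VoiceMenu` class, the menu names, and the labels are hypothetical, and leaving the menu unchanged on an unmatched label is an assumption made for this sketch, not something the claims state.

```python
class VoiceMenu:
    """Tracks the active menu and switches menus based on a semantic label."""

    def __init__(self, menus, start):
        self.menus = menus    # menu name -> {semantic label -> next menu name}
        self.current = start  # name of the currently active menu

    def on_label(self, label):
        """Select a menu from the label and change to it; return the active menu."""
        target = self.menus[self.current].get(label)
        if target is not None:
            self.current = target
        # Assumption: an unmatched label leaves the current menu unchanged.
        return self.current


# Hypothetical menu graph keyed by semantic labels.
MENUS = {
    "main": {"CHECK_BALANCE": "balance", "PAY_BILL": "payment"},
    "balance": {"MAIN_MENU": "main"},
    "payment": {"MAIN_MENU": "main"},
}

menu = VoiceMenu(MENUS, "main")
first = menu.on_label("PAY_BILL")        # main -> payment
second = menu.on_label("CHECK_BALANCE")  # not offered from payment; stays put
third = menu.on_label("MAIN_MENU")       # payment -> main
print(first, second, third)
```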
Specification