MULTIMODAL UNIFICATION OF ARTICULATION FOR DEVICE INTERFACING
First Claim
1. A system for a multimodal unification of articulation, comprising:
- a voice signal modality receiving a voice signal;
- a control signal modality receiving an input from a user while the voice signal is being inputted, the control signal modality generating a control signal from the input, the input selected from predetermined inputs to help decipher ambiguities arising from syllable boundary, word boundary, homonym, prosody, or intonation; and
- a multimodal integration system receiving and integrating the voice signal and the control signal, the multimodal integration system comprising an inference engine to delimit a context of a spoken utterance of the voice signal by discretizing the voice signal into phonetic frames, the inference engine analyzing the discretized voice signal integrated with the control signal to output a recognition result.
Abstract
A system for a multimodal unification of articulation includes a voice signal modality to receive a voice signal, and a control signal modality that receives an input from a user and generates a control signal from that input, the input being selected from predetermined inputs directly corresponding to phonetic information. The system also includes a multimodal integration system to receive and integrate the voice signal and the control signal. The multimodal integration system delimits a context of a spoken utterance of the voice signal by using the control signal to preprocess the voice signal and discretize it into phonetic frames. A voice recognizer analyzes the voice signal integrated with the control signal to output a voice recognition result. This paradigm helps overcome constraints found in interfacing with mobile devices, and the context information facilitates handling of commands in the application environment.
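As a hedged illustration (not the patented implementation), the homonym disambiguation described in the abstract can be sketched in Python: a control input received concurrently with the voice signal, here a hypothetical part-of-speech tag, selects among candidate words that share a single phonetic form.

```python
# Illustrative sketch only: the phonetic key, the tag vocabulary, and the
# lookup table are assumptions, not the claimed system.

# Candidate spellings that a recognizer cannot separate by sound alone.
HOMONYMS = {
    "rait": {"noun": "rite", "verb": "write", "adjective": "right"},
}

def recognize(phonetic_form, control_tag):
    """Resolve a homonym using the user's concurrent control input.

    Falls back to the raw phonetic form when no candidate matches.
    """
    candidates = HOMONYMS.get(phonetic_form, {})
    return candidates.get(control_tag, phonetic_form)

print(recognize("rait", "verb"))
```

The point of the sketch is only that the control signal carries information the acoustic channel lacks, so integration reduces the recognizer's candidate set.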
30 Claims
1. A system for a multimodal unification of articulation, comprising:
- a voice signal modality receiving a voice signal;
- a control signal modality receiving an input from a user while the voice signal is being inputted, the control signal modality generating a control signal from the input, the input selected from predetermined inputs to help decipher ambiguities arising from syllable boundary, word boundary, homonym, prosody, or intonation; and
- a multimodal integration system receiving and integrating the voice signal and the control signal, the multimodal integration system comprising an inference engine to delimit a context of a spoken utterance of the voice signal by discretizing the voice signal into phonetic frames, the inference engine analyzing the discretized voice signal integrated with the control signal to output a recognition result.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24)
25. A method for performing a multimodal unification of articulation, comprising:
- receiving a voice signal;
- receiving an input from a user while the voice signal is being received, the input selected from predetermined inputs directly corresponding to phonetic information;
- generating a control signal from the input from the user so that the control signal carries phonetic information of the voice signal;
- integrating the voice signal and the control signal;
- discretizing the voice signal into phonetic frames to delimit a context of a spoken utterance of the voice signal; and
- analyzing the discretized voice signal integrated with the control signal to output a recognition result.
- View Dependent Claims (26, 27, 28, 29, 30)
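The steps of the method claim can be sketched as a minimal Python pipeline. This is an assumption-laden illustration, not the claimed implementation: the frame size, the timing model, and the `ControlEvent` structure are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ControlEvent:
    time: float  # seconds into the utterance when the user pressed the input
    info: str    # phonetic hint carried by the control signal, e.g. "word-boundary"

def discretize(samples, frame_size):
    """Split the raw voice signal into fixed-size phonetic frames."""
    return [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]

def integrate(frames, events, frame_duration):
    """Attach each control event to the frame that was active when it occurred."""
    tagged = {}
    for e in events:
        idx = int(e.time // frame_duration)
        if 0 <= idx < len(frames):
            tagged.setdefault(idx, []).append(e.info)
    return tagged

def analyze(frames, tagged):
    """Toy stand-in for recognition: report frames and which ones carry hints."""
    return {"num_frames": len(frames), "hinted_frames": sorted(tagged)}

# Usage: 100 stand-in audio samples, 10 samples per frame, one control event
# arriving 0.25 s into the utterance while frames last 0.1 s each.
samples = list(range(100))
frames = discretize(samples, 10)
events = [ControlEvent(0.25, "word-boundary")]
tagged = integrate(frames, events, frame_duration=0.1)
print(analyze(frames, tagged))
```

The sketch follows the claim's ordering: discretization delimits the utterance into frames, and analysis then operates on frames enriched by the time-aligned control signal.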
Specification