Method and system for speech recognition
Abstract
A novel approach is provided for generating multilingual text-to-phoneme mappings for use in multilingual speech recognition systems. The multilingual mappings are based on the weighted outputs of a neural network text-to-phoneme model trained on data mixed from several languages. Used together with a branched grammar decoding scheme, the multilingual mappings capture both inter- and intra-language pronunciation variations, which suits multilingual speaker-independent speech recognition systems. A significant improvement in overall system performance is obtained for a multilingual speaker-independent name dialing task when multilingual rather than language-dependent text-to-phoneme mapping is applied.
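As a rough illustration of how weighted text-to-phoneme outputs could feed a branched grammar, the Python sketch below keeps, for each letter, every phoneme whose score is close enough to the best one, and then enumerates the resulting pronunciation branches. It is a minimal sketch under stated assumptions, not the method of the specification: the function names, the relative threshold, the "_" null-phoneme symbol, and the score values are all hypothetical, and the per-letter scores stand in for the outputs of a trained neural network.

```python
from itertools import product

def branch_phonemes(scores, rel_threshold=0.4):
    """For one letter, keep each phoneme scoring at least rel_threshold * best score."""
    best = max(scores.values())
    kept = [(ph, s) for ph, s in scores.items() if s >= rel_threshold * best]
    return sorted(kept, key=lambda item: -item[1])

def branched_grammar(letter_scores, rel_threshold=0.4):
    """One list of weighted phoneme alternatives (a branch point) per letter."""
    return [branch_phonemes(scores, rel_threshold) for scores in letter_scores]

def expand(grammar):
    """Enumerate every path through the grammar with the product of its weights."""
    paths = []
    for path in product(*grammar):
        phones = tuple(ph for ph, _ in path if ph != "_")  # "_" is an assumed null phoneme
        weight = 1.0
        for _, s in path:
            weight *= s
        paths.append((phones, weight))
    return sorted(paths, key=lambda item: -item[1])

# Made-up scores for the letters of "Ana"; a real system would take these
# from the text-to-phoneme model rather than from a hand-written table.
TOY_SCORES = [
    {"a": 0.60, "eI": 0.35, "_": 0.05},
    {"n": 0.98, "_": 0.02},
    {"a": 0.70, "@": 0.30},
]

grammar = branched_grammar(TOY_SCORES)
pronunciations = expand(grammar)
```

With these toy values the first and last letters each keep two alternatives, so the grammar expands to four weighted pronunciations. Lowering `rel_threshold` admits more branches, trading a larger search space for wider coverage of pronunciation variants.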
13 Claims
1. Method of speech recognition in order to identify a speech command as a match to a written text command, and comprising steps of:
providing a text input from a text database;
receiving an acoustic input;
generating sequences of multilingual phoneme symbols based on said text input by means of a multilingual text-to-phoneme module;
generating pronunciations in response to said sequences of multilingual phoneme symbols; and
comparing said pronunciations with the acoustic input in order to find a match.
Dependent claims: 2, 3
4. System for speech recognition and comprising:
a text database for providing a text input;
transducer means for receiving an acoustic input;
a multilingual text-to-phoneme module for outputting sequences of multilingual phoneme symbols based on said text input;
a pronunciation lexicon module receiving said sequences of multilingual phoneme symbols from said multilingual text-to-phoneme module, and for generating pronunciations in response thereto; and
a multilingual recognizer based on multilingual acoustic phoneme models for comparing said pronunciations generated by the pronunciation lexicon module with the acoustic input in order to find a match.
Dependent claims: 5, 6, 7, 8
9. Communication terminal having a speech recognition unit comprising:
a text database for providing a text input;
transducer means for receiving an acoustic input;
a multilingual text-to-phoneme module for outputting sequences of multilingual phoneme symbols based on said text input;
a pronunciation lexicon module receiving said sequences of multilingual phoneme symbols from said multilingual text-to-phoneme module, and for generating pronunciations in response thereto; and
a multilingual recognizer based on multilingual acoustic phoneme models for comparing said pronunciations generated by the pronunciation lexicon module with the acoustic input in order to find a match.
Dependent claims: 10, 11, 12, 13
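The data flow recited in the independent claims (text database, text-to-phoneme module, pronunciation lexicon, comparison with the acoustic input) can be pictured with the toy Python sketch below. It is illustrative only and rests on loud assumptions: the letter-to-phoneme table, the names, and every function name are hypothetical, and the recognizer is replaced by a plain edit distance over phoneme symbols, which is a stand-in for scoring against acoustic phoneme models and not the claimed recognizer. The "acoustic input" is likewise assumed to arrive already decoded into a phoneme sequence.

```python
from itertools import product

# Hypothetical table of phoneme alternatives per letter (assumption, not from the source).
LETTER_TO_PHONEMES = {"j": ["dZ", "j"], "a": ["a", "eI"], "n": ["n"]}

def text_to_phonemes(text):
    """Toy text-to-phoneme step: a list of phoneme alternatives for each letter."""
    return [LETTER_TO_PHONEMES.get(ch, [ch]) for ch in text.lower()]

def build_lexicon(text_database):
    """Toy pronunciation lexicon: every pronunciation variant for each text entry."""
    return {
        entry: [tuple(path) for path in product(*text_to_phonemes(entry))]
        for entry in text_database
    }

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def recognize(observed, lexicon):
    """Return (entry, distance) for the pronunciation closest to the observed phonemes."""
    candidates = (
        (entry, edit_distance(observed, pron))
        for entry, prons in lexicon.items()
        for pron in prons
    )
    return min(candidates, key=lambda item: item[1])

lexicon = build_lexicon(["Jan", "Ana"])
match = recognize(("j", "a", "n"), lexicon)
```

Because each text entry expands to several pronunciation variants, a speaker who says "Jan" with either an English-like or a German-like initial consonant can still land on the same entry in this toy setup.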
Specification