Arrangement of speaker-independent speech recognition
First Claim
1. A method of forming a pronunciation model for speech recognition in a telecommunications system comprising at least one portable electronic device and a server, the electronic device being configured to compare the user's speech information with pronunciation models comprising acoustic units and stored in the electronic device, the method comprising:
transferring a character sequence from the electronic device to the server, comparing the character sequence with a language resource to determine the language of the character sequence in a language selector of the electronic device, selecting a text-to-phoneme resource, stored in the server, according to the determined language;
converting the character sequence in the server into at least one phoneme sequence in text format in accordance with the determined language; and
transferring said at least one phoneme sequence in text format from the server to the electronic device;
storing the phoneme sequence in association with the transferred character sequence in the electronic device;
forming an audio model of the phoneme sequence at the electronic device;
storing the audio model in the electronic device in association with the character sequence; and
comparing received speech information with the stored phoneme sequence and if a match results, selecting the character sequence associated with the matching phoneme sequence.
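The final storing and comparing steps of the claim can be illustrated with a minimal sketch. This is not the patented implementation: the class `DeviceLexicon`, the space-separated phoneme notation, and the edit-distance threshold are all assumptions made for illustration, and the sketch assumes a hypothetical acoustic front end has already turned the user's speech into a list of phoneme symbols.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance between two phoneme lists."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


@dataclass
class DeviceLexicon:
    """Hypothetical on-device store: character sequence -> phoneme sequence."""

    entries: dict[str, list[str]] = field(default_factory=dict)

    def store(self, character_sequence: str, phoneme_text: str) -> None:
        # The server reply is assumed to be phonemes in text format,
        # separated by spaces (an assumption, not the patent's wire format).
        self.entries[character_sequence] = phoneme_text.split()

    def select(self, heard: list[str], max_distance: int = 1) -> Optional[str]:
        # Return the character sequence whose stored phoneme sequence is
        # closest to what was heard, or None if nothing is close enough.
        best, best_distance = None, max_distance + 1
        for character_sequence, phonemes in self.entries.items():
            distance = edit_distance(heard, phonemes)
            if distance < best_distance:
                best, best_distance = character_sequence, distance
        return best


lexicon = DeviceLexicon()
lexicon.store("Peter", "p i: t @")
lexicon.store("Anna", "a n a")
match = lexicon.select(["p", "i:", "t", "@"])
```

A real recognizer would score acoustic models rather than symbol strings; the edit distance here only stands in for that comparison.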
Abstract
A method needed in speech recognition for forming a pronunciation model in a telecommunications system comprising at least one portable electronic device and a server. The electronic device is arranged to compare the user's speech information with pronunciation models comprising acoustic units and stored in the electronic device. A character sequence is transferred from the electronic device to the server. In the server, the character sequence is converted into a sequence of acoustic units. A sequence of acoustic units is sent from the server to the electronic device.
13 Claims
1. A method of forming a pronunciation model for speech recognition in a telecommunications system comprising at least one portable electronic device and a server, the electronic device being configured to compare the user's speech information with pronunciation models comprising acoustic units and stored in the electronic device, the method comprising:
transferring a character sequence from the electronic device to the server, comparing the character sequence with a language resource to determine the language of the character sequence in a language selector of the electronic device, selecting a text-to-phoneme resource, stored in the server, according to the determined language;
converting the character sequence in the server into at least one phoneme sequence in text format in accordance with the determined language; and
transferring said at least one phoneme sequence in text format from the server to the electronic device;
storing the phoneme sequence in association with the transferred character sequence in the electronic device;
forming an audio model of the phoneme sequence at the electronic device;
storing the audio model in the electronic device in association with the character sequence; and
comparing received speech information with the stored phoneme sequence and if a match results, selecting the character sequence associated with the matching phoneme sequence.
Dependent claims: 2, 3, 4, 5, 6
7. A telecommunications system comprising:
at least one electronic device and a server, wherein the electronic device is configured to send a character sequence intended for speech recognition to the server;
a language selector of the electronic device, adapted to compare the character sequence with a language resource to determine the language of the character sequence;
wherein the server is configured to convert the character sequence into at least one phoneme sequence in accordance with the determined language and the server is configured to send said at least one phoneme sequence to the electronic device;
wherein the electronic device is configured to form an audio model of the phoneme sequence and to store said audio model in association with the character sequence;
a memory in the electronic device for storing the phoneme sequence in association with the transferred character sequence in the electronic device; and
further wherein the electronic device is configured to compare the user's speech information with the stored phoneme sequences stored in the electronic device and if a match results, selecting the character sequence associated with the matching phoneme sequence.
8. An electronic device comprising:
a language selector adapted to compare a character sequence with a language resource to determine a language of the character sequence intended for speech recognition;
a transmitter adapted to send the character sequence intended for speech recognition and information on the language of the character sequence to a server;
a receiver adapted to receive a phoneme sequence formed of the character sequence in accordance with the determined language from the server;
a processor in the electronic device configured to form an audio model of the phoneme sequence and to store said audio model in association with said character sequence;
a memory adapted to store the phoneme sequence in association with the character sequence; and
wherein the processor is adapted to compare user's speech information with stored phoneme sequence and if a match results, select the character sequence associated with the matching phoneme sequence.
Dependent claims: 9, 10
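The language selector recited above compares a character sequence with a language resource. The claims do not say what that resource contains, so the following is only a sketch under a stated assumption: the resource is modelled as a small set of characteristic letter pairs per language, and the language with the most matching pairs is chosen. The table `LANGUAGE_RESOURCE`, the function `determine_language`, and the default language are all hypothetical.

```python
# Hypothetical language resource: characteristic letter pairs per language.
# The contents are toy data chosen for illustration only.
LANGUAGE_RESOURCE = {
    "en": {"th", "sh", "ee", "ck", "wh"},
    "fi": {"kk", "aa", "ii", "nn", "yy"},
    "de": {"ch", "tz", "pf", "ue", "ie"},
}


def determine_language(character_sequence, resource=LANGUAGE_RESOURCE, default="en"):
    """Pick the language whose letter pairs overlap most with the input."""
    text = character_sequence.lower()
    pairs = {text[i:i + 2] for i in range(len(text) - 1)}
    scores = {lang: len(pairs & known) for lang, known in resource.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default when the resource gives no evidence at all.
    return best if scores[best] > 0 else default


# The device would send both the character sequence and this language tag.
request = {"characters": "Heikkinen", "language": determine_language("Heikkinen")}
```

A practical selector would likely use a much larger resource, such as name lists or character statistics; the scoring rule here is only one plausible reading of "compare with a language resource".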
11. A server comprising:
a communication connection for receiving a character sequence from at least one electronic device;
a language selector adapted to compare the character sequence with a language resource to determine a language for the character sequence intended for speech recognition;
a processor for converting the character sequence into at least one phoneme sequence in accordance with the determined language; and
wherein said processor is adapted to send the at least one phoneme sequence to the electronic device over the communication connection; and
wherein the phoneme sequence is received and further processed and stored in association with the character sequence in the electronic device; and
wherein a processor in the electronic device is adapted to compare a user's speech information with the stored phoneme sequence and if a match results, select the character sequence associated with the matching phoneme sequence; and
further wherein the processor in the electronic device is configured to form an audio model of the phoneme sequence and to store said speech model in association with the character sequence.
Dependent claims: 12
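The server-side conversion, in which a character sequence becomes a phoneme sequence according to the determined language, might look like the sketch below. It assumes each language's text-to-phoneme resource is a simple grapheme-to-phoneme table applied by greedy longest match. The table `TTP_RESOURCES`, the phoneme symbols, and the function `to_phoneme_text` are illustrative assumptions, not taken from the patent.

```python
# Hypothetical per-language text-to-phoneme resources (toy rule tables).
TTP_RESOURCES = {
    "fi": {"a": "a", "e": "e", "i": "i", "h": "h", "n": "n", "k": "k", "kk": "k:"},
    "en": {"sh": "S", "s": "s", "h": "h", "i": "I", "p": "p"},
}


def to_phoneme_text(character_sequence, language):
    """Convert characters to a space-separated phoneme string."""
    rules = TTP_RESOURCES[language]  # raises KeyError for an unknown language
    text = character_sequence.lower()
    longest = max(len(grapheme) for grapheme in rules)
    phonemes = []
    i = 0
    while i < len(text):
        # Try the longest grapheme first so "kk" wins over "k".
        for size in range(min(longest, len(text) - i), 0, -1):
            grapheme = text[i:i + size]
            if grapheme in rules:
                phonemes.append(rules[grapheme])
                i += size
                break
        else:
            i += 1  # skip characters the toy resource does not cover
    return " ".join(phonemes)


reply = to_phoneme_text("Heikki", "fi")
```

Returning plain text keeps the reply small and leaves the forming of an audio model to the device, which matches the division of work described in the claims.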
13. A computer program product comprising a computer readable program code stored in a memory for causing a computer to perform speech recognition, said computer program product further comprising:
computer readable program code means for causing a computer server to receive a character sequence for speech recognition from an electronic device;
computer readable program code means for causing the computer server to compare the character sequence with a language resource to determine a language of the character sequence;
computer readable program code means for causing the computer server to convert the character sequence into at least one phoneme sequence in text format in accordance with the determined language; and
computer readable program code means for causing a processor in said electronic device to receive the at least one phoneme sequence in text format from the server;
computer readable program code means for causing the processor to store the phoneme sequence in association with the character sequence; and compare received speech information with the stored phoneme sequence and, if a match results, select the character sequence associated with the matching phoneme sequence; and
computer readable program code means for causing the processor to form an audio model of the phoneme sequence and to store said speech model in association with the character sequence.
Specification