Mobile terminal controllable by spoken utterances

US 20020091511A1
Filed: 12/13/2001
Published: 07/11/2002
Est. Priority Date: 12/14/2000
Status: Abandoned Application

Abstract

A mobile terminal (100) which is controllable by spoken utterances like proper names or command words is described. The mobile terminal (100) comprises an interface (200) for receiving from a network server (300) acoustic models for automatic speech recognition and an automatic speech recognizer (110) for recognizing the spoken utterances based on the received acoustic models. The invention further relates to a network server (300) for mobile terminals (100) which are controllable by spoken utterances and to a method for obtaining acoustic models for a mobile terminal (100) controllable by spoken utterances.

27 Claims

1. A network server for mobile terminals which are controllable by spoken utterances, comprising:
- a unit for providing acoustic models for automatic recognition of the spoken utterances, the unit for providing acoustic models translating a textual transcription of a spoken utterance into a sequence of phonetic transcription units and the sequence of phonetic transcription units into a sequence of phonetic recognition units, the sequence of phonetic recognition units forming an acoustic model of the spoken utterance; and
  
  an interface for transmitting the acoustic models to the mobile terminals.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The network server according to claim 1, wherein the interface allows to receive the textual transcriptions of the spoken utterances from the mobile terminals.
  - 3. The network server according to claim 1, further comprising a pronunciation database containing the phonetic transcription units.
  - 4. The network server according to claim 3, wherein the pronunciation database is shared by both the unit for generating acoustic models and a speech synthesizer.
  - 5. The network server according to claim 1, further comprising a recognition database containing the phonetic recognition units.
  - 6. The network server according to claim 1, further comprising a speech synthesizer.
  - 7. The network server according to claim 6, further comprising a synthesis database containing phonetic synthesizing units.
  - 8. The network server according to claim 1, wherein the interface allows to receive acoustic models of the spoken utterances from a mobile terminal and wherein a database stores the received acoustic models at least temporarily.
  - 9. The network server according to claim 1, wherein the interface allows to receive and transmit voice prompts corresponding to the spoken utterances from the mobile terminals and further comprising a voice prompt database for storing the voice prompts.
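
The two-stage translation recited in claim 1 (textual transcription to phonetic transcription units, then to phonetic recognition units) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the claims do not say what the units look like, so the dictionary-backed databases, the unit names such as `k_1`, and the idea that one transcription unit expands to several recognition units are all hypothetical.

```python
from typing import Dict, List

# Hypothetical pronunciation database (cf. claim 3):
# word -> sequence of phonetic transcription units.
PRONUNCIATION_DB: Dict[str, List[str]] = {
    "call": ["k", "O:", "l"],
    "anna": ["a", "n", "a"],
}

# Hypothetical recognition database (cf. claim 5):
# phonetic transcription unit -> phonetic recognition units.
RECOGNITION_DB: Dict[str, List[str]] = {
    "k": ["k_1", "k_2"],
    "O:": ["O_1", "O_2", "O_3"],
    "l": ["l_1"],
    "a": ["a_1", "a_2"],
    "n": ["n_1"],
}


def to_transcription_units(text: str) -> List[str]:
    """Stage 1: textual transcription -> phonetic transcription units."""
    units: List[str] = []
    for word in text.lower().split():
        units.extend(PRONUNCIATION_DB[word])  # KeyError for unknown words
    return units


def to_recognition_units(transcription_units: List[str]) -> List[str]:
    """Stage 2: phonetic transcription units -> phonetic recognition units."""
    units: List[str] = []
    for unit in transcription_units:
        units.extend(RECOGNITION_DB[unit])
    return units


def build_acoustic_model(text: str) -> List[str]:
    """The resulting sequence of recognition units is the acoustic model."""
    return to_recognition_units(to_transcription_units(text))
```

In this sketch, `build_acoustic_model("Call Anna")` returns one flat list of recognition units, which is what the interface of claim 1 would then transmit to a terminal.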

10. A network server for mobile terminals which are controllable by spoken utterances, comprising:
- a unit for providing acoustic models for automatic recognition of spoken utterances;
  
  a speech synthesizer for generating voice prompts of textual transcriptions, the voice prompts being usable as acoustic feedback; and
  
  an interface for transmitting the acoustic models and the voice prompts to the mobile terminals.
- Dependent Claims (11)
  - 11. The network server according to claim 10, further comprising a pronunciation database containing phonetic transcription units, the pronunciation database being shared by the unit for generating acoustic models and the speech synthesizer.

12. A network server for mobile terminals which are controllable by spoken utterances, comprising:
- a unit for providing acoustic models for automatic recognition of the spoken utterances;

  a voice prompt database for storing voice prompts corresponding to the spoken utterances, the voice prompts being utilized as acoustic feedback;

  an interface in communication with the unit for providing acoustic models and the voice prompt database, the interface enabling transmission of the acoustic models and the voice prompts to the mobile terminals.

13. A mobile terminal controllable by spoken utterances, comprising:
- an interface for receiving from a network server acoustic models which were created on the basis of textual transcriptions of the spoken utterances, the received acoustic models being comprised of a sequence of phonetic recognition units, each phonetic recognition unit being derived from a corresponding phonetic transcription unit; and

  an automatic speech recognizer for recognizing the spoken utterances based on the phonetic recognition units of the received acoustic models.
- Dependent Claims (14, 15, 16, 17, 18, 19)
  - 14. The mobile terminal according to claim 13, further comprising at least one of a database for the acoustic models and a database for the textual transcriptions of the spoken utterances.
  - 15. The mobile terminal according to claim 13, wherein the interface allows to transmit the textual transcriptions to the network server.
  - 16. The mobile terminal according to claim 13, further comprising components for outputting at least one of an acoustic and visual feedback for a spoken utterance recognized by the automatic speech recognizer.
  - 17. The mobile terminal according to claim 13, further comprising a database for voice prompts.
  - 18. The mobile terminal according to claim 13, wherein the interface allows to transmit acoustic models of the spoken utterances to the network server.
  - 19. The mobile terminal according to claim 13, wherein the interface allows to transmit voice prompts corresponding to the spoken utterances to the network server.

20. A method for obtaining acoustic models for automatic speech recognition in a mobile terminal controllable by spoken utterances, comprising:
- providing acoustic models by a network server, one or more of the provided acoustic models being obtained by translating a textual transcription of a spoken utterance into a sequence of phonetic transcription units and the sequence of phonetic transcription units into a sequence of phonetic recognition units, the sequence of phonetic recognition units forming the acoustic model of the spoken utterance;
  
  transmitting the acoustic models from the network server to the mobile terminal; and
  
  automatically recognizing the spoken utterances within the mobile terminal based on the phonetic recognition units of the acoustic models transmitted by the network server.
- Dependent Claims (21, 22, 23, 24, 25)
  - 21. The method according to claim 20, further comprising transmitting textual transcriptions of the spoken utterances from the mobile terminal to the network server and generating the acoustic models based on the transmitted textual transcriptions in the network server.
  - 22. The method according to claim 20, further comprising generating voice prompts.
  - 23. The method according to claim 22, wherein the voice prompts are generated by the network server based on the same phonetic transcriptions used for creating the speaker independent acoustic models.
  - 24. The method according to claim 22, wherein the voice prompts are generated by the mobile terminal based on recognized spoken utterances.
  - 25. The method according to claim 20, further comprising transmitting acoustic models from the mobile terminal to the network server and storing the transmitted acoustic models at least temporarily in the network server.

26. A computer program product comprising program code portions for performing when the computer program product is run on a network server the steps of providing acoustic models, one or more of the provided acoustic models being obtained by translating a textual transcription of a spoken utterance into a sequence of phonetic transcription units and the sequence of phonetic transcription units into a sequence of phonetic recognition units, the sequence of phonetic recognition units forming the acoustic model of the spoken utterance;
- transmitting the acoustic models from the network server to a mobile terminal to enable automatic recognition of the spoken utterances within the mobile terminal based on the phonetic recognition units of the acoustic models transmitted by the network server.
- Dependent Claims (27)
  - 27. The computer program product of claim 26, stored on a computer readable recording medium.
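
The three steps of the method of claim 20 (providing models on the server, transmitting them, recognizing on the terminal) might be sketched as follows. This is an illustrative sketch only, not the applicants' implementation. The class names, the in-memory hand-off standing in for the network interface, the prebuilt `unit_table` standing in for the server's translation step, and the use of `difflib.SequenceMatcher` as a matching score are all assumptions. The claims do not specify how the recognizer compares speech against a model, and the sketch assumes a front end that has already turned audio into a sequence of recognition units.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional

AcousticModel = List[str]  # a sequence of phonetic recognition units


class NetworkServer:
    """Hypothetical server holding one acoustic model per transcription."""

    def __init__(self, unit_table: Dict[str, AcousticModel]) -> None:
        # Assumption: models were already derived from the transcriptions.
        self._unit_table = unit_table

    def provide_models(self, transcriptions: List[str]) -> Dict[str, AcousticModel]:
        """Steps 1 and 2: provide models for the requested transcriptions."""
        return {t: list(self._unit_table[t]) for t in transcriptions}


class MobileTerminal:
    """Hypothetical terminal that stores received models and matches input."""

    def __init__(self) -> None:
        self.models: Dict[str, AcousticModel] = {}

    def receive_models(self, models: Dict[str, AcousticModel]) -> None:
        self.models.update(models)

    def recognize(self, observed: List[str]) -> Optional[str]:
        """Step 3: return the transcription whose model best matches the input.

        Returns None when no stored model shares any unit with the input.
        """
        best: Optional[str] = None
        best_score = 0.0
        for text, model in self.models.items():
            score = SequenceMatcher(None, observed, model).ratio()
            if score > best_score:
                best, best_score = text, score
        return best
```

A terminal would call `receive_models(server.provide_models([...]))` once, for example after sending its phone book entries as in claim 21, and then run `recognize` locally without further contact with the server.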

Current Assignee: Telefonaktiebolaget LM Ericsson
Original Assignee: Telefonaktiebolaget LM Ericsson
Inventors: Dobler, Stefan; Hellwig, Karl; Oijer, Fredrik
Application Number: US10/013,493
Publication Number: US 20020091511A1
US Class Current: 704/201
CPC Class Codes:
G10L 15/30   Distributed recognition, e....
G10L 2015/223   Execution procedure of a sp...
H04M 1/271   controlled by voice recogni...
