Terminal device, server device and speech recognition method
Abstract
A terminal device receives as input a user's voice with noise added to it (noise-added voice) and transmits it to a server device. A plurality of acoustic models is stored in advance in a data storage section of the server device. From the acoustic models stored in the data storage section, an adapted-model selecting section of the server device selects the acoustic model that is best adapted to the noise-added voice received by a receiving section. A transmitting section transmits the selected adapted model to the terminal device. A receiving section of the terminal device receives the adapted model from the server device, and the received adapted model is stored in a memory. A speech recognition section conducts speech recognition using the adapted model stored in the memory.
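The round trip described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class names, the scalar `mean` standing in for a full acoustic model, and the "closest mean" selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AcousticModel:
    # Hypothetical stand-in: a real acoustic model would hold far richer parameters.
    name: str
    mean: float  # single illustrative feature value


@dataclass
class Server:
    models: List[AcousticModel]  # acoustic models stored in advance

    def select_adapted_model(self, noisy_voice: List[float]) -> AcousticModel:
        # Assumed criterion: pick the model whose mean is closest to the input average.
        avg = sum(noisy_voice) / len(noisy_voice)
        return min(self.models, key=lambda m: abs(m.mean - avg))


@dataclass
class Terminal:
    server: Server
    memory: Optional[AcousticModel] = None  # holds the received adapted model

    def adapt(self, noisy_voice: List[float]) -> None:
        # Transmit the noise-added voice, then store the model the server returns.
        self.memory = self.server.select_adapted_model(noisy_voice)

    def recognize(self, utterance: List[float]) -> str:
        # Recognition runs on the terminal, using only the stored model.
        if self.memory is None:
            raise RuntimeError("no adapted model stored yet")
        return f"decoded with {self.memory.name}"
```

The point of the split is that the server holds the full model collection while the terminal keeps only the single model it needs, so recognition itself does not require another round trip.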
23 Claims
1. A terminal device, comprising:
a transmitting means for transmitting a voice produced by a user and environmental noises to a server device;
a receiving means for receiving from the server device an acoustic model adapted to the voice of the user and the environmental noises;
a first storage means for storing the acoustic model received by the receiving means; and
a speech recognition means for conducting speech recognition using the acoustic model stored in the first storage means.
Dependent claims: 2, 3, 4, 5, 6.
7. A terminal device, comprising:
a transmitting means for transmitting a voice produced by a user and environmental noises to a server device;
a receiving means for receiving from the server device acoustic-model producing data for producing an acoustic model adapted to the voice of the user and the environmental noises;
a first storage means for storing the acoustic-model producing data received by the receiving means;
a producing means for producing the acoustic model adapted to the voice of the user and the environmental noises by using the acoustic-model producing data stored in the first storage means; and
a speech recognition means for conducting speech recognition using the acoustic model produced by the producing means.
Dependent claims: 8, 9.
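Claim 7 differs from claim 1 in that the terminal receives data for producing a model and builds the model itself. The claim does not say what form that data takes. The sketch below assumes, purely for illustration, that it consists of accumulated statistics for one feature dimension; `ProducingData`, `produce_model`, and `Terminal` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ProducingData:
    # Assumed form of the acoustic-model producing data: accumulated statistics.
    count: int
    total: float      # sum of feature values
    total_sq: float   # sum of squared feature values


def produce_model(data: ProducingData) -> Tuple[float, float]:
    """Produce an illustrative (mean, variance) model from the received statistics."""
    if data.count == 0:
        raise ValueError("producing data is empty")
    mean = data.total / data.count
    variance = data.total_sq / data.count - mean ** 2
    return mean, variance


class Terminal:
    def __init__(self) -> None:
        self.storage: Optional[ProducingData] = None  # stores the received producing data
        self.model: Optional[Tuple[float, float]] = None

    def receive(self, data: ProducingData) -> None:
        self.storage = data

    def produce(self) -> Tuple[float, float]:
        # The model is built on the terminal from the stored producing data.
        if self.storage is None:
            raise RuntimeError("no producing data received yet")
        self.model = produce_model(self.storage)
        return self.model
```

Under this assumption the server sends a compact summary rather than a finished model, and the final production step is carried out on the terminal.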
10. A server device, comprising:
a storage means for storing a plurality of acoustic models each adapted to a corresponding speaker and a corresponding environment;
a receiving means for receiving from a terminal device a voice produced by a user and environmental noises;
a selecting means for selecting from the storage means an acoustic model which is adapted to the voice of the user and the environmental noises received by the receiving means; and
a transmitting means for transmitting the acoustic model selected by the selecting means to the terminal device.
Dependent claims: 11, 12, 13.
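One plausible way to realize the selecting means of claim 10 is to score each stored model against the received audio and keep the highest-scoring one. The claim does not state a scoring rule, so the sketch below assumes each stored model is a diagonal Gaussian over feature frames and uses log-likelihood as the score. The function names and the `(speaker, environment)` keys are hypothetical.

```python
import math


def log_likelihood(frames, mean, var):
    """Total log-likelihood of feature frames under a diagonal Gaussian."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def select_model(stored_models, frames):
    """Return the key of the stored model that scores highest on the received frames."""
    return max(
        stored_models,
        key=lambda key: log_likelihood(frames, *stored_models[key]),
    )


# Each entry: (speaker, environment) -> (mean vector, variance vector); values are made up.
stored_models = {
    ("speaker_a", "office"): ([0.0, 0.0], [1.0, 1.0]),
    ("speaker_a", "car"): ([3.0, 3.0], [1.0, 1.0]),
    ("speaker_b", "car"): ([3.0, -3.0], [1.0, 1.0]),
}
received_frames = [[2.8, 3.1], [3.2, 2.7]]  # features of the received voice plus noise
best_key = select_model(stored_models, received_frames)
```

Keying the store by speaker and environment mirrors the claim language that each model is adapted to a corresponding speaker and a corresponding environment.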
14. A server device, comprising:
a storage means for storing a plurality of acoustic models each adapted to a corresponding speaker and a corresponding environment;
a receiving means for receiving from a terminal device a voice produced by a user and environmental noises;
a producing means for producing an acoustic model adapted to the voice of the user and the environmental noises, based on the voice of the user and the environmental noises received by the receiving means and the plurality of acoustic models stored in the storage means; and
a transmitting means for transmitting the acoustic model produced by the producing means to the terminal device.
Dependent claims: 15, 16, 17.
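In claim 14 the server produces a new model from the received audio and the stored models rather than picking one of them. The claim leaves the production method open. As one hedged illustration, the sketch below interpolates the stored model means, weighting each by how close it lies to the observed features. The softmax weighting and the function name `produce_model` are assumptions for this example only.

```python
import math


def produce_model(stored_means, observed_mean):
    """Interpolate stored model means, weighting each by closeness to the observation."""
    # Score each stored model by negative squared distance to the observed features.
    scores = [
        -sum((o - m) ** 2 for o, m in zip(observed_mean, mean))
        for mean in stored_means
    ]
    # Softmax (shifted by the maximum for numerical stability) turns scores into weights.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # The produced model mean is the weighted average of the stored means.
    dims = len(observed_mean)
    produced = [
        sum(w * mean[d] for w, mean in zip(weights, stored_means))
        for d in range(dims)
    ]
    return produced, weights
```

Compared with selecting a single stored model, a combination like this can land between stored models when the user and environment match none of them closely.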
18. A server device, comprising:
a storage means for storing a plurality of acoustic models each adapted to a corresponding speaker and a corresponding environment;
a receiving means for receiving from a terminal device a voice produced by a user and environmental noises;
a selecting means for selecting from the storage means acoustic-model producing data for producing an acoustic model which is adapted to the voice of the user and the environmental noises received by the receiving means; and
a transmitting means for transmitting the acoustic-model producing data selected by the selecting means to the terminal device.
Dependent claims: 19, 20, 21.
22. A speech recognition method, comprising the steps of:
preparing a plurality of acoustic models each adapted to a corresponding speaker, a corresponding environment, and a corresponding tone of voice;
obtaining an acoustic model adapted to a voice produced by a user and environmental noises, based on the voice of the user, the environmental noises and the plurality of acoustic models; and
conducting speech recognition using the obtained acoustic model.
Dependent claims: 23.
Specification