Server-client type speech recognition apparatus and method
Abstract
A speech recognition apparatus is provided that reduces transmission time and cost. A terminal-side apparatus (100) includes a speech detection portion (101) for detecting a speech interval of inputted data, a waveform compression portion (102) for compressing waveform data at the detected speech interval, and a waveform transmission portion (103) for transmitting the compressed waveform data. A server-side apparatus (200) includes a waveform reception portion (201) for receiving the waveform data transmitted from the terminal-side apparatus, a waveform decompression portion (202) for decompressing the received waveform data, an analyzing portion (203) for analyzing the decompressed waveform data, and a recognizing portion (204) for performing recognition processing to produce a recognition result.
54 Claims
-
1. A speech recognition apparatus comprising a terminal-side apparatus (100) and a server-side apparatus (200), wherein said terminal-side apparatus (100) comprises:
-
a speech detection portion (101) for detecting a speech interval of received speech data to produce waveform data of the detected speech interval;
a compression portion (102) for compressing the waveform data at the detected speech interval to produce compressed waveform data; and
a waveform transmission portion (103) for transmitting the compressed waveform data to said server-side apparatus, and wherein said server-side apparatus (200) comprises;
a waveform reception portion (201) for receiving the compressed waveform data transmitted from said terminal-side apparatus to produce received waveform data;
a waveform decompression portion (202) for decompressing the received waveform data to produce decompressed waveform data; and
recognizing means (203, 204, 205) for performing recognition processing by using the decompressed waveform data to produce a recognition result.
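The data flow recited in claim 1 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the claim does not name a detection rule or a codec, so a simple amplitude threshold stands in for the speech detection portion (101), `zlib` stands in for the compression and decompression portions (102, 202), and the `recognize` callback is a placeholder for the recognizing means (203, 204, 205). Transmission is modeled as a plain function call.

```python
import zlib
from array import array

def detect_speech_interval(samples, threshold=500):
    """Return (start, end) of the span whose amplitude exceeds the threshold, or None.

    The threshold rule is an illustrative assumption, not the claimed method.
    """
    active = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not active:
        return None
    return active[0], active[-1] + 1

def terminal_side(samples):
    """Detect the speech interval (101), compress it (102), and return the payload to send (103)."""
    interval = detect_speech_interval(samples)
    if interval is None:
        return None  # nothing to transmit
    start, end = interval
    waveform = array("h", samples[start:end]).tobytes()  # 16-bit signed samples
    return zlib.compress(waveform)

def server_side(payload, recognize):
    """Receive (201), decompress (202), and run the placeholder recognizer (203-205)."""
    waveform = array("h")
    waveform.frombytes(zlib.decompress(payload))
    return recognize(list(waveform))
```

Only the detected interval is compressed and sent, which is where the reduction in transmitted data comes from in this sketch; silence before and after the interval never leaves the terminal.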
-
-
2. A speech recognition apparatus comprising a terminal-side apparatus (100A) and a server-side apparatus (200A), wherein said terminal-side apparatus (100A) comprises:
-
a waveform and signal reception portion (104) for receiving waveform data of a received speech to produce received waveform data and for receiving a waveform data re-transmission request signal transmitted from said server-side apparatus to produce a received waveform data re-transmission request signal;
a speech detection portion (101) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval;
a waveform compression portion (102) for compressing the waveform data at the detected speech interval to produce compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data as the stored waveform data to simultaneously produce the stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal; and
a waveform transmission portion (103) for transmitting the stored waveform data to said server-side apparatus, and wherein said server-side apparatus (200A) comprises;
a waveform reception portion (201A) for receiving the compressed waveform data transmitted from said terminal-side apparatus to produce received waveform data and for producing the waveform data re-transmission request signal when the reception of the compressed waveform data fails;
a waveform decompression portion (202) for decompressing the received waveform data to produce decompressed waveform data;
recognizing means (203, 204, 205) for performing recognition processing by using the decompressed waveform data to produce a recognition result; and
a waveform data re-transmission request signal transmission portion (206) for transmitting, to said server-side apparatus 200, the waveform data re-transmission request signal received from said waveform reception portion.
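The re-transmission behavior of claim 2 can be sketched as follows. This is an illustration under stated assumptions: the claim does not say how a failed reception is detected, so a decompression error is used as the failure signal here, `zlib` is a stand-in codec, and the `channel` callable, the class names, and `max_attempts` are hypothetical.

```python
import zlib

class Terminal:
    """Terminal side: compresses, stores (105), and re-sends on request."""

    def __init__(self):
        self.stored = None  # waveform storing portion (105)

    def send(self, waveform: bytes) -> bytes:
        self.stored = zlib.compress(waveform)  # compress, then keep a copy
        return self.stored

    def on_retransmission_request(self) -> bytes:
        return self.stored  # produce the stored data again

class Server:
    """Server side: returns the waveform, or None to request re-transmission."""

    def receive(self, payload: bytes):
        try:
            return zlib.decompress(payload)
        except zlib.error:
            return None  # reception failed (assumed detection rule)

def exchange(terminal, server, waveform, channel, max_attempts=3):
    """Send the waveform through `channel`, re-sending from storage on failure."""
    payload = terminal.send(waveform)
    for _ in range(max_attempts):
        received = server.receive(channel(payload))
        if received is not None:
            return received
        payload = terminal.on_retransmission_request()
    raise RuntimeError("reception failed after retries")
```

Because the terminal keeps the compressed data rather than the raw input, a re-transmission in this sketch needs no second detection or compression pass.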
-
-
3. A speech recognition apparatus comprising a terminal-side apparatus (100B, 100C) and a server-side apparatus (200B, 200C), wherein said terminal-side apparatus (100B, 100C) comprises:
-
a waveform and signal reception portion (104) for receiving waveform data of a received speech to produce received waveform data and for receiving a waveform data re-transmission request signal transmitted from said server-side apparatus to produce a received waveform data re-transmission request signal;
a speech detection portion (101A) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval and for producing a start-point cancel signal when the speech is detected and thereafter the detection is canceled;
a waveform compression portion (102, 102A) for compressing the waveform data at the detected speech interval to produce compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data as the stored waveform data to simultaneously produce the stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal;
a waveform transmission portion (103) for transmitting the stored waveform data to said server-side apparatus;
a start-point cancel signal transmission portion (106) for transmitting the start-point cancel signal outputted from said speech detection portion to said server-side apparatus, and wherein said server-side apparatus (200B, 200C) comprises;
a waveform and signal reception portion (201B) for receiving compressed waveform data and the start-point cancel signal from said terminal-side apparatus to produce received waveform data and received start-point cancel signal and for producing a waveform data re-transmission request signal when the reception of the compressed waveform data fails;
a waveform decompression portion (202, 202A) for decompressing the received waveform data to produce decompressed waveform data;
recognizing means (203, 204A, 204B, 205) for performing recognition processing by using the decompressed waveform data to produce a recognition result and for stopping the recognition processing in response to the received start-point cancel signal; and
a waveform data re-transmission request signal transmission portion (206) for transmitting, to said server-side apparatus, the waveform data re-transmission request signal from said waveform and signal reception portion. - View Dependent Claims (4)
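The start-point cancel signal of claim 3 can be sketched as an event stream from the terminal and a session on the server that stops when the cancel arrives. The claim does not state why a detection is canceled; the rule used here (the active run ends before `min_frames` frames) is an assumption, as are all names, the per-frame energy input, and the frame count used as a placeholder recognition result.

```python
from dataclasses import dataclass, field

def terminal_events(energies, threshold=0.5, min_frames=3):
    """Yield ('waveform', e) for active frames and ('cancel', None) when a
    detected start is withdrawn because the active run was too short (assumed rule)."""
    run = 0
    for e in energies:
        if e > threshold:
            run += 1
            yield ("waveform", e)
        else:
            if 0 < run < min_frames:
                yield ("cancel", None)  # start-point cancel signal (106)
            run = 0

@dataclass
class RecognitionSession:
    """Server-side session that stops recognition on a start-point cancel."""
    frames: list = field(default_factory=list)
    cancelled: bool = False

    def on_waveform(self, frame):
        if not self.cancelled:
            self.frames.append(frame)

    def on_start_point_cancel(self):
        self.cancelled = True
        self.frames.clear()  # discard partial work

    def result(self):
        # Placeholder result: number of frames processed, or None if stopped.
        return None if self.cancelled else len(self.frames)

def run(energies):
    """Drive one utterance from terminal events into a server session."""
    session = RecognitionSession()
    for kind, value in terminal_events(energies):
        if kind == "cancel":
            session.on_start_point_cancel()
        else:
            session.on_waveform(value)
    return session.result()
```

In this sketch the terminal can begin streaming as soon as a start point is found, and the cancel signal lets the server abandon work on what turned out to be a false start.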
-
-
5. A speech recognition apparatus comprising a terminal-side apparatus (100D) and a server-side apparatus (200D), wherein said terminal-side apparatus (100D) comprises:
-
a waveform, signal and compressing method reception portion (104A) for receiving at least inputted waveform data, a waveform data re-transmission request signal transmitted from said server-side apparatus, and compressing method information available to said server-side apparatus transmitted from said server-side apparatus to produce received waveform data, a received waveform data re-transmission request signal, and received compressing method information;
a speech detection portion (101A) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval;
a compressing method selection portion (110) for selecting an optimum compressing method from the received compressing method information to produce a selected compressing method;
a compressing method index forming portion (109) for forming an index of the selected compressing method to produce a formed compressing method index;
a waveform compression portion (102B) for compressing the waveform data at the detected speech interval to produce compressed waveform data with the formed compressing method index contained in a part of the compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data as the stored waveform data to produce simultaneously the stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal;
a waveform transmission portion (103) for transmitting the stored waveform data to said server-side apparatus; and
a compressing method request signal transmission portion (112) for transmitting a compressing method request signal to said server-side apparatus, wherein said server-side apparatus (200D) comprises;
a waveform and signal reception portion (201C) for receiving compressed waveform data and the compressing method request signal transmitted from said terminal-side apparatus to produce received waveform data and a received compressing method request signal and for producing a waveform data re-transmission request signal when the reception of the compressed waveform data fails;
a waveform decompression portion (202B) for decompressing the received waveform data to produce decompressed waveform data;
recognizing means (203A, 204C, 205A) for performing recognition processing by using the decompressed waveform data to produce a recognition result;
a waveform data re-transmission request signal transmission portion (206) for transmitting, to said server-side apparatus, the waveform data re-transmission request signal outputted from said waveform and signal reception portion;
a compressing method storing portion (212) for storing compressing method information available to said server-side apparatus;
a compressing method obtaining portion (211) for obtaining, in response to the received compressing method request signal, the compressing method information stored in said compressing method storing portion to transmit the compressing method information to said terminal-side apparatus;
a compressing method index obtaining portion (208) for obtaining an index of the compressing method from the decompressed waveform data to produce an obtained compressing method index;
a recognition engine selection portion (209) for selecting a recognition engine from the obtained compressing method index to produce a selected engine; and
a recognition engine setting portion (210) for setting the selected engine to said recognizing means from stored engines. - View Dependent Claims (6)
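The compressing-method negotiation and engine selection of claim 5 can be sketched as below. The concrete choices are assumptions for illustration only: `zlib` and `bz2` stand in for the unspecified compressing methods, the index is carried as a single leading byte of the payload, the "optimum" method is taken from a fixed preference order, and the engine names are hypothetical labels.

```python
import bz2
import zlib

# Compressing method storing portion (212): index -> method name (hypothetical contents).
SERVER_METHODS = {0: "zlib", 1: "bz2"}
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
}
# Stored engines, keyed by compressing method index (hypothetical labels).
ENGINES = {0: "engine-for-zlib", 1: "engine-for-bz2"}

def server_list_methods():
    """Answer a compressing method request signal with the available methods (211)."""
    return dict(SERVER_METHODS)

def terminal_select(methods, preference=("bz2", "zlib")):
    """Pick the first preferred method the server offers (110) and return its index (109)."""
    for name in preference:
        for index, available in methods.items():
            if available == name:
                return index, name
    raise ValueError("no common compressing method")

def terminal_compress(waveform, index, name):
    """Compress the waveform and carry the method index in the first byte (102B)."""
    compress, _ = CODECS[name]
    return bytes([index]) + compress(waveform)

def server_handle(payload):
    """Read the index (208), decompress (202B), and choose the engine for that index (209, 210)."""
    index = payload[0]
    _, decompress = CODECS[SERVER_METHODS[index]]
    return ENGINES[index], decompress(payload[1:])
```

Carrying the index with the data lets the server pick a recognition engine matched to the compression actually used, without a separate signaling round trip per utterance.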
-
-
7. A speech recognition apparatus comprising a terminal-side apparatus (100E) and a server-side apparatus (200E), wherein said terminal-side apparatus (100E) comprises:
-
a waveform, signal and compressing method reception portion (104A) for receiving at least inputted waveform data, a waveform data re-transmission request signal transmitted from said server-side apparatus, and compressing method information available to said server-side apparatus transmitted from said server-side apparatus to produce received waveform data, a received waveform data re-transmission request signal, and received compressing method information;
a speech detection portion (101A) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval and for producing a start-point cancel signal when the speech is detected and thereafter the detection is canceled;
a compressing method selection portion (110) for selecting an optimum compressing method from the received compressing method information to produce a selected compressing method;
a compressing method index forming portion (109) for forming an index of the selected compressing method to produce a formed compressing method index;
a waveform compression portion (102B) for compressing the waveform data at the detected speech interval to produce compressed waveform data with the formed compressing method index contained in a part of the compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data as the stored waveform data to produce simultaneously the stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal;
a waveform transmission portion (103) for transmitting the stored waveform data to said server-side apparatus;
a start-point cancel signal transmission portion (106) for transmitting, to said server-side apparatus, the start-point cancel signal outputted from said speech detection portion; and
a compressing method request signal transmission portion (112) for transmitting a compressing method request signal to said server-side apparatus, wherein said server-side apparatus (200E) comprises;
a waveform, signal, and task information reception portion (201D) for receiving compressed waveform data, the start-point cancel signal, the compressing method request signal from said terminal-side apparatus, and task information transmitted from a contents side to produce received waveform data, a received start-point cancel signal, a received compressing method request signal, and a received task information, and for producing a waveform data re-transmission request signal when the reception of the compressed waveform data fails;
a waveform decompression portion (202B) for decompressing the received waveform data to produce decompressed waveform data;
recognizing means (203A, 204C, 205A) for performing recognition processing by using the decompressed waveform data to produce a recognition result and for stopping the recognition processing in response to the received start-point cancel signal;
a waveform data re-transmission request signal transmission portion (206) for transmitting, to said server-side apparatus, the waveform data re-transmission request signal outputted from said waveform and signal reception portion;
a task information storing portion (213) for storing the received task information to produce stored task information;
a compressing method and task information corresponding table storing portion (212A) for storing the task information and one or more compressing methods available to the use of a task;
a compressing method obtaining portion (211A) for obtaining, in response to the received compressing method request signal, available compressing method information from the stored task information and the corresponding table between the task information and the compressing method transmitted from said compressing method and task information corresponding table storing portion to transmit the compressing method information to said terminal-side apparatus;
a compressing method index obtaining portion (208) for obtaining an index of the compressing method from the decompressed waveform data to produce an obtained compressing method index;
a recognition engine selection portion (209) for selecting a recognition engine from the obtained compressing method index to produce a selected engine; and
a recognition engine setting portion (210) for setting the selected engine to said recognizing means from stored engines. - View Dependent Claims (8)
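The task-dependent lookup in claim 7 can be sketched as a table from task information to usable compressing methods. The task names, method names, and the first-match selection rule below are hypothetical; the claim only requires that the server derive the available methods from the stored task information and the corresponding table and send them to the terminal.

```python
# Hypothetical corresponding table (212A): task -> compressing methods usable for it.
TASK_TO_METHODS = {
    "digit_entry": ["low_bitrate", "lossless"],
    "dictation": ["lossless"],
}

def available_methods(stored_task, table=TASK_TO_METHODS):
    """Server side (211A): methods available for the stored task, in table order."""
    return list(table.get(stored_task, []))

def select_method(offered, terminal_supported):
    """Terminal side (110): first offered method the terminal supports, or None."""
    for method in offered:
        if method in terminal_supported:
            return method
    return None
```

For example, a terminal that supports only `lossless` would still get a usable method for the `digit_entry` task, while an unknown task yields an empty offer.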
-
-
9. A speech recognition apparatus comprising a terminal-side apparatus (100F) and a server-side apparatus (200F), wherein said terminal-side apparatus (100F) comprises:
-
a waveform, signal, compressing method, and task information reception portion (104B) for receiving inputted waveform data, task information transmitted from the contents side, a waveform data re-transmission request signal transmitted from said server-side apparatus, and compressing method information available to said server-side apparatus transmitted from said server-side apparatus to produce received waveform data, received task information, received waveform data re-transmission request signal, and received compressing method information;
a task information storing portion (113) for storing the received task information to produce stored task information;
a compressing method and task information corresponding table storing portion (111A) for storing a corresponding table between the task information and one or more compressing methods available to the use of a task;
a compressing method selection portion (110A) for selecting, in response to the received compressing method information, an optimum compressing method based on the stored task information and the corresponding table between the task information and the compressing method transmitted from said compressing method and task information corresponding table storing portion to produce a selected compressing method;
a compressing method index forming portion (109) for forming an index of the selected compressing method to produce a formed compressing method index;
a speech detection portion (101A) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval;
a waveform compressing portion (102B) for compressing the waveform data at the detected speech interval to produce compressed waveform data with the formed compressing method index contained in a part of the compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data as the stored waveform data to produce the stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal;
a waveform transmission portion (103) for transmitting the stored waveform data to said server-side apparatus; and
a compressing method request signal transmission portion (112) for transmitting a compressing method request signal to said server-side apparatus, wherein said server-side apparatus (200F) comprises;
a waveform and signal reception portion (201C) for receiving compressed waveform data transmitted from said terminal-side apparatus and the compressing method request signal to produce received waveform data and a received compressing method request signal and for producing a waveform data re-transmission request signal when the reception of the compressed waveform data fails;
a waveform decompression portion (202B) for decompressing the received waveform data to produce decompressed waveform data;
recognizing means (203A, 204C, 205A) for performing recognition processing by using the decompressed waveform data to produce a recognition result;
a waveform data re-transmission request signal transmission portion (206) for transmitting, to said server-side apparatus, the waveform data re-transmission request signal outputted from said waveform and signal reception portion;
a compressing method storing portion (212) for storing information on the compressing methods available to said server-side apparatus;
a compressing method obtaining portion (211) for obtaining, in response to the received compressing method request signal, the compressing method information stored in said compressing method storing portion to transmit the compressing method information to said terminal-side apparatus;
a compressing method index obtaining portion (208) for obtaining an index of the compressing method from the decompressed waveform data to produce an obtained compressing method index;
a recognition engine selection portion (209) for selecting a recognition engine from the obtained compressing method index to produce a selected engine; and
a recognition engine setting portion (210) for setting the selected engine to said recognizing means from stored engines. - View Dependent Claims (10)
-
-
11. A terminal (100) connected to a server apparatus (200) which receives and decompresses compressed waveform data transmitted therefrom, performs recognition processing by using the decompressed waveform data, and produces a recognition result, said terminal and said server apparatus (200) constituting a server-client speech recognition apparatus, said terminal comprising:
-
a speech detection portion (101) for detecting a speech interval of inputted speech data to produce waveform data of the detected speech interval;
a waveform compression portion (102) for compressing the waveform data at the detected speech interval to produce compressed waveform data; and
a waveform transmission portion (103) for transmitting the compressed waveform data to said server apparatus.
-
-
12. A terminal (100A, 100B, 100C, 100D, 100F) connected to a server apparatus (200A) which receives and decompresses compressed waveform data transmitted therefrom, performs recognition processing by using the decompressed waveform data, and produces a recognition result, said terminal (100A, 100B, 100C, 100D, 100F) and said server apparatus constituting a server-client speech recognition apparatus, said terminal comprising:
-
a waveform and signal reception portion (104, 104A, 104B) for receiving waveform data of an inputted speech and a waveform data re-transmission request signal transmitted from said server apparatus to produce received waveform data and a received waveform data re-transmission request signal;
a speech detection portion (101, 101A) for detecting a speech interval of the received waveform data to produce waveform data at the detected speech interval;
a waveform compressing portion (102, 102A, 102B) for compressing the waveform data at the detected speech interval to produce compressed waveform data;
a waveform storing portion (105) for temporarily storing the compressed waveform data to produce stored waveform data and for producing the stored waveform data in response to the received waveform data re-transmission request signal; and
a waveform transmission portion (103) for transmitting the stored waveform data to the server apparatus. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
-
-
24. A server apparatus (200, 200A, 200B, 200C, 200D, 200E) connected to a terminal (100, 100A, 100B, 100C, 100D, 100E) which detects a speech interval of inputted data, compresses waveform data at the detected speech interval, and transmits the compressed waveform data, said server apparatus and said terminal constituting a server-client speech recognition apparatus, said server apparatus comprising:
-
a reception portion (201, 201A, 201B, 201C, 201D) for receiving the waveform data transmitted from said terminal to produce the received waveform data;
a waveform decompression portion (202, 202A, 202B) for decompressing the received waveform data to produce decompressed waveform data; and
recognizing means (203, 203A, 204, 204A, 204B, 204C, 205, 205A) for performing recognition processing by using the decompressed waveform data to produce a recognition result. - View Dependent Claims (25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36)
-
-
37. A speech recognition method of a server-client system comprising a server apparatus (200) and a terminal (100, 100B), said speech recognition method comprising:
-
in said terminal (100, 100B), a step (101) of detecting a speech interval of inputted data;
a step (102) of compressing waveform data of the detected speech interval; and
a step (103) of transmitting the compressed waveform data to said server apparatus, and in said server apparatus (200, 200B), a step (201) of receiving the waveform data outputted from said terminal;
a step (202) of decompressing the received waveform data; and
a step (203, 204, 204A, 205) of performing recognition processing by using the decompressed waveform data to produce a recognition result. - View Dependent Claims (39)
-
-
38. A speech recognition method of a server-client system comprising a server apparatus (200A, 200B, 200C, 200D, 200E, 200F) and a terminal (100A, 100B, 100C, 100D, 100E, 100F), said speech recognition method comprising:
-
in said terminal (100A, 100B, 100C, 100D, 100F), a step (104, 104A, 104B) of receiving waveform data of an inputted speech;
a step (101, 101A) of detecting a speech interval of the received waveform data;
a step (102, 102B) of compressing the waveform data of the detected speech interval;
a step (103) of temporarily storing the compressed waveform data into a waveform storing portion (105) to transmit the compressed waveform data to said server apparatus;
a step (104, 104A, 104B, 103) of transmitting, to said server apparatus, the waveform data stored in said waveform storing portion (105) on reception of a waveform data re-transmission request signal transmitted from said server apparatus, and in said server apparatus (200A, 200B, 200C, 200D, 200E), a step (201A, 201B, 201C, 201D) of receiving the waveform data outputted from said terminal;
a step (202, 202B) of decompressing the received waveform data;
a step (203, 203A, 204, 204A, 204C, 205, 205A) of performing recognition processing by using the decompressed waveform data to produce a recognition result; and
a step (206) of transmitting the waveform data re-transmission request signal to said terminal when the reception of the compressed waveform data transmitted from said terminal fails. - View Dependent Claims (40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54)
-
Specification