Method and apparatus for transmitting speech activity in distributed voice recognition systems
First Claim
1. A method of operating a speech recognition system employed by a wireless subscriber station, comprising:
receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station;
converting the acoustic speech signal to an electrical speech signal;
assembling detected voice activity information related to the electrical speech signal;
identifying feature extraction information related to the electrical speech signal;
selectively utilizing said detected voice activity information and said feature extraction information to form advanced front end data;
transmitting the detected voice activity information over a first wireless communication channel to a wireless base station, and
transmitting the feature extraction information over a second wireless communication channel, separate from the first wireless communication channel, to the wireless base station.
Abstract
A system and method for transmitting speech activity in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises an advanced feature extraction (AFE) module that extracts features from a speech signal, and a voice activity detection (VAD) module that detects voice activity within a speech signal. The combined results from the VAD module and feature extraction module are provided in an efficient manner to a remote device, such as a server, in the form of advanced front end features, thereby enabling the server to process speech segments free of silence regions. Various aspects of efficient speech segment transmission are disclosed.
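The subscriber-side processing described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the energy-threshold VAD, the two toy features (log energy and zero-crossing rate), the frame length, and every name in the block (`FrontEndFrame`, `detect_voice`, `extract_features`, `front_end`, `speech_only`) are assumptions chosen for brevity.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence

FRAME_LEN = 80  # assumed frame size: 10 ms at an assumed 8 kHz sampling rate


@dataclass
class FrontEndFrame:
    index: int             # frame number within the utterance
    voiced: bool           # VAD decision for this frame
    features: List[float]  # feature vector for this frame


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def detect_voice(frame: Sequence[float], threshold: float = 1e-3) -> bool:
    """Toy VAD (assumption): a frame counts as speech if its energy exceeds a threshold."""
    return frame_energy(frame) > threshold


def extract_features(frame: Sequence[float]) -> List[float]:
    """Toy feature vector (assumption): log energy and zero-crossing rate."""
    log_energy = math.log(frame_energy(frame) + 1e-10)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [log_energy, crossings / (len(frame) - 1)]


def front_end(samples: Sequence[float]) -> List[FrontEndFrame]:
    """Run VAD and feature extraction on each full frame and pair the results."""
    frames = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        frame = samples[start:start + FRAME_LEN]
        frames.append(
            FrontEndFrame(start // FRAME_LEN, detect_voice(frame), extract_features(frame))
        )
    return frames


def speech_only(frames: List[FrontEndFrame]) -> List[FrontEndFrame]:
    """Keep only frames flagged as speech, so silence regions are skipped."""
    return [f for f in frames if f.voiced]
```

Pairing each feature vector with its VAD flag is what would let a remote recognizer skip silence frames; a real front end would use a far more robust detector and richer features than this sketch.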
34 Claims
1. A method of operating a speech recognition system employed by a wireless subscriber station, comprising:
receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station;
converting the acoustic speech signal to an electrical speech signal;
assembling detected voice activity information related to the electrical speech signal;
identifying feature extraction information related to the electrical speech signal;
selectively utilizing said detected voice activity information and said feature extraction information to form advanced front end data;
transmitting the detected voice activity information over a first wireless communication channel to a wireless base station, and
transmitting the feature extraction information over a second wireless communication channel, separate from the first wireless communication channel, to the wireless base station.
Dependent claims: 2-11.

12. A wireless subscriber station, comprising:
a microphone for receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station, and for converting the acoustic speech signal to an electrical speech signal;
a voice activity detector for detecting voice activity information related to the electrical speech signal;
a feature extractor, operating substantially in parallel to the voice activity detector, for identifying feature extraction information related to the electrical speech signal;
a processor for selectively utilizing the detected voice activity information and the feature extraction information to form advanced front end data; and
a transmitter for transmitting the detected voice activity information over a first wireless communication channel to a wireless base station, and transmitting the feature extraction information over a second wireless communication channel, separate from the first wireless communication channel, to the wireless base station.
Dependent claims: 13-21.

22. A method of operating a distributed speech recognition system employed by a wireless subscriber station, comprising:
receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station;
converting the acoustic speech signal to electrical speech data;
extracting voice activity data from the electrical speech data;
identifying feature extraction data from the electrical speech data; and
transmitting the detected voice activity information over a first wireless communication channel to a wireless base station, and
transmitting the feature extraction information over a second wireless communication channel, separate from the first wireless communication channel, to the wireless base station.
Dependent claims: 23-32.

33. A method of operating a distributed speech recognition service, comprising:
performing, by a wireless subscriber station, a first portion of the distributed speech recognition service, comprising:
receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station;
converting the acoustic speech signal to an electrical speech signal;
assembling detected voice activity information related to the electrical speech signal;
identifying feature extraction information related to the electrical speech signal;
selectively utilizing the detected voice activity information and the feature extraction information; and
transmitting the detected voice activity information over a first wireless communication channel to a wireless base station, and transmitting the feature extraction information over a second wireless communication channel, separate from the first wireless communication channel, to the wireless base station; and
performing, by a wireless base station, a second portion of the distributed speech recognition service, comprising:
receiving the detected voice activity information over the first wireless communication channel and the feature extraction information over the second wireless communication channel;
determining a linguistic estimate of the electrical speech signal responsive to the detected voice activity information over the first wireless communication channel and the feature extraction information over the second wireless communication channel; and
transmitting information over a third wireless communication channel from the wireless base station to the wireless subscriber station responsive to the linguistic estimate of the electrical speech signal for controlling the wireless subscriber station.

34. A method of operating a speech recognition service employed by a wireless base station, comprising:
receiving from a wireless subscriber station advanced front end data, including detected voice activity information sent over a first wireless communication channel and feature extraction information sent over a second wireless communication channel, separate from the first wireless communication channel, wherein the wireless subscriber station comprises:
receiving an acoustic speech signal, including periods of speech and non-speech, from a user of the wireless subscriber station;
converting the acoustic speech signal to an electrical speech signal;
assembling the detected voice activity information related to the electrical speech signal;
identifying feature extraction information related to the electrical speech signal;
selectively utilizing the detected voice activity information and the feature extraction information to form the advanced front end data; and
determining a linguistic estimate of the electrical speech signal responsive to receiving the advanced front end data; and
transmitting information over a third wireless communication channel from the wireless base station to the wireless subscriber station responsive to the linguistic estimate of the electrical speech signal for controlling the wireless subscriber station.

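The two-channel arrangement recited in the independent claims can be illustrated with a small sketch. It assumes in-memory `queue.Queue` objects as stand-ins for the first and second wireless channels, and it assumes each message carries a frame index so the receiver can realign the two streams; neither detail comes from the claims. The function names (`transmit`, `drain`, `receive_speech_frames`) are hypothetical.

```python
from queue import Queue
from typing import List, Sequence, Tuple


def transmit(
    vad_flags: Sequence[bool],
    features: Sequence[List[float]],
    vad_channel: Queue,
    feature_channel: Queue,
) -> None:
    """Subscriber side: send VAD flags and feature vectors on separate channels."""
    for index, (flag, vector) in enumerate(zip(vad_flags, features)):
        vad_channel.put((index, flag))        # stand-in for the first channel
        feature_channel.put((index, vector))  # stand-in for the second channel


def drain(channel: Queue) -> list:
    """Read every message currently waiting on a channel."""
    items = []
    while not channel.empty():
        items.append(channel.get())
    return items


def receive_speech_frames(
    vad_channel: Queue, feature_channel: Queue
) -> List[Tuple[int, List[float]]]:
    """Base-station side: keep only feature vectors whose frame was flagged as speech."""
    voiced = {index for index, flag in drain(vad_channel) if flag}
    return [(index, vector) for index, vector in drain(feature_channel) if index in voiced]
```

The frames returned by `receive_speech_frames` are what a recognizer would consume to form its linguistic estimate; the recognition step itself is outside the scope of this sketch.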
Specification