Configurable speech recognition system using multiple recognizers
First Claim
1. A method of performing speech recognition in a distributed speech recognition system comprising an electronic device including an embedded speech recognizer and a network device including a remote speech recognizer remote from the electronic device, the method comprising:
receiving, by the electronic device, input audio comprising speech;
transmitting at least a portion of the input audio to the network device for processing by the remote speech recognizer to produce a remote speech recognition result;
processing, by the embedded speech recognizer, at least a portion of the input audio to produce a local speech recognition result;
identifying, based on a command word included in the local speech recognition result, a mobile phone application;
performing a partial action on the electronic device, based, at least in part, on the local speech recognition result, wherein performing a partial action comprises starting the identified mobile phone application on the electronic device, wherein performing the partial action is initiated prior to receiving the remote speech recognition result from the network device;
receiving, from the network device, the remote speech recognition result; and
performing a full action on the electronic device that completes the partial action based, at least in part, and responsive to receiving the remote speech recognition result from the network device.
Abstract
Techniques for combining the results of multiple recognizers in a distributed speech recognition architecture. Speech data input to a client device is encoded and processed both locally and remotely by different recognizers configured to be proficient at different speech recognition tasks. The client/server architecture is configurable to enable network providers to specify a policy directed to a trade-off between reducing recognition latency perceived by a user and usage of network resources. The results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture.
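The combination logic and the provider policy mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Result` and `Policy` types, the single `local_confidence_threshold` knob, and the rule of preferring whichever result has higher confidence. A higher threshold makes the device rely on the remote result more often (more latency and network use); a lower one lets the local result stand more often.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    text: str
    confidence: float  # hypothetical score in [0.0, 1.0]


@dataclass
class Policy:
    # Hypothetical provider-configurable knob: how confident the embedded
    # recognizer must be before its result is accepted without the server.
    local_confidence_threshold: float = 0.8


def accept_local(local: Result, policy: Policy) -> bool:
    """True if the local result is good enough to use without waiting."""
    return local.confidence >= policy.local_confidence_threshold


def combine(local: Result, remote: Optional[Result], policy: Policy) -> Result:
    """Pick one result; `remote` is None if the server never answered."""
    if remote is None or accept_local(local, policy):
        return local
    # Otherwise prefer whichever recognizer reports higher confidence.
    return remote if remote.confidence >= local.confidence else local
```

With a threshold of 0.8, a local result scored 0.9 is used as-is, while a local result scored 0.5 is replaced by a remote result scored 0.95.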
17 Claims
1. A method of performing speech recognition in a distributed speech recognition system comprising an electronic device including an embedded speech recognizer and a network device including a remote speech recognizer remote from the electronic device, the method comprising:
receiving, by the electronic device, input audio comprising speech;
transmitting at least a portion of the input audio to the network device for processing by the remote speech recognizer to produce a remote speech recognition result;
processing, by the embedded speech recognizer, at least a portion of the input audio to produce a local speech recognition result;
identifying, based on a command word included in the local speech recognition result, a mobile phone application;
performing a partial action on the electronic device, based, at least in part, on the local speech recognition result, wherein performing a partial action comprises starting the identified mobile phone application on the electronic device, wherein performing the partial action is initiated prior to receiving the remote speech recognition result from the network device;
receiving, from the network device, the remote speech recognition result; and
performing a full action on the electronic device that completes the partial action based, at least in part, and responsive to receiving the remote speech recognition result from the network device.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
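The ordering of steps in claim 1 can be sketched with Python's `asyncio`. This is an illustrative sketch under stated assumptions, not the patented implementation: audio is represented as a plain string, both recognizers are simulated stand-ins, and the `COMMAND_WORDS` table, function names, and log format are hypothetical.

```python
import asyncio

# Hypothetical mapping from command words to mobile phone applications.
COMMAND_WORDS = {"text": "messaging", "call": "dialer", "search": "browser"}


async def embedded_recognize(audio: str) -> str:
    """Stand-in for the embedded recognizer: fast, returns only the first word."""
    await asyncio.sleep(0.01)
    words = audio.split()
    return words[0] if words else ""


async def remote_recognize(audio: str) -> str:
    """Stand-in for the network recognizer: slower, returns the full transcript."""
    await asyncio.sleep(0.05)
    return audio


async def handle_utterance(audio: str, log: list) -> list:
    # Transmit the audio for remote processing without waiting for the result.
    remote_task = asyncio.create_task(remote_recognize(audio))

    # Produce the local result and identify an application from the command word.
    local_result = await embedded_recognize(audio)
    app = COMMAND_WORDS.get(local_result)

    # Partial action: start the application before the remote result is awaited.
    if app is not None:
        log.append(("partial", f"start {app}"))

    # Receive the remote result, then complete the action with it.
    remote_result = await remote_task
    if app is not None:
        log.append(("full", f"{app}: {remote_result}"))
    return log
```

Calling `asyncio.run(handle_utterance("text hello world", []))` records the partial action (starting the messaging application) before the full action that uses the remote transcript.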
12. A non-transitory computer-readable storage medium encoded with a plurality of instructions that, when executed by at least one processor on an electronic device in a distributed speech recognition system comprising the electronic device having an embedded speech recognizer and a network device having a remote speech recognizer remote from the electronic device, perform a method comprising:
receiving, by the electronic device, input audio comprising speech;
transmitting at least a portion of the input audio to the network device for processing by the remote speech recognizer to produce a remote speech recognition result;
processing, by the embedded speech recognizer, at least a portion of the input audio to produce a local speech recognition result;
identifying, based on a command word included in the local speech recognition result, a mobile phone application;
performing a partial action on the electronic device, based, at least in part, on the local speech recognition result, wherein performing a partial action comprises starting the identified mobile phone application on the electronic device, wherein performing the partial action is initiated prior to receiving the remote speech recognition result from the network device;
receiving, from the network device, the remote speech recognition result; and
performing a full action on the electronic device that completes the partial action based, at least in part, and responsive to receiving the remote speech recognition result from the network device.
Dependent claims: 13, 14
15. An electronic device for use in a distributed speech recognition system comprising the electronic device and a network device remote from the electronic device, the electronic device, comprising:
at least one storage device configured to store one or more applications;
an embedded speech recognizer configured to:
receive input audio comprising speech;
transmit at least a portion of the input audio to the network device for processing by the remote speech recognizer to produce a remote speech recognition result;
process at least a portion of the input audio to produce a local speech recognition result;
identify, based on a command word included in the local speech recognition result, a partial action to be performed;
perform, in response to producing the local speech recognition result, the partial action on the electronic device, based, at least in part, on the local speech recognition result;
provide an indication of the performed partial action on the electronic device to enable a user of the electronic device to observe the results of the partial action prior to receiving the remote speech recognition result, wherein performing the partial action and providing the indication of the performed partial action are initiated prior to receiving the remote speech recognition result from the network device; and
perform a full action on the electronic device that completes the partial action based, at least in part, and responsive to receiving the remote speech recognition result from the network device.
Dependent claims: 16, 17
Specification