Multi-microphone speech recognition systems and related techniques
Abstract
A speech recognition system for resolving impaired utterances can have a speech recognition engine configured to receive a plurality of representations of an utterance and concurrently to determine a plurality of highest-likelihood transcription candidates corresponding to each respective representation of the utterance. The recognition system can also have a selector configured to determine a most-likely accurate transcription from among the transcription candidates. As but one example, the plurality of representations of the utterance can be acquired by a microphone array, and beamforming techniques can generate independent streams of the utterance across various look directions using output from the microphone array.
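The selector step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the selector scores candidates, so the `Candidate` type, its `log_likelihood` field, and the function `select_transcription` are hypothetical names, and picking the highest score is an assumed rule.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """One transcription candidate for one representation of the utterance."""
    stream_id: int         # hypothetical index of a beamformed stream (look direction)
    text: str              # transcription proposed by the recognition engine
    log_likelihood: float  # assumed recognizer score; higher means more likely


def select_transcription(candidates: Iterable[Candidate]) -> Candidate:
    """Return the candidate with the highest score (assumed selection rule)."""
    pool = list(candidates)
    if not pool:
        raise ValueError("no transcription candidates to choose from")
    return max(pool, key=lambda c: c.log_likelihood)


# Usage: three streams of the same utterance, each with its best candidate.
candidates = [
    Candidate(stream_id=0, text="turn on the light", log_likelihood=-12.5),
    Candidate(stream_id=1, text="turn on the lights", log_likelihood=-8.0),
    Candidate(stream_id=2, text="burn on the lights", log_likelihood=-20.1),
]
best = select_transcription(candidates)
```

A real selector could weigh other evidence, such as agreement between streams, which the abstract leaves open.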
53 Citations
29 Claims
1. An audio appliance comprising a processor and a memory, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
receive a plurality of audio signals, wherein each audio signal of the plurality of audio signals is from a respective one of a corresponding plurality of audio devices, wherein each received audio signal corresponds to sound observed by the respective audio device;
process the received audio signals to extract acoustic features;
select one of the audio devices based on the extracted acoustic features;
transmit the received audio signals to a computing component; and
from the computing component, receive recognized speech corresponding to the audio signals.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method to coordinate operations between or among a plurality of networked computing devices, the method comprising:
receiving from a plurality of audio devices a corresponding plurality of audio signals, wherein each audio signal corresponds to sound observed by the respective audio device;
processing the plurality of audio signals to resolve which audio device is intended to perform an operation;
transmitting a representation of the received audio signals to a speech-recognition module;
receiving recognized speech directed to the resolved audio device from the speech-recognition module, wherein the recognized speech corresponds to the representation of the received audio signals.
Dependent claims: 10, 11, 12, 13, 14
15. An audio appliance comprising a processor, a memory, and a communication connection, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
receive a representation of an observed sound from each of a plurality of audio devices;
extract acoustic features from one or more of the representations of the observed sound;
based on the extracted acoustic features, transmit over the communication connection each representation of the observed sound to a processing component;
over the communication connection, receive recognized speech from the processing component, wherein the recognized speech corresponds to the observed sound.
Dependent claims: 16, 17, 18, 19, 20
21. An audio appliance comprising a processor and a memory, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
receive a plurality of audio signals, wherein each audio signal in the plurality of audio signals is from a respective one of a corresponding plurality of audio devices, wherein each received audio signal corresponds to sound observed by the respective audio device;
process the received audio signals to extract acoustic features;
based on the extracted acoustic features, select one of the audio devices to receive recognized speech determined by a processing component, wherein the recognized speech corresponds with the plurality of audio signals.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29
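The flow recited in independent claims 1 and 21 (receive signals, extract acoustic features, select a device, obtain recognized speech) can be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: the claims do not name the acoustic features, so RMS energy is an assumed stand-in, and `rms_energy`, `extract_features`, `select_device`, `handle_utterance`, and the `recognize` callable (standing in for the computing or processing component) are all hypothetical names.

```python
import math
from typing import Callable, Mapping, Sequence

Signals = Mapping[str, Sequence[float]]  # device name -> audio samples


def rms_energy(samples: Sequence[float]) -> float:
    """Root-mean-square level of one signal (assumed acoustic feature)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def extract_features(signals: Signals) -> dict[str, float]:
    """Compute one feature value per audio device."""
    return {device: rms_energy(samples) for device, samples in signals.items()}


def select_device(features: Mapping[str, float]) -> str:
    """Pick the device with the strongest feature value (assumed rule)."""
    if not features:
        raise ValueError("no audio devices to select from")
    return max(features, key=lambda device: features[device])


def handle_utterance(
    signals: Signals, recognize: Callable[[Signals], str]
) -> tuple[str, str]:
    """Select a device, then pass all signals to the recognizer stand-in."""
    device = select_device(extract_features(signals))
    text = recognize(signals)  # stands in for transmit + receive over a link
    return device, text


# Usage with a stub recognizer in place of a real speech-recognition service.
signals = {
    "kitchen": [0.1, -0.1, 0.1],
    "hallway": [0.5, -0.6, 0.4],
}
device, text = handle_utterance(signals, lambda s: "what time is it")
```

Passing the recognizer in as a callable keeps the selection logic testable without a network connection; the claims themselves only require that the signals reach a computing component and that recognized speech comes back.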
Specification