Multi-microphone speech recognition systems and related techniques

US 10,614,812 B2
Filed: 04/19/2019
Issued: 04/07/2020
Est. Priority Date: 06/06/2015
Status: Active Grant

Abstract

A speech recognition system for resolving impaired utterances can have a speech recognition engine configured to receive a plurality of representations of an utterance and concurrently to determine a plurality of highest-likelihood transcription candidates corresponding to each respective representation of the utterance. The recognition system can also have a selector configured to determine a most-likely accurate transcription from among the transcription candidates. As but one example, the plurality of representations of the utterance can be acquired by a microphone array, and beamforming techniques can generate independent streams of the utterance across various look directions using output from the microphone array.
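
The selector described in the abstract can be illustrated with a minimal sketch. It assumes each beamformed stream has already yielded a short list of transcription candidates with log-likelihood scores. The names `Candidate` and `select_transcription`, the example scores, and the selection rule (take the single highest log-likelihood across all streams) are hypothetical illustrations, not details taken from the patent.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One transcription hypothesis for one stream (hypothetical structure)."""
    stream_id: int
    text: str
    log_likelihood: float


def select_transcription(candidates_per_stream: List[List[Candidate]]) -> Candidate:
    """Return the candidate with the highest log-likelihood across all streams.

    This is only one possible selection rule, assumed for illustration.
    """
    all_candidates = [c for stream in candidates_per_stream for c in stream]
    if not all_candidates:
        raise ValueError("no transcription candidates supplied")
    return max(all_candidates, key=lambda c: c.log_likelihood)


# Two look directions, each with its own highest-likelihood candidates.
streams = [
    [Candidate(0, "turn on the light", -12.5), Candidate(0, "turn on the night", -15.0)],
    [Candidate(1, "turn on the light", -9.8), Candidate(1, "burn on the light", -14.2)],
]
best = select_transcription(streams)
print(best.stream_id, best.text)
```

A real selector could instead combine evidence across streams, for example by voting on candidates that recur in several look directions; the single-maximum rule is used here only because it is the simplest to read.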

53 Citations

29 Claims

1. An audio appliance comprising a processor and a memory, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
- receive a plurality of audio signals, wherein each audio signal of the plurality of audio signals is from a respective one of a corresponding plurality of audio devices, wherein each received audio signal corresponds to sound observed by the respective audio device;
  
  process the received audio signals to extract acoustic features;
  
  select one of the audio devices based on the extracted acoustic features;
  
  transmit the received audio signals to a computing component; and
  
  from the computing component, receive recognized speech corresponding to the audio signals.
  - 2. The audio appliance according to claim 1, wherein the instructions, when executed by the processor, further cause the audio appliance to transmit a command associated with the recognized speech to the selected audio device.
  - 3. The audio appliance according to claim 1, wherein the instructions, when executed by the processor, further cause the audio appliance to jointly process the received audio signals with the selected audio device.
  - 4. The audio appliance according to claim 1, wherein the acoustic features correspond to an utterance.
  - 5. The audio appliance according to claim 1, wherein the audio appliance further comprises the computing component.
  - 6. The audio appliance according to claim 1, wherein the computing component is remote from the audio appliance.
  - 7. The audio appliance according to claim 1, wherein the plurality of audio devices are spatially distributed.
  - 8. The audio appliance according to claim 1, wherein the selected audio device is a first audio device, wherein the instructions, when executed, further cause the audio appliance to select a second audio device and to coordinate operations between the first audio device and the second audio device.
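
The extract-then-select steps recited in claim 1 can be sketched as follows. The claims do not say which acoustic features are extracted or how the selection is made, so this sketch assumes root-mean-square (RMS) energy as the only feature and picks the device that observed the loudest signal. The function names and device names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def rms_energy(samples: List[float]) -> float:
    """Root-mean-square energy of one audio signal (assumed feature)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_device(signals: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Extract one feature per device and select the device with the largest value."""
    if not signals:
        raise ValueError("no audio signals received")
    features = {name: rms_energy(x) for name, x in signals.items()}
    selected = max(features, key=features.get)
    return selected, features


# One short signal per spatially distributed device (made-up sample values).
signals = {
    "kitchen_speaker": [0.01, -0.02, 0.015, -0.01],
    "living_room_tv": [0.30, -0.25, 0.28, -0.31],
    "hallway_sensor": [0.05, -0.04, 0.06, -0.05],
}
selected, features = select_device(signals)
print(selected)
```

Energy alone is a crude proxy for which device the talker is addressing; a fuller system would likely combine several features, but the control flow (receive, extract, select) would stay the same.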

9. A method to coordinate operations between or among a plurality of networked computing devices, the method comprising:
- receiving from a plurality of audio devices a corresponding plurality of audio signals, wherein each audio signal corresponds to sound observed by the respective audio device;
  
  processing the plurality of audio signals to resolve which audio device is intended to perform an operation;
  
  transmitting a representation of the received audio signals to a speech-recognition module;
  
  receiving recognized speech directed to the resolved audio device from the speech-recognition module, wherein the recognized speech corresponds to the representation of the received audio signals.
  - 10. The method according to claim 9, wherein each transmitted representation of the received audio signals comprises one or more of a stream of encoded audio, a plurality of acoustic features, an utterance representation, a plurality of transcription candidates, and a measure of transcription quality or accuracy.
  - 11. The method according to claim 9, wherein the recognized speech comprises a command to invoke the operation.
  - 12. The method according to claim 9, wherein the act of processing the plurality of audio signals comprises jointly processing the plurality of audio signals.
  - 13. The method according to claim 9, wherein the act of transmitting the representation of the received audio signals to a speech-recognition module comprises transmitting the representation over a network connection to a network-connected computing environment.
  - 14. The method according to claim 9, wherein the act of transmitting the representation of the received audio signals to a speech-recognition module comprises communicating the representation over a bus to a computing component.
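
Claim 10 lists the forms a transmitted representation may take. One way to model that "one or more of" requirement is a record with optional fields and a check that at least one is populated. The class name `SignalRepresentation`, its field names, and the field types are assumptions made for this sketch; the patent does not define a data layout.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Sequence


@dataclass
class SignalRepresentation:
    """Hypothetical container for the representation forms listed in claim 10."""
    device_id: str
    encoded_audio: Optional[bytes] = None
    acoustic_features: Optional[Sequence[float]] = None
    utterance: Optional[str] = None
    transcription_candidates: Optional[Sequence[str]] = None
    quality: Optional[float] = None

    def payload_fields(self) -> List[str]:
        """Names of the optional fields that are actually populated."""
        return [
            f.name
            for f in fields(self)
            if f.name != "device_id" and getattr(self, f.name) is not None
        ]

    def __post_init__(self) -> None:
        # "One or more of": reject a representation that carries nothing.
        if not self.payload_fields():
            raise ValueError("representation must carry at least one payload field")


rep = SignalRepresentation(
    "kitchen_speaker", acoustic_features=[0.1, 0.2], quality=0.9
)
print(rep.payload_fields())
```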

15. An audio appliance comprising a processor, a memory, and a communication connection, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
- receive a representation of an observed sound from each of a plurality of audio devices;
  
  extract acoustic features from one or more of the representations of the observed sound;
  
  based on the extracted acoustic features, transmit over the communication connection each representation of the observed sound to a processing component;
  
  over the communication connection, receive recognized speech from the processing component, wherein the recognized speech corresponds to the observed sound.
  - 16. The appliance according to claim 15, wherein the instructions, when executed by the processor, further cause the audio appliance to issue a command to a target device, wherein the command corresponds to the recognized speech and causes the target device to invoke a task.
  - 17. The appliance according to claim 15, wherein the audio appliance comprises the processing component.
  - 18. The appliance according to claim 15, wherein the processing component is remote from the audio appliance.
  - 19. The appliance according to claim 15, wherein the instructions, when executed by the processor, further cause the audio appliance to jointly process each received representation of the observed sound to extract the acoustic features.
  - 20. The appliance according to claim 15, wherein the plurality of audio devices are spatially distributed.

21. An audio appliance comprising a processor and a memory, wherein the memory stores instructions which, when executed by the processor, cause the audio appliance to:
- receive a plurality of audio signals, wherein each audio signal in the plurality of audio signals is from a respective one of a corresponding plurality of audio devices, wherein each received audio signal corresponds to sound observed by the respective audio device;
  
  process the received audio signals to extract acoustic features;
  
  based on the extracted acoustic features, select one of the audio devices to receive recognized speech determined by a processing component, wherein the recognized speech corresponds with the plurality of audio signals.
  - 22. The audio appliance according to claim 21, wherein the instructions, when executed by the processor, further cause the audio appliance to process the received audio signals to extract acoustic features corresponding to each respective audio signal.
  - 23. The audio appliance according to claim 21, wherein the instructions, when executed by the processor, further cause the audio appliance to synchronize the audio signals with each other.
  - 24. The audio appliance according to claim 21, wherein the instructions, when executed by the processor, further cause the audio appliance to jointly consider the extracted acoustic features and, based on the joint consideration, to select the audio device to receive the recognized speech.
  - 25. The audio appliance according to claim 21, wherein the acoustic features correspond to an utterance.
  - 26. The audio appliance according to claim 21, wherein the audio appliance further comprises the processing component.
  - 27. The audio appliance according to claim 21, wherein the processing component is remote from the audio appliance.
  - 28. The audio appliance according to claim 21, wherein the plurality of audio devices are spatially distributed.
  - 29. The audio appliance according to claim 21, wherein the selected audio device is a first audio device, wherein the instructions, when executed, further cause the audio appliance to select a second audio device and to coordinate operations between the first audio device and the second audio device.
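
Claim 23 recites synchronizing the audio signals with each other without saying how. A common approach, assumed here purely for illustration, is to estimate the relative delay between two signals by cross-correlation and then shift one of them. The function names `estimate_lag` and `align` are hypothetical, and the brute-force search is written for clarity rather than speed.

```python
from typing import List


def estimate_lag(reference: List[float], other: List[float], max_lag: int) -> int:
    """Return the lag (in samples) by which `other` trails `reference`.

    Searches lags in [-max_lag, max_lag] and keeps the one with the largest
    cross-correlation score. A negative result means `other` leads.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(reference):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(other: List[float], lag: int) -> List[float]:
    """Shift `other` earlier by `lag` samples, zero-padding to keep its length."""
    if lag >= 0:
        return other[lag:] + [0.0] * lag
    return [0.0] * (-lag) + other[:lag]


# The second device observes the same pulse two samples later.
reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
lag = estimate_lag(reference, delayed, max_lag=3)
aligned = align(delayed, lag)
print(lag, aligned == reference)
```

For long signals an FFT-based correlation would be far cheaper than this quadratic loop; the direct form is kept so that each step of the lag search is visible.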

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Ramprashad, Sean A.; Thornburg, Harvey D.; Krishnaswamy, Arvindh; Lindahl, Aram M.
Primary Examiner(s): Singh, Satwant K
Application Number: US16/389,697
Publication Number: US 20190251974A1
Time in Patent Office: 354 Days
Field of Search: None
CPC Class Codes:
G10L 15/20 Speech recognition techniqu...
G10L 15/32 Multiple recognisers used i...