Direction based end-pointing for speech recognition
Abstract
A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
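The isolation step described in the abstract can be illustrated with a delay-and-sum beamformer. The sketch below is not taken from the patent: the function names, the integer sample delays, the two-microphone test signal, and the rule of picking the beam with the most energy are all assumptions chosen to keep the example short.

```python
import math


def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:
                out[i] += channel[j]
    return [value / len(channels) for value in out]


def energy(signal):
    """Sum of squared samples."""
    return sum(value * value for value in signal)


def strongest_beam(channels, candidate_delays):
    """Return (name, beam) for the candidate steering direction with the most energy."""
    best_name, best_beam, best_energy = None, None, -1.0
    for name, delays in candidate_delays.items():
        beam = delay_and_sum(channels, delays)
        beam_energy = energy(beam)
        if beam_energy > best_energy:
            best_name, best_beam, best_energy = name, beam, beam_energy
    return best_name, best_beam


# Hypothetical capture: the same tone reaches mic1 two samples after mic0.
source = [math.sin(2 * math.pi * i / 8) for i in range(64)]
mic0 = source
mic1 = [0.0, 0.0] + source[:-2]

# Assumed steering table: per-microphone sample advances for each direction.
candidates = {"left": (0, 2), "front": (0, 0), "right": (2, 0)}
direction, beam = strongest_beam([mic0, mic1], candidates)
```

With these assumed delays, the "left" beam lines the two channels up so they add coherently, while the other candidates partly or fully cancel, so `direction` is `"left"`. A real array would use fractional delays derived from microphone geometry.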
20 Claims
1. A system, comprising:
at least one processor; and
at least one computer-readable medium encoded with instructions which, when executed by the at least one processor, cause the system to:
receive audio signals from a microphone array, the audio signals representing at least first speech of a first user,
determine, using the audio signals, that first audio originated from a first direction,
process the audio signals to generate a first audio signal corresponding to the first direction,
determine, based on at least one characteristic of the first audio signal, that the first audio signal represents second speech of the first user, and
based at least in part on determining that the first audio signal represents the second speech, cause speech recognition processing to be performed using the first audio signal to determine first text corresponding to at least a portion of the second speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method, comprising:
receiving audio signals from a microphone array, the audio signals representing at least first speech of a first user;
determining, using the audio signals, that first audio originated from a first direction;
processing the audio signals to generate a first audio signal corresponding to the first direction;
determining, based on at least one characteristic of the first audio signal, that the first audio signal represents second speech of the first user; and
based at least in part on determining that the first audio signal represents the second speech, causing speech recognition processing to be performed using the first audio signal to determine first text corresponding to at least a portion of the second speech.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
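The last two steps of the method, deciding that the directional signal contains speech and only then invoking recognition, can be sketched with a simple frame-energy end-pointer. The claim does not say which signal characteristic is used, so the energy threshold, frame length, hangover count, and the `recognizer` callback below are all hypothetical.

```python
def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(v * v for v in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def find_endpoints(signal, frame_len=4, threshold=0.1, hangover=2):
    """Return (start, end) sample indices of the first speech segment, or None.

    Speech starts at the first frame whose energy exceeds `threshold` and
    ends once `hangover` consecutive frames fall at or below it.
    """
    energies = frame_energies(signal, frame_len)
    start = None
    quiet = 0
    for idx, e in enumerate(energies):
        if e > threshold:
            if start is None:
                start = idx
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:
                end_frame = idx - hangover + 1
                return start * frame_len, end_frame * frame_len
    if start is None:
        return None
    # Signal ended before the hangover elapsed: trim any trailing quiet frames.
    return start * frame_len, (len(energies) - quiet) * frame_len


def recognize_if_speech(signal, recognizer):
    """Call the (hypothetical) recognizer only on the detected speech segment."""
    endpoints = find_endpoints(signal)
    if endpoints is None:
        return None
    start, end = endpoints
    return recognizer(signal[start:end])


# Hypothetical directional signal: silence, a burst of "speech", then silence.
beam_signal = [0.0] * 8 + [1.0] * 12 + [0.0] * 12
segment = find_endpoints(beam_signal)
```

Here `recognizer` stands in for whatever speech recognition component receives the audio; passing a function such as `len` is enough to confirm that only the samples between the detected endpoints are forwarded.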
Specification