Multi-sensory speech detection system
Abstract
The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.
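As a rough illustration of the detector described above, the sketch below uses the variance criterion recited in claim 1: the variance of a sensor characteristic is compared against a baseline level, and the resulting probability scales the microphone signal. The function names, the frame representation (lists of floats), and the mapping from variance ratio to probability are assumptions made for this example; the patent text shown here does not specify them.

```python
from statistics import pvariance


def speech_probability(sensor_frame, baseline_variance):
    """Hypothetical mapping from relative variance to a value in [0, 1)."""
    variance = pvariance(sensor_frame)
    # Ratio of the current variance to the baseline level, assumed here
    # to be the variance observed while the user is not speaking.
    ratio = variance / max(baseline_variance, 1e-12)
    # At or below the baseline the result is 0; it approaches 1 as the
    # variance grows well beyond the baseline.
    return 1.0 - 1.0 / max(ratio, 1.0)


def combined_signal(mic_frame, probability):
    """Multiply the speech detection signal by the microphone signal."""
    return [probability * sample for sample in mic_frame]


# Illustrative data only: a low-variance frame and a higher-variance frame.
quiet = [0.1, -0.1, 0.1, -0.1]
talking = [0.2, -0.2, 0.2, -0.2]

baseline = pvariance(quiet)
p = speech_probability(talking, baseline)
gated = combined_signal([1.0, -2.0], p)
```

Using the not-speaking level as the baseline is one of the two options the claim allows; a speaking-level baseline would invert the comparison.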
13 Claims
1. A speech recognition system, comprising:
an audio microphone outputting a microphone signal based on a sensed audio input;
a speech sensor outputting a sensor signal based on a non-audio input generated by speech action;
a speech detector component outputting a speech detection signal indicative of a probability that a user is speaking based on the microphone signal and based on a level of variance in a first characteristic of the sensor signal and based on the microphone signal, wherein the first characteristic of the sensor signal has a first level of variance when the user is speaking and a second level of variance when the user is not speaking and wherein the speech detector component outputs the speech detection signal based on the level of variance of the first characteristic of the sensor signal relative to a baseline level of variance of the first characteristic that comprises a level of a predetermined one of the first and second levels of the characteristic over a given time period, the speech detection component further calculating a combined signal by multiplying the speech detection signal by the microphone signal; and
a speech recognizer recognizing speech to provide a recognition output indicative of speech in the microphone signal based on the combined signal, wherein recognizing speech comprises:
increasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the user is speaking; and
decreasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the speaker is not speaking.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A speech recognition system, comprising:
a speech detection system comprising:
an audio microphone outputting a microphone signal based on a sensed audio input;
a speech sensor outputting a sensor signal based on a non-audio input generated by speech action; and
a speech detector component outputting a speech detection signal indicative of a probability that a user is speaking based on the microphone signal and the sensor signal, wherein the speech detector component calculates a combined signal by multiplying the speech detection signal by the microphone signal; and
a speech recognition engine recognizing speech to provide a recognition output indicative of speech in the sensed audio input based on the combined signal;
increasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the user is speaking; and
decreasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the speaker is not speaking.
Dependent claims: 9
10. A method of recognizing speech, comprising:
generating a first signal, indicative of an audio input, with an audio microphone;
generating a second signal indicative of facial movement of a user, sensed by a facial movement sensor;
generating a third signal indicative of a probability that the user is speaking based on the first and second signals;
generating a fourth signal by multiplying the probability that the user is speaking by the first signal; and
recognizing speech based on the fourth signal and the speech detection signal, wherein recognizing speech comprises:
increasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the user is speaking; and
decreasing a likelihood that speech is recognized by an amount based on a probability that the speech detection signal indicates that the speaker is not speaking.
Dependent claims: 11, 12, 13
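The two recognition steps that the independent claims share (raising the likelihood of recognized speech according to the probability that the user is speaking, and lowering it according to the probability that the user is not) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name, the `weight` parameter, the additive form of the adjustment, and the clamping to [0, 1] are all assumptions introduced for the example.

```python
def adjust_speech_likelihood(likelihood, p_speaking, weight=0.5):
    """Hypothetical additive adjustment of a recognizer's speech likelihood."""
    p_not_speaking = 1.0 - p_speaking
    # Increase by an amount based on the probability the user is speaking.
    adjusted = likelihood + weight * p_speaking
    # Decrease by an amount based on the probability the user is not speaking.
    adjusted -= weight * p_not_speaking
    # Keep the result a valid likelihood (assumed range for this sketch).
    return min(max(adjusted, 0.0), 1.0)


# Illustrative calls: a neutral prior likelihood with three detector outputs.
confident_speech = adjust_speech_likelihood(0.5, 1.0)
confident_silence = adjust_speech_likelihood(0.5, 0.0)
undecided = adjust_speech_likelihood(0.5, 0.5)
```

With this form, an undecided detector (probability 0.5) leaves the likelihood unchanged, because the increase and the decrease cancel.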
Specification