Speech classification of audio for wake on voice
First Claim
1. A speech detection system comprising:
a memory to store received audio input; and
a processor coupled to the memory, the processor to:
generate, via acoustic scoring of an acoustic model based on the received audio input, a plurality of probability scores each for a corresponding audio unit;
update a speech pattern model based on at least some of the probability scores to generate a score for each state of the speech pattern model, wherein the speech pattern model comprises a first non-speech state comprising a plurality of self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, wherein the speech states comprise a first speech state following and connected to the first non-speech state by a plurality of first transitions each corresponding to a speech probability score of the probability scores and a second speech state following the first speech state and preceding the second non-speech state;
determine whether the received audio input comprises speech based on a comparison of a first score of the first non-speech state and a second score of the second speech state; and
provide a speech detection indicator when the received audio input comprises speech.
Abstract
Speech or non-speech detection techniques are discussed and include updating a speech pattern model using probability scores from an acoustic model to generate a score for each state of the speech pattern model, such that the speech pattern model includes a first non-speech state having multiple self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, and detecting speech based on a comparison of a score of the first non-speech state and a score of the last speech state of the multiple speech states.
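The state chain described in the abstract can be illustrated with a minimal sketch. The abstract does not fix an update rule, so the following are assumptions made only for illustration: scores are log-probabilities, each group of self loops or transitions contributes its maximum score, the last speech state may also persist across frames, and detection uses a fixed threshold on the score difference. The names `init_scores`, `update_scores`, and `is_speech` are hypothetical.

```python
import math
from typing import List, Sequence

NEG_INF = -math.inf


def init_scores(num_speech_states: int) -> List[float]:
    """Scores for [first non-speech, speech 1..k, second non-speech]."""
    if num_speech_states < 2:
        raise ValueError("the model needs at least two speech states")
    # Assumption: only the first non-speech state is reachable at the start.
    return [0.0] + [NEG_INF] * (num_speech_states + 1)


def update_scores(scores: Sequence[float],
                  nonspeech_logp: Sequence[float],
                  speech_logp: Sequence[float]) -> List[float]:
    """One per-frame update of every state score (assumed max-sum rule)."""
    best_ns = max(nonspeech_logp)  # best of the non-speech self loops
    best_sp = max(speech_logp)     # best of the speech transitions
    last_sp = len(scores) - 2      # index of the last speech state
    new = [NEG_INF] * len(scores)
    # First non-speech state: fed only by its own self loops.
    new[0] = scores[0] + best_ns
    # Earlier speech states: each is entered from the state before it.
    for i in range(1, last_sp):
        new[i] = scores[i - 1] + best_sp
    # Last speech state: entered from its predecessor or kept (assumption).
    new[last_sp] = max(scores[last_sp - 1], scores[last_sp]) + best_sp
    # Second non-speech state: entered from the last speech state or kept.
    new[-1] = max(scores[last_sp], scores[-1]) + best_ns
    return new


def is_speech(scores: Sequence[float], threshold: float) -> bool:
    """Compare the last speech state against the first non-speech state."""
    return scores[-2] - scores[0] > threshold


if __name__ == "__main__":
    scores = init_scores(2)
    for _ in range(2):  # two frames that look like speech
        scores = update_scores(scores, [-5.0, -6.0], [-1.0, -2.0])
    print(scores, is_speech(scores, threshold=3.0))
```

Under these assumptions the number of speech states acts as a minimum speech duration: the last speech state cannot receive a finite score until at least that many frames have been processed.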
25 Claims
1. A speech detection system comprising:
a memory to store received audio input; and
a processor coupled to the memory, the processor to:
generate, via acoustic scoring of an acoustic model based on the received audio input, a plurality of probability scores each for a corresponding audio unit;
update a speech pattern model based on at least some of the probability scores to generate a score for each state of the speech pattern model, wherein the speech pattern model comprises a first non-speech state comprising a plurality of self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, wherein the speech states comprise a first speech state following and connected to the first non-speech state by a plurality of first transitions each corresponding to a speech probability score of the probability scores and a second speech state following the first speech state and preceding the second non-speech state;
determine whether the received audio input comprises speech based on a comparison of a first score of the first non-speech state and a second score of the second speech state; and
provide a speech detection indicator when the received audio input comprises speech.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer-implemented method for speech detection comprising:
generating, via acoustic scoring of an acoustic model based on received audio input, a plurality of probability scores each for a corresponding audio unit;
updating a speech pattern model based on at least some of the probability scores to generate a score for each state of the speech pattern model, wherein the speech pattern model comprises a first non-speech state comprising a plurality of self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, wherein the speech states comprise a first speech state following and connected to the first non-speech state by a plurality of first transitions each corresponding to a speech probability score of the probability scores and a second speech state following the first speech state and preceding the second non-speech state;
determining whether the received audio input comprises speech based on a comparison of a first score of the first non-speech state and a second score of the second speech state; and
providing a speech detection indicator when the received audio input comprises speech.
Dependent claims: 14, 15, 16, 17, 18, 19
20. At least one non-transitory machine readable medium comprising a plurality of instructions that, in response to being executed on a device, cause the device to perform speech detection by:
generating, via acoustic scoring of an acoustic model based on received audio input, a plurality of probability scores each for a corresponding audio unit;
updating a speech pattern model based on at least some of the probability scores to generate a score for each state of the speech pattern model, wherein the speech pattern model comprises a first non-speech state comprising a plurality of self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, wherein the speech states comprise a first speech state following and connected to the first non-speech state by a plurality of first transitions each corresponding to a speech probability score of the probability scores and a second speech state following the first speech state and preceding the second non-speech state;
determining whether the received audio input comprises speech based on a comparison of a first score of the first non-speech state and a second score of the second speech state; and
providing a speech detection indicator when the received audio input comprises speech.
Dependent claims: 21, 22, 23, 24, 25
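All three independent claims begin by producing one probability score per audio unit and then treating some of those scores as non-speech scores and others as speech scores. A rough sketch of that partition follows. It assumes the acoustic model output for one frame is available as a mapping from audio-unit label to probability; the unit labels, the `split_scores` helper, and the choice to work in the log domain are hypothetical and not taken from the claims.

```python
import math
from typing import Dict, List, Set, Tuple


def split_scores(unit_probs: Dict[str, float],
                 nonspeech_units: Set[str]) -> Tuple[List[float], List[float]]:
    """Split one frame of per-unit probabilities into two log-score lists.

    Returns (non-speech log scores, speech log scores). Units listed in
    `nonspeech_units` (for example silence or noise) feed the non-speech
    self loops; every other unit feeds the speech transitions.
    """
    nonspeech: List[float] = []
    speech: List[float] = []
    for unit, prob in unit_probs.items():
        # A zero probability maps to negative infinity in the log domain.
        logp = math.log(prob) if prob > 0.0 else -math.inf
        if unit in nonspeech_units:
            nonspeech.append(logp)
        else:
            speech.append(logp)
    return nonspeech, speech


if __name__ == "__main__":
    # Hypothetical frame: two non-speech units and two speech units.
    frame = {"sil": 0.5, "noise": 0.25, "ah": 0.125, "t": 0.125}
    ns, sp = split_scores(frame, {"sil", "noise"})
    print(ns, sp)
```

The two lists returned for each frame are the inputs a per-frame update of the speech pattern model would consume.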
Specification