Speaker dependent voiced sound pattern detection thresholds
Abstract
Various implementations disclosed herein include a training module configured to determine a set of detection normalization threshold values associated with speaker dependent voiced sound pattern (VSP) detection. In some implementations, a method includes obtaining segment templates characterizing a concurrent segmentation of a first subset of a plurality of vocalization instances of a VSP, wherein each segment template provides a stochastic characterization of how a particular portion of the VSP is vocalized by a particular speaker; generating a noisy segment matrix using a second subset of the plurality of vocalization instances of the VSP, wherein the noisy segment matrix includes one or more noisy copies of segment representations of the second subset; scoring segments from the noisy segment matrix against the segment templates; and determining detection normalization threshold values at two or more known SNR levels for at least one particular noise type based on a function of the scoring.
16 Claims
1. A method of determining a set of detection normalization threshold values associated with speaker dependent voiced sound pattern (VSP) detection, the method comprising:
converting, at one or more audio sensors, an audible signal into electronic audible signal data;
obtaining, from the electronic audible signal data, a common set of segment templates characterizing a concurrent segmentation of a first subset of a plurality of vocalization instances of a VSP, wherein each segment template provides a stochastic characterization of how a particular portion of the VSP is vocalized by a particular speaker, wherein at least a subset of the first subset of the plurality of vocalization instances are divided into the same number of segments as one another;
synthesizing a noisy segment matrix using a second subset of the plurality of vocalization instances of the VSP, wherein the noisy segment matrix includes one or more noisy copies of segment representations of the second subset of the plurality of vocalization instances of the VSP;
scoring segments from the noisy segment matrix against the common set of segment templates, wherein utilizing the common set of segment templates for scoring the segments reduces resource utilization associated with scoring the segments;
synthesizing detection normalization threshold values at two or more known SNR levels for at least one particular noise type based on a function of the scoring; and
outputting the detection normalization threshold values to a non-transitory memory through an output device.
Dependent claims: 2-15.
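The pipeline recited in claim 1 can be illustrated with a toy sketch. All function names, the per-dimension Gaussian template model, the white-noise model, and the 5th-percentile threshold rule are assumptions for illustration; the claim does not specify any of them.

```python
import math
import random

def score_segment(segment, template):
    # Log-likelihood of a feature vector under a per-dimension Gaussian
    # template: a hypothetical stand-in for the claimed stochastic
    # characterization of how one portion of the VSP is vocalized.
    return sum(
        -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))
        for x, (mu, sd) in zip(segment, template)
    )

def noisy_copy(segment, snr_db, rng):
    # Additive white noise scaled to a target SNR (one possible noise type).
    power = sum(x * x for x in segment) / len(segment)
    sd = math.sqrt(power / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, sd) for x in segment]

def detection_thresholds(templates, segments, snr_levels, n_copies=20, seed=0):
    # Score noisy copies of held-out segment representations against the
    # common templates and take, per SNR level, the 5th-percentile score
    # as the detection normalization threshold (an assumed rule).
    rng = random.Random(seed)
    thresholds = {}
    for snr in snr_levels:
        scores = sorted(
            score_segment(noisy_copy(seg, snr, rng), tpl)
            for seg, tpl in zip(segments, templates)
            for _ in range(n_copies)
        )
        thresholds[snr] = scores[int(0.05 * (len(scores) - 1))]
    return thresholds
```

Here the "noisy segment matrix" is simply the set of noisy copies, and the threshold at each SNR is the score below which only 5% of true-speaker matches fall; a real implementation would use a richer acoustic model and multiple noise types.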
16. A system provided to determine a set of detection normalization threshold values associated with speaker dependent voiced sound pattern (VSP) detection, the system comprising:
one or more audio sensors configured to convert an audible signal into electronic audible signal data;
a processor; and
a non-transitory memory including instructions which, when executed by the processor, cause the system to:
synthesize, based on the electronic audible signal data, match probabilities as a function of one or more statistical similarity characterizations between noisy copies of segment representations and common segment templates, wherein the common segment templates characterize a concurrent segmentation of a first subset of a plurality of vocalization instances of a VSP, wherein at least a subset of the first subset of the plurality of vocalization instances are divided into the same number of segments as one another, and each of the segment representations is associated with a second subset of the plurality of vocalization instances of the VSP, wherein utilizing the common segment templates for synthesizing the match probabilities reduces resource utilization associated with synthesizing the match probabilities;
synthesize unbiased scores from raw score match probabilities at a number of signal-to-noise (SNR) levels of at least one particular noise type;
synthesize detection normalization threshold values at two or more known SNR levels for the at least one particular noise type based on the unbiased scores; and
output the detection normalization threshold values to the non-transitory memory through an output device.
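The "unbiased scores" limitation of claim 16 can be read as per-SNR normalization of raw match scores, so that scores gathered at different SNR levels become directly comparable before thresholds are synthesized. A minimal sketch, assuming z-normalization (the claim does not pin down the exact normalization):

```python
import statistics

def unbias_scores(raw_scores_by_snr):
    # Remove the per-SNR mean and scale by the per-SNR spread so that
    # raw match scores from different SNR levels share a common scale.
    # (One plausible reading of "unbiased scores"; assumed, not specified.)
    unbiased = {}
    for snr, scores in raw_scores_by_snr.items():
        mu = statistics.fmean(scores)
        sd = statistics.pstdev(scores) or 1.0  # guard against zero spread
        unbiased[snr] = [(s - mu) / sd for s in scores]
    return unbiased
```

With this normalization, a single threshold rule can be applied across SNR levels, and the per-SNR means and spreads themselves capture the SNR-dependent bias that is being removed.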
Specification