System and method for detecting a recorded voice
Abstract
The present invention is a reliable system and method for detecting a recorded voice, which can be employed independently or to provide protection from fraudulent use of a recording to defeat an automatic speaker recognition system. Several techniques and systems are employed either independently or in combination to verify that a detected audio sample is live and not recorded. Temporal speech characteristics of an audio sample are analyzed to determine whether a sample under examination is so similar to a previous sample as to indicate a recording. Communications channel characteristics are examined to determine whether a sample was recorded on a different channel from a predetermined communications channel. A pattern classifier is trained to distinguish between live and recorded speech. Finally, an “audio watermark” is used to determine whether a detected audio sample is a recording of a previous communication by an authorized user. In addition, the various techniques of the present invention may be employed in serial or parallel combination with a variety of decision-making schemes to provide increased performance.
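The serial and parallel combinations mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which decision-making schemes are used, so the two shown here (deny at the first failed check, and accept on a minimum number of passing checks) are assumptions chosen for illustration. The names `serial_decision`, `parallel_decision`, `min_votes`, and the sample fields are hypothetical.

```python
from typing import Callable, Dict, List

# A check takes a sample (here a dict of hypothetical measurements) and
# returns True when the sample looks live, False when it looks recorded.
Sample = Dict[str, float]
Check = Callable[[Sample], bool]


def serial_decision(sample: Sample, checks: List[Check]) -> bool:
    """Assumed serial scheme: run checks in order, deny at the first failure."""
    for check in checks:
        if not check(sample):
            return False
    return True


def parallel_decision(sample: Sample, checks: List[Check], min_votes: int) -> bool:
    """Assumed parallel scheme: run every check, accept if enough of them pass."""
    votes = sum(1 for check in checks if check(sample))
    return votes >= min_votes


# Hypothetical stand-ins for the temporal, channel, and classifier techniques.
def temporal_ok(sample: Sample) -> bool:
    return sample["temporal"] > 0.5


def channel_ok(sample: Sample) -> bool:
    return sample["channel"] > 0.5


def classifier_ok(sample: Sample) -> bool:
    return sample["classifier"] > 0.5


checks = [temporal_ok, channel_ok, classifier_ok]
sample = {"temporal": 0.9, "channel": 0.2, "classifier": 0.8}
serial_result = serial_decision(sample, checks)
parallel_result = parallel_decision(sample, checks, min_votes=2)
```

In this sketch the serial scheme denies the sample because one check fails, while the parallel scheme accepts it because two of three checks pass. Which behaviour is preferable depends on how costly a false acceptance is relative to a false denial.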
6 Claims
1. A method for verifying that a detected speech sample is live and not a recording comprising:
detecting a first speech sample from a known person;
extracting temporal characteristics from the first speech sample to create a first characteristics set;
detecting a second speech sample;
extracting the temporal characteristics from the second speech sample to create a second characteristics set;
comparing the first and second characteristics sets and denying verification if the first and second characteristics sets do not match within a preset tolerance, wherein the denial of verification indicates that the second speech sample is a recording;
creating a voice print from the first sample;
comparing the second sample to the voice print and determining that the known person generated the second sample if the second sample matches the voice print within a preset tolerance;
detecting a third speech sample;
extracting the temporal characteristics from the third speech sample to create a third characteristics set;
comparing the third characteristics set to each of the first and second characteristics sets; and
denying verification if the third characteristics set do not match either of the first or second characteristics sets within a preset tolerance, wherein the denial of verification indicates that the third speech sample is a recording;
producing a first comparison score during the step of comparing the second sample to the voice print;
producing a second comparison score during the step of comparing the third sample to the voice print;
comparing the first and second comparison scores and denying verification if the first and second comparison scores match within a preset tolerance, wherein the denial of verification indicates that the third voice sample is a recording.
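The final step of claim 1 can be sketched as follows. The claim denies verification when the two voice-print comparison scores match within a preset tolerance; presumably a replayed recording reproduces almost exactly the same score, whereas live speech varies from one utterance to the next. The function names, the use of an absolute difference, and the tolerance value are assumptions for illustration, not taken from the patent.

```python
def scores_match(first_score: float, second_score: float, tolerance: float) -> bool:
    """True when two comparison scores agree within the preset tolerance."""
    return abs(first_score - second_score) <= tolerance


def verify_by_scores(first_score: float, second_score: float, tolerance: float) -> bool:
    """Return False (deny verification) when the scores are suspiciously close.

    first_score:  score from comparing the second sample to the voice print.
    second_score: score from comparing the third sample to the voice print.
    """
    return not scores_match(first_score, second_score, tolerance)


# Hypothetical scores: a live repeat varies, a replay is nearly identical.
live_result = verify_by_scores(0.91, 0.74, tolerance=0.01)
replay_result = verify_by_scores(0.912, 0.912, tolerance=0.01)
```

Note the inverted sense compared with an ordinary match test: here a close match is the reason for denial.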
2. A method for verifying that a detected audio sample is live and not a recording, comprising:
detecting a first audio sample over a prescribed channel during a first communications session;
extracting channel characteristics from the first sample to create a first characteristics set;
detecting a second audio sample during a second communications session;
extracting the channel characteristics from the second sample to create a second characteristics set; and
denying verification if the first and second characteristics sets do not match within a preset tolerance, wherein the denial of verification indicates that the second audio sample is a recording.
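The comparison in claim 2 can be sketched under simple assumptions. The claim does not define how channel characteristics are represented or compared, so this example treats each characteristics set as a fixed-length list of numbers and applies the tolerance element by element. The names `sets_match` and `verify_channel` and all values are hypothetical.

```python
from typing import Sequence


def sets_match(first: Sequence[float], second: Sequence[float], tolerance: float) -> bool:
    """True when every pair of corresponding characteristics agrees within tolerance."""
    if len(first) != len(second):
        raise ValueError("characteristics sets must have the same length")
    return all(abs(a - b) <= tolerance for a, b in zip(first, second))


def verify_channel(first_set: Sequence[float], second_set: Sequence[float],
                   tolerance: float) -> bool:
    """Return False (deny verification) when the channel characteristics differ.

    A mismatch is read as a sign that the second sample was recorded on a
    channel other than the prescribed one.
    """
    return sets_match(first_set, second_set, tolerance)


# Hypothetical channel characteristics from two communications sessions.
first_session = [0.20, 0.50, 0.30]
same_channel = [0.21, 0.49, 0.31]
other_channel = [0.60, 0.10, 0.30]

same_result = verify_channel(first_session, same_channel, tolerance=0.05)
other_result = verify_channel(first_session, other_channel, tolerance=0.05)
```

An element-wise tolerance is only one option; a single distance over the whole set (for example a Euclidean distance) would fit the claim language equally well.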
3. A method for verifying that a detected audio sample is live and not a recording, comprising:
detecting a first audio sample;
extracting channel characteristics from the first sample to create a first characteristics set;
detecting a second audio sample;
extracting the channel characteristics from the second sample to create a second characteristics set;
denying verification if the first and second characteristics sets do not match within a preset tolerance, wherein the denial of verification indicates that the second audio sample is a recording;
detecting a third audio sample;
extracting the channel characteristics from the third audio sample to create a third characteristics set;
comparing the third characteristics set to each of the first and second characteristics sets; and
denying verification if the third characteristics set does not match both of the first and second characteristics sets within a preset tolerance.
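The three-sample check in claim 3 might look like the sketch below. As before, the list representation of a characteristics set, the element-wise tolerance, and the names `within_tolerance` and `verify_third_sample` are assumptions made for illustration.

```python
from typing import Sequence


def within_tolerance(a: Sequence[float], b: Sequence[float], tolerance: float) -> bool:
    """True when two equal-length characteristics sets agree element by element."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def verify_third_sample(first_set: Sequence[float], second_set: Sequence[float],
                        third_set: Sequence[float], tolerance: float) -> bool:
    """Return False (deny) unless the third set matches both earlier sets."""
    return (within_tolerance(third_set, first_set, tolerance)
            and within_tolerance(third_set, second_set, tolerance))


# Hypothetical channel characteristics for three samples.
first_set = [0.20, 0.50]
second_set = [0.22, 0.48]
consistent_third = [0.21, 0.49]
drifting_third = [0.26, 0.50]  # close to second_set, too far from first_set

consistent_result = verify_third_sample(first_set, second_set, consistent_third, 0.05)
drifting_result = verify_third_sample(first_set, second_set, drifting_third, 0.05)
```

Requiring a match against both earlier sets is stricter than matching either one: the second case is denied even though the third set is within tolerance of the second.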
5. A system for verifying that a detected audio sample is live and not a recording comprising:
a detector for detecting a first audio sample from a known source, a second audio sample and a third audio sample;
an extraction module, operatively connected to the detector, adapted to extract a first set of channel characteristics from the first audio sample, a second set of channel characteristics from the second audio sample; and a third set of channel characteristics from the third audio sample;
a computer processor, operatively connected to the extraction module, adapted to compare the first and second sets of channel characteristics and to deny verification if the first and second sets of channel characteristics do not match within a preset tolerance, wherein the denial of verification indicates that the second audio sample is a recording and the computer processor is adapted to compare the third set of channel characteristics to each of the first and second sets of channel characteristics and to deny verification if the third set of characteristics fails to match either of the first or second sets of characteristics within a preset tolerance, wherein the denial of verification indicates that the third audio sample is a recording.
6. A system for verifying that a detected audio sample is live and not a recording, comprising:
a detector for detecting a first audio sample from a known source during a first communications session and for detecting a second audio sample during a second communications session;
an extraction module, operatively connected to the detector, adapted to extract a first set of channel characteristics from the first audio sample and to extract a second set of channel characteristics from the second audio sample; and
a computer processor, operatively connected to the extraction module, adapted to compare the first and second sets of channel characteristics and to deny verification if the first and second sets of channel characteristics do not match within a preset tolerance, wherein the denial of verification indicates that the second audio sample is a recording.
Specification