Anti-spoofing
Abstract
System for classifying whether audio data received in a speaker recognition system is genuine or a spoof using a Gaussian classifier and method for classifying whether audio data received in a speaker recognition system is genuine or a spoof using a Gaussian classifier.
30 Claims
1. A speaker recognition system adapted for receiving audio data, the system being adapted for:
receiving audio data under test;
obtaining a Medium Frequency Relative Energy (MF) parameter, comprising a ratio between an energy of the received audio data under test in a predetermined frequency band and an energy of a complete frequency spectrum of the received audio data under test; and
classifying using a Gaussian classifier whether the received audio data under test is genuine or represents a recording replayed through a loudspeaker, based on the Medium Frequency Relative Energy (MF) parameter, wherein the Gaussian classifier is trained by the following steps;
a first Gaussian is obtained by;
receiving genuine audio data;
obtaining a first Medium Frequency Relative Energy (MF) parameter, comprising the ratio between the energy of the genuine audio data in a predetermined frequency band and the energy of the complete frequency spectrum of the genuine audio data;
receiving audio data representing recordings replayed through a loudspeaker; and
modelling the genuine audio data;
and wherein a second Gaussian is obtained by;
receiving audio data representing recordings replayed through a loudspeaker;
obtaining a second Medium Frequency Relative Energy (MF) parameter, comprising the ratio between the energy of the audio data representing recordings replayed through a loudspeaker in a predetermined frequency band and the energy of the complete frequency spectrum of the audio data representing recordings replayed through a loudspeaker; and
modelling the audio data representing recordings replayed through a loudspeaker with a second Gaussian.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
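The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the 1000 to 3000 Hz band, the function names, and the synthetic signals are all assumptions made for illustration, since the claim only speaks of "a predetermined frequency band". The sketch computes the MF ratio from an FFT power spectrum, fits one univariate Gaussian per class, and compares log-likelihoods.

```python
import numpy as np

def mf_parameter(signal, sample_rate, band=(1000.0, 3000.0)):
    """Energy inside `band` divided by the energy of the whole spectrum.
    The band limits are hypothetical; the claim does not specify them."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0

def fit_gaussian(values):
    """Model a set of MF values with a single Gaussian (mean, variance)."""
    values = np.asarray(values, dtype=float)
    return values.mean(), max(values.var(), 1e-12)  # floor avoids zero variance

def log_gaussian(x, mean, var):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def classify(mf, genuine_model, replay_model):
    """Pick the class whose Gaussian gives the MF value the higher likelihood."""
    llr = log_gaussian(mf, *genuine_model) - log_gaussian(mf, *replay_model)
    return "genuine" if llr > 0 else "replay"

# Synthetic stand-ins for training data (arbitrary, not real recordings):
# a 200 Hz tone plus a 2000 Hz tone whose gain differs between the two classes.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr

def synthetic(mid_gain):
    return (np.sin(2 * np.pi * 200 * t)
            + mid_gain * np.sin(2 * np.pi * 2000 * t)
            + 0.01 * rng.standard_normal(sr))

genuine_model = fit_gaussian([mf_parameter(synthetic(g), sr) for g in (0.9, 1.0, 1.1)])
replay_model = fit_gaussian([mf_parameter(synthetic(g), sr) for g in (0.2, 0.3, 0.4)])
print(classify(mf_parameter(synthetic(1.0), sr), genuine_model, replay_model))
```

A zero decision threshold is used here for simplicity; a deployed system would more plausibly tune the threshold on held-out data.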

14. A method in a speaker recognition system for classifying whether audio data is genuine or represents a recording replayed through a loudspeaker, the method comprising:
receiving the audio data, and
classifying using a Gaussian classifier whether the received audio data is genuine or represents a recording replayed through a loudspeaker, wherein Gaussians are used to model a region of audio data parameters from genuine audio data and wherein Gaussians are used to model a region of audio data parameters from audio data representing recordings replayed through loudspeakers, based on a Medium Frequency Relative Energy (MF) parameter, and wherein;
the Medium Frequency Relative Energy (MF) parameter comprises a ratio between an energy of the audio data in a predetermined frequency band and an energy of a complete frequency spectrum of the audio data; and
the Gaussian classifier is trained by the following steps;
a first Gaussian is obtained by;
receiving genuine audio data;
obtaining a first Medium Frequency Relative Energy (MF) parameter, comprising the ratio between the energy of the genuine audio data in a predetermined frequency band and the energy of the complete frequency spectrum of the genuine audio data;
receiving audio data representing recordings replayed through a loudspeaker; and
modelling the genuine audio data;
and wherein a second Gaussian is obtained by;
receiving audio data representing recordings replayed through a loudspeaker;
obtaining a second Medium Frequency Relative Energy (MF) parameter, comprising the ratio between the energy of the audio data representing recordings replayed through a loudspeaker in a predetermined frequency band and the energy of the complete frequency spectrum of the audio data representing recordings replayed through a loudspeaker; and
modelling the audio data representing recordings replayed through a loudspeaker with a second Gaussian.
Dependent claims: 15
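The classification step of the method can be written compactly as a log-likelihood ratio. The following is a sketch for the simplest case of one univariate Gaussian per class; the symbols \(E_{\text{band}}\), \(E_{\text{total}}\), \(\mu_g, \sigma_g^2\) (genuine), \(\mu_r, \sigma_r^2\) (replayed), and the threshold \(\theta\) are notation introduced here and do not appear in the claims.

```latex
\begin{align}
\mathrm{MF} &= \frac{E_{\text{band}}}{E_{\text{total}}}, \\
p(\mathrm{MF} \mid c) &= \mathcal{N}\!\left(\mathrm{MF};\, \mu_c, \sigma_c^2\right),
  \qquad c \in \{g, r\}, \\
\Lambda(\mathrm{MF}) &= \log p(\mathrm{MF} \mid g) - \log p(\mathrm{MF} \mid r) \\
  &= \frac{1}{2}\log\frac{\sigma_r^2}{\sigma_g^2}
     - \frac{(\mathrm{MF}-\mu_g)^2}{2\sigma_g^2}
     + \frac{(\mathrm{MF}-\mu_r)^2}{2\sigma_r^2}.
\end{align}
```

Under this reading, the audio is labelled genuine when \(\Lambda(\mathrm{MF}) > \theta\) and replayed otherwise, with \(\theta = 0\) corresponding to a plain maximum-likelihood decision.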

16. A speaker recognition system adapted for receiving audio data, the system being adapted for:
receiving audio data under test;
obtaining a Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, comprising a ratio between an energy of the received audio data under test in a predetermined frequency band and an energy of a complete frequency spectrum of the received audio data under test; and
classifying using a Gaussian classifier whether the received audio data under test is genuine or represents a recording replayed through a loudspeaker, based on the Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, wherein the Gaussian classifier is trained by the following steps;
a first Gaussian is obtained by;
receiving genuine audio data;
obtaining a first Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, comprising the ratio between the energy of the genuine audio data in a predetermined frequency band and the energy of the complete frequency spectrum of the genuine audio data;
receiving audio data representing recordings replayed through a loudspeaker; and
modelling the genuine audio data;
and wherein a second Gaussian is obtained by;
receiving audio data representing recordings replayed through a loudspeaker;
obtaining a Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, comprising the ratio between the energy of the audio data representing recordings replayed through a loudspeaker in a predetermined frequency band and the energy of the complete frequency spectrum of the audio data representing recordings replayed through a loudspeaker; and
modelling the audio data representing recordings replayed through a loudspeaker with a second Gaussian.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 30

28. A method in a speaker recognition system for classifying whether audio data is genuine or represents a recording replayed through a loudspeaker, the method comprising:
receiving the audio data, and
classifying using a Gaussian classifier whether the received audio data is genuine or represents a recording replayed through a loudspeaker, wherein Gaussians are used to model a region of audio data parameters from genuine audio data and wherein Gaussians are used to model a region of audio data parameters from audio data representing recordings replayed through loudspeakers, based on the Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, and wherein;
the Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter comprises 1, 2, 3 or more or all LF-MFCC extracted from a region of the audio data having frequencies lower than a predetermined cut-off frequency; and
wherein the Gaussian classifier is trained by the following steps;
a first Gaussian is obtained by;
receiving genuine audio data;
obtaining a first Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, comprising the ratio between the energy of the genuine audio data in a predetermined frequency band and the energy of the complete frequency spectrum of the genuine audio data;
receiving audio data representing recordings replayed through a loudspeaker; and
modelling the genuine audio data;
and wherein a second Gaussian is obtained by;
receiving audio data representing recordings replayed through a loudspeaker;
obtaining a Low Frequency Mel Frequency Cepstral Coefficients (LF-MFCC) parameter, comprising the ratio between the energy of the audio data representing recordings replayed through a loudspeaker in a predetermined frequency band and the energy of the complete frequency spectrum of the audio data representing recordings replayed through a loudspeaker; and
modelling the audio data representing recordings replayed through a loudspeaker with a second Gaussian.
Dependent claims: 29
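Claim 28 describes the LF-MFCC parameter as cepstral coefficients extracted from the region of the audio below a predetermined cut-off frequency. One plausible reading, sketched below, is to restrict a mel filterbank to the range from 0 Hz to the cut-off and apply a DCT to the log filter energies. The 300 Hz cut-off, the number of filters, the number of coefficients, and all function names are assumptions for illustration; the claims give no such values.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def lf_mfcc(frame, sample_rate, cutoff_hz=300.0, n_filters=8, n_coeffs=3):
    """MFCC-style coefficients computed only from frequencies below cutoff_hz.
    All default values are hypothetical."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Filter edges equally spaced on the mel scale between 0 Hz and the cut-off.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(cutoff_hz), n_filters + 2))

    energies = np.empty(n_filters)
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        weights = np.clip(np.minimum(rising, falling), 0.0, None)  # triangular filter
        energies[i] = np.dot(weights, power)

    log_energies = np.log(energies + 1e-12)  # small offset avoids log(0)

    # Unnormalised DCT-II of the log filter energies.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return basis @ log_energies

# Example on a synthetic noise frame (not real speech).
rng = np.random.default_rng(0)
frame = rng.standard_normal(4096)
print(lf_mfcc(frame, 16000))
```

The frame in the example is deliberately long (4096 samples at 16 kHz, roughly 4 Hz per FFT bin) because the filters below 300 Hz are only a few tens of hertz wide; much shorter frames would leave some filters with no FFT bins at all. The resulting coefficient vector could then be modelled with one Gaussian per class in the same way as the MF parameter.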
Specification