Combining results from first and second speaker recognition processes
First Claim
1. A method of processing a received signal representing a user's speech, the method comprising:
performing a first speaker recognition process on a first portion of the received signal, to obtain a first output result;
performing a second speaker recognition process on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result, wherein the second speaker recognition process is different from the first speaker recognition process;
applying respective weighting values to the first and second output results to form first and second weighted results respectively;
combining the first and second weighted results to obtain a combined output result indicating a likelihood that the user is a registered user; and
performing an antispoofing process on at least one of the first and second portions of the received signal to obtain an antispoofing score;
wherein the weighting value applied to the second output result is determined by:
excluding fragments of the second portion of the received signal that do not contain speech, and determining a total length of fragments of the second portion of the received signal that do contain speech; and
setting the weighting value applied to the second output result based on the total length of fragments of the second portion of the received signal that do contain speech; and
wherein at least one of the respective weighting values applied to the first and second output results is based on the respective antispoofing score obtained from the respective portion of the received signal.
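The weighting steps recited in the claim can be illustrated with a short sketch. It is only one possible reading, not the patented implementation: the `Fragment` structure, the linear ramp in `length_weight`, the 4-second `full_trust_s` constant, the convention that the antispoofing score lies in [0, 1], and the normalised weighted sum are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Fragment:
    """A contiguous piece of the second portion (hypothetical structure)."""
    duration_s: float
    has_speech: bool  # assumed to come from some voice activity detector


def net_speech_seconds(fragments: Sequence[Fragment]) -> float:
    """Total length of the fragments that contain speech; the others are excluded."""
    return sum(f.duration_s for f in fragments if f.has_speech)


def length_weight(speech_s: float, full_trust_s: float = 4.0) -> float:
    """Map net speech length to a weight in [0, 1] (assumed linear ramp)."""
    return max(0.0, min(1.0, speech_s / full_trust_s))


def combine(score1: float, score2: float,
            fragments: Sequence[Fragment],
            antispoof2: float, w1: float = 1.0) -> float:
    """Combine two recognition scores using respective weights.

    antispoof2 is assumed to lie in [0, 1], with 1 meaning "very likely live".
    The second weight depends on both net speech length and antispoof2.
    """
    w2 = length_weight(net_speech_seconds(fragments)) * antispoof2
    total = w1 + w2
    # Normalised sum of the two weighted results (assumed fusion rule).
    return (w1 * score1 + w2 * score2) / total if total > 0 else 0.0
```

In this sketch, a second portion containing little actual speech, or one with a low antispoofing score, contributes less to the combined result, so the first process dominates the decision.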
Abstract
A received signal represents a user's speech. A first speaker recognition process is performed on a first portion of the received signal, to obtain a first output result. A second speaker recognition process is performed on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result. The second speaker recognition process is different from the first speaker recognition process. The first and second output results are combined to obtain a combined output result indicating a likelihood that the user is a registered user.
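The overall flow described in the abstract might be sketched as follows. This is a hedged illustration only: the `Recognizer` callable type, the split-at-a-sample-index rule in `split_signal`, and the "average then logistic" mapping to a likelihood are hypothetical choices, and the two recognition processes are left as caller-supplied placeholders.

```python
import math
from typing import Callable, Sequence, Tuple

Signal = Sequence[float]
# A recognizer returns a score; higher means "more likely the registered user".
Recognizer = Callable[[Signal], float]


def split_signal(signal: Signal, boundary: int) -> Tuple[Signal, Signal]:
    """Split the signal into two different portions (assumed split rule)."""
    return signal[:boundary], signal[boundary:]


def combined_likelihood(signal: Signal, boundary: int,
                        first_process: Recognizer,
                        second_process: Recognizer) -> float:
    """Run a different process on each portion and combine the two results."""
    first_portion, second_portion = split_signal(signal, boundary)
    first_result = first_process(first_portion)
    second_result = second_process(second_portion)
    # Assumed fusion: average the two scores, then squash into (0, 1).
    fused = 0.5 * (first_result + second_result)
    return 1.0 / (1.0 + math.exp(-fused))
```

Passing the two processes in as callables keeps the sketch independent of any particular recognition technique; the only requirement taken from the text is that the two processes differ and operate on different portions.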
18 Claims
1. A method of processing a received signal representing a user's speech, the method comprising:
performing a first speaker recognition process on a first portion of the received signal, to obtain a first output result;
performing a second speaker recognition process on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result, wherein the second speaker recognition process is different from the first speaker recognition process;
applying respective weighting values to the first and second output results to form first and second weighted results respectively;
combining the first and second weighted results to obtain a combined output result indicating a likelihood that the user is a registered user; and
performing an antispoofing process on at least one of the first and second portions of the received signal to obtain an antispoofing score;
wherein the weighting value applied to the second output result is determined by:
excluding fragments of the second portion of the received signal that do not contain speech, and determining a total length of fragments of the second portion of the received signal that do contain speech; and
setting the weighting value applied to the second output result based on the total length of fragments of the second portion of the received signal that do contain speech; and
wherein at least one of the respective weighting values applied to the first and second output results is based on the respective antispoofing score obtained from the respective portion of the received signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A device for processing a received signal representing a user's speech, for performing speaker recognition, wherein the device is configured to:
perform a first speaker recognition process on a first portion of the received signal, to obtain a first output result;
perform a second speaker recognition process on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result, wherein the second speaker recognition process is different from the first speaker recognition process;
apply respective weighting values to the first and second output results to form first and second weighted results respectively;
combine the first and second weighted results to obtain a combined output result indicating a likelihood that the user is a registered user; and
perform an antispoofing process on at least one of the first and second portions of the received signal to obtain an antispoofing score;
wherein the weighting value applied to the second output result is determined by:
excluding fragments of the second portion of the received signal that do not contain speech, and determining a total length of fragments of the second portion of the received signal that do contain speech; and
setting the weighting value applied to the second output result based on the total length of fragments of the second portion of the received signal that do contain speech; and
wherein at least one of the respective weighting values applied to the first and second output results is based on the respective antispoofing score obtained from the respective portion of the received signal.
Dependent claims: 15
16. An integrated circuit device for processing a received signal representing a user's speech, for performing speaker recognition, wherein the integrated circuit device is configured to:
perform a first speaker recognition process on a first portion of the received signal, to obtain a first output result;
perform a second speaker recognition process on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result, wherein the second speaker recognition process is different from the first speaker recognition process;
apply respective weighting values to the first and second output results to form first and second weighted results respectively;
combine the first and second weighted results to obtain a combined output result indicating a likelihood that the user is a registered user; and
perform an antispoofing process on at least one of the first and second portions of the received signal to obtain an antispoofing score;
wherein the weighting value applied to the second output result is determined by:
excluding fragments of the second portion of the received signal that do not contain speech, and determining a total length of fragments of the second portion of the received signal that do contain speech; and
setting the weighting value applied to the second output result based on the total length of fragments of the second portion of the received signal that do contain speech; and
wherein at least one of the respective weighting values applied to the first and second output results is based on the respective antispoofing score obtained from the respective portion of the received signal.
Dependent claims: 17
18. A non-transitory computer readable storage medium having computer-executable instructions stored thereon that, when executed by processor circuitry, cause the processor circuitry to perform a method comprising:
performing a first speaker recognition process on a first portion of the received signal, to obtain a first output result;
performing a second speaker recognition process on a second portion of the received signal that is different from the first portion of the received signal, to obtain a second output result, wherein the second speaker recognition process is different from the first speaker recognition process;
applying respective weighting values to the first and second output results to form first and second weighted results respectively;
combining the first and second weighted results to obtain a combined output result indicating a likelihood that the user is a registered user; and
performing an antispoofing process on at least one of the first and second portions of the received signal to obtain an antispoofing score;
wherein the weighting value applied to the second output result is determined by:
excluding fragments of the second portion of the received signal that do not contain speech, and determining a total length of fragments of the second portion of the received signal that do contain speech; and
setting the weighting value applied to the second output result based on the total length of fragments of the second portion of the received signal that do contain speech; and
wherein at least one of the respective weighting values applied to the first and second output results is based on the respective antispoofing score obtained from the respective portion of the received signal.
Specification