Dynamic security code speech-based identity authentication system and method having self-learning function
Abstract
A dynamic security code speech-based identity authentication system and method having a self-learning function, equipped with: a time-varying data storage unit for storing speech data of each user with time labels; a time-varying data updating module (23) for storing the latest speech data into the time-varying data storage unit; a time window channel construction module (24) for extracting speech data from the time-varying data storage unit in the order of the time labels, and for constructing and updating a time window channel comprising multiple sets of speech data; and a voiceprint model reconstruction module (25) for reconstructing the user's voiceprint model using the multiple sets of speech data comprised in the updated time window channel.
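The self-learning loop summarized in the abstract (store accepted speech with a time label, maintain a time window channel, rebuild the voiceprint model from the windowed data) can be sketched as below. All names are hypothetical, the retraining is a placeholder, and the window here is a plain FIFO for brevity; the claimed system conditions the window update on phoneme balance, as claim 1 details.

```python
from collections import deque

class VoiceprintSelfLearner:
    """Minimal sketch, not the patented implementation: persist accepted
    speech with a time label, keep a bounded time window channel, and
    rebuild the user's voiceprint model from the windowed data."""

    def __init__(self, window_size=10):
        self.store = []                          # time-varying data storage unit
        self.window = deque(maxlen=window_size)  # time window channel (FIFO here)

    def on_authenticated(self, user_id, speech_data, timestamp):
        # time-varying data updating: store the latest speech with its time label
        self.store.append((user_id, timestamp, speech_data))
        # time window channel update: oldest set drops out automatically
        self.window.append(speech_data)
        # voiceprint model reconstruction from the windowed data
        return self.rebuild_model(list(self.window))

    def rebuild_model(self, window_data):
        # Placeholder: a real system would retrain a speaker model
        # (e.g. GMM-UBM or i-vector/PLDA) here.
        return {"trained_on": len(window_data)}
```

Bounding the window keeps retraining cost constant while letting the model track slow changes in the user's voice.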
14 Claims
1. A dynamic security code speech-based identity authentication system having self-learning function, comprising:
a request receiving module for receiving an identity authentication request that a requester sends to a server through a client;
a dynamic security code generating module for generating a dynamic security code and sending the dynamic security code to the client; and
an identity authentication module for calculating a comprehensive confidence of an identity of the requester by using an acoustic model of global characters and a voiceprint model of a user based on a security code speech signal sent from the client, wherein the security code speech signal is generated when the requester reads out the dynamic security code;
judging the identity of the requester based on the calculated comprehensive confidence of the identity; and
feeding an identity authentication result back to the client,
wherein the dynamic security code speech-based identity authentication system is provided with an automatic reconstruction subsystem for the voiceprint model, and the voiceprint model of the user is reconstructed by the automatic reconstruction subsystem for the voiceprint model when the identity authentication result is that the requester is the user of the server, and the automatic reconstruction subsystem for the voiceprint model comprises:
a time-varying data storage unit for storing speech data of each user with time labels;
a time-varying data updating module for storing the security code speech signal as a latest speech data into the time-varying data storage unit;
a time window channel construction module for extracting the speech data of the user from the time-varying data storage unit in an order of the time labels, constructing a time window channel including a plurality of sets of speech data, and updating the speech data included in the time window channel using the latest speech data; and
a voiceprint model reconstruction module for reconstructing the voiceprint model of the user using the plurality of sets of speech data included in the updated time window channel,
wherein the automatic reconstruction subsystem for the voiceprint model further comprises a parameterization module for speech data, and the parameterization module for speech data is used for parameterizing the security code speech signal, i.e., the speech data, to obtain a latest parameterized speech data;
parameterized speech data of each user is stored with time labels in the time-varying data storage unit;
the latest parameterized speech data is stored in the time-varying data storage unit by the time-varying data updating module;
the time window channel construction module extracts parameterized speech data of the user from the time-varying data storage unit in the order of the time labels, constructs a time window channel including a plurality of sets of parameterized speech data, and updates the parameterized speech data included in the time window channel using the latest parameterized speech data; and
the voiceprint model reconstruction module reconstructs the voiceprint model of the user using the plurality of sets of parameterized speech data included in the updated time window channel,
wherein the automatic reconstruction subsystem for the voiceprint model further comprises a speech recognition module for recognizing phonemes corresponding to respective frames in the speech data;
phonemes corresponding to the latest parameterized speech data and frame intervals corresponding to the phonemes are further stored in the time-varying data storage unit; and
the time window channel construction module updates the parameterized speech data included in the time window channel based on the phonemes corresponding to the latest parameterized speech data, so that phonemes corresponding to the plurality of sets of parameterized speech data included in the time window channel are evenly distributed, and
the time window channel construction module tentatively removes a set of parameterized speech data from the time window channel sequentially in the order of the time labels from old to new, and calculates an equilibrium degree of a character-based phoneme distribution based on all of the parameterized speech data remaining in the time window channel and the latest parameterized speech data, and if the equilibrium degree is greater than or equal to a predetermined threshold of the equilibrium degree, the latest parameterized speech data is pushed into the time window channel;
otherwise, the set of parameterized speech data tentatively removed is restored to the time window channel, then a next set of parameterized speech data is tentatively removed from the time window channel, and once again the time window channel construction module calculates the equilibrium degree based on all of the parameterized speech data remaining in the time window channel and the latest parameterized speech data, till each set of parameterized speech data included in the time window channel has been tentatively removed or the latest parameterized speech data has been pushed into the time window channel.
Dependent claims: 2, 3, 4, 5, 6, 7.
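The window-update rule in the last two clauses of claim 1 can be sketched as follows. The claim does not define how the equilibrium degree is computed; as an assumption, it is modeled here as the normalized entropy of the phoneme counts (1.0 means a perfectly even distribution), and each set of parameterized speech data is represented simply as a list of phoneme labels.

```python
import math
from collections import Counter

def equilibrium_degree(window, latest):
    """Equilibrium degree of the phoneme distribution over the window
    plus the candidate latest set, modeled as normalized entropy.
    This measure is an assumption; the claim leaves it unspecified."""
    counts = Counter()
    for utterance in window + [latest]:
        counts.update(utterance)       # utterance = list of phoneme labels
    total = sum(counts.values())
    if len(counts) < 2:
        return 1.0 if total else 0.0   # degenerate: one phoneme is trivially even
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))

def update_window(window, latest, threshold):
    """Tentatively remove one set at a time, oldest-first (window is
    assumed sorted by time label, old to new). If the remaining data
    plus the latest set is balanced enough, commit the removal and
    push the latest set; otherwise restore it and try the next one."""
    for i in range(len(window)):
        trial = window[:i] + window[i + 1:]          # tentative removal of set i
        if equilibrium_degree(trial, latest) >= threshold:
            return trial + [latest]                  # commit removal, push latest
    return window                                    # every set tried; latest not pushed
```

The effect is that new speech only displaces old speech when doing so keeps (or makes) the phoneme coverage of the training window even, which protects the reconstructed voiceprint model from drifting toward whatever phonemes recent security codes happen to contain.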
8. A dynamic security code speech-based identity authentication method having self-learning function, comprising following steps:
a request receiving step for receiving an identity authentication request that a requester sends to a server through a client;
a dynamic security code generating step for generating a dynamic security code and sending the dynamic security code to the client; and
an identity authentication step for calculating a comprehensive confidence of an identity of the requester by using an acoustic model of global characters and a voiceprint model of a user based on a security code speech signal sent from the client, wherein the security code speech signal is generated when the requester reads out the dynamic security code;
judging the identity of the requester based on the calculated comprehensive confidence of the identity; and
feeding an identity authentication result back to the client,
wherein, when the identity authentication result is that the requester is the user of the server, the following steps are further performed:
a time-varying data storing step for storing the security code speech signal as a latest speech data into a time-varying data storage unit in which speech data of each user is stored with time labels;
a time window channel construction step for extracting speech data of the user from the time-varying data storage unit in an order of the time labels, constructing a time window channel including a plurality of sets of speech data, and updating the speech data included in the time window channel using the latest speech data; and
a model reconstruction step for reconstructing the voiceprint model of the user using the plurality of sets of speech data included in the updated time window channel;
a parameterization step of parameterizing the security code speech signal, i.e., the speech data, to obtain a latest parameterized speech data;
in the time-varying data storing step, storing the latest parameterized speech data into the time-varying data storage unit in which parameterized speech data of each user is stored with time labels;
in the time window channel construction step, extracting parameterized speech data of the user from the time-varying data storage unit in the order of the time labels, constructing a time window channel including a plurality of sets of parameterized speech data, and updating the parameterized speech data included in the time window channel using the latest parameterized speech data; and
in the model reconstruction step, reconstructing the voiceprint model of the user using the plurality of sets of parameterized speech data included in the updated time window channel;
a speech recognition step for recognizing phonemes corresponding to respective frames in the speech data;
in the time-varying data storing step, further storing phonemes corresponding to the latest parameterized speech data and frame intervals corresponding to the phonemes into the time-varying data storage unit; and
in the time window channel construction step, updating the parameterized speech data included in the time window channel based on the phonemes corresponding to the latest parameterized speech data, so that phonemes corresponding to the plurality of sets of parameterized speech data included in the time window channel are evenly distributed;
wherein, in the time window channel construction step, tentatively removing a set of parameterized speech data from the time window channel sequentially in the order of the time labels from old to new, and calculating an equilibrium degree of a character-based phoneme distribution based on all of the parameterized speech data remaining in the time window channel and the latest parameterized speech data, and if the equilibrium degree is greater than or equal to a predetermined threshold of the equilibrium degree, pushing the latest parameterized speech data into the time window channel;
otherwise, restoring the set of parameterized speech data tentatively removed to the time window channel, then tentatively removing a next set of parameterized speech data from the time window channel, and once again calculating the equilibrium degree based on all of the parameterized speech data remaining in the time window channel and the latest parameterized speech data, till each set of parameterized speech data included in the time window channel has been tentatively removed or the latest parameterized speech data has been pushed into the time window channel.
Dependent claims: 9, 10, 11, 12, 13, 14.
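The identity authentication step in both independent claims fuses two scores: a confidence from the acoustic model of global characters that the utterance actually contains the dynamic security code, and a confidence from the user's voiceprint model that the speaker is the claimed user. The claims do not specify the fusion or the decision rule, so the sketch below assumes a simple weighted sum against a hypothetical threshold.

```python
def comprehensive_confidence(content_score, speaker_score, alpha=0.5):
    """One plausible fusion of the two confidences named in the claims:
    content_score  - from the acoustic model of global characters (does
                     the speech match the dynamic security code?);
    speaker_score  - from the user's voiceprint model (is this the user?).
    The weighted sum and the weight alpha are assumptions."""
    return alpha * content_score + (1.0 - alpha) * speaker_score

def judge_identity(content_score, speaker_score, threshold=0.7):
    """Accept the requester as the user only when the fused confidence
    reaches the (hypothetical) decision threshold."""
    return comprehensive_confidence(content_score, speaker_score) >= threshold
```

Requiring both scores to be high ties the authentication to a fresh, unpredictable utterance, so a replayed recording of the user fails on content even though it matches the voiceprint.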
Specification