Text independent speaker recognition for transparent command ambiguity resolution and continuous access control
Abstract
Feature vectors representing each of a plurality of overlapping frames of an arbitrary, text independent speech signal are computed and compared to vector parameters and variances stored as codewords in one or more codebooks corresponding to each of one or more enrolled users to provide speaker dependent information for speech recognition and/or ambiguity resolution. Other information, such as aliases and preferences of each enrolled user, may also be enrolled and stored, for example, in a database. Correspondence of the feature vectors may be ranked by closeness of correspondence to a codeword entry, and the number of frames corresponding to each codebook is accumulated or counted to identify a potential enrolled speaker. The differences between the parameters of the feature vectors and the codewords in the codebooks can be used to identify a new speaker, and an enrollment procedure can then be initiated. Continuous authorization and access control can be carried out based on any utterance, either by verification of the authorization of a speaker of a recognized command or by comparison with authorized commands for the recognized speaker. Text independence also permits coherence checks to be carried out for commands to validate the recognition process.
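The frame-counting scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two-dimensional toy feature (log energy and zero-crossing rate), the variance-weighted distance, and the names `frames`, `feature_vector`, `distance`, `identify`, `threshold` and `min_share` are all assumptions made for illustration. A real system would use spectral features and trained codebooks.

```python
import math

def frames(signal, size, hop):
    """Split a signal into frames; hop < size makes consecutive frames overlap."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def feature_vector(frame):
    """Toy 2-D feature (assumption): log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log(energy + 1e-12), crossings / (len(frame) - 1))

def distance(vec, codeword):
    """Variance-weighted squared distance to a codeword given as (means, variances)."""
    means, variances = codeword
    return sum((v - m) ** 2 / s for v, m, s in zip(vec, means, variances))

def identify(vectors, codebooks, threshold, min_share):
    """Count, per enrolled speaker, the frames whose closest codeword is theirs.

    Returns (speaker, votes). speaker is None when no codebook collects at
    least min_share of the frames, which this sketch treats as a new speaker.
    """
    votes = {name: 0 for name in codebooks}
    for vec in vectors:
        # Closest codeword of each speaker, then the closest speaker overall.
        best_name, best_d = None, float("inf")
        for name, codewords in codebooks.items():
            d = min(distance(vec, cw) for cw in codewords)
            if d < best_d:
                best_name, best_d = name, d
        # A frame only counts if it is close enough to some codeword.
        if best_name is not None and best_d <= threshold:
            votes[best_name] += 1
    if not vectors or not votes:
        return None, votes
    winner = max(votes, key=votes.get)
    if votes[winner] / len(vectors) < min_share:
        return None, votes  # too few matching frames: likely a new speaker
    return winner, votes

# Hypothetical codebooks: each codeword is (means, variances).
CODEBOOKS = {
    "alice": [((0.0, 0.0), (1.0, 1.0)), ((5.0, 5.0), (1.0, 1.0))],
    "bob": [((10.0, 10.0), (1.0, 1.0))],
}
```

In use, the feature vectors would come from `[feature_vector(f) for f in frames(signal, size, hop)]`. Because the decision rests on counting frames rather than matching a fixed phrase, any utterance can be scored, which is what makes the approach text independent.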
88 Citations
21 Claims
1. A method of text independent speaker recognition comprising the steps of
sampling overlapping frames of a speech signal,
computing a feature vector for each said frame of said speech signal,
comparing each said feature vector with vector parameters and variances stored in a codebook corresponding to an enrolled speaker,
accumulating the number of frames for which the corresponding feature vector corresponds to vector parameters and variances in a codebook, and
identifying an enrolled speaker or detecting a new speaker in response to results of said accumulating step or said comparing step, respectively.
15. Apparatus including a speech recognition system and a text independent speaker recognition system, said text independent speaker recognition system comprising
means for sampling overlapping frames of a speech signal,
means for computing a feature vector for each said frame of said speech signal,
means for comparing each said feature vector with vector parameters and variances stored in a codebook corresponding to an enrolled speaker, and
means for accumulating the number of frames for which the corresponding feature vector corresponds to vector parameters and variances in a codebook corresponding to an enrolled speaker.
Specification