Speaker recognition in the call center
First Claim
1. A computer-implemented method comprising:
extracting, by a computer, a first set of audio features from a first audio signal containing a first utterance according to a speaker diarization algorithm, the first audio signal received from an enrollee electronic device;
extracting, by the computer, a second set of audio features from a second audio signal containing a second utterance according to the speaker diarization algorithm, the second audio signal received from a caller electronic device;
performing, by the computer, a phonetic and acoustic comparison of the first utterance and the second utterance based upon the first set of audio features and the second set of audio features; and
determining, by the computer based upon the phonetic and acoustic comparison, at least a partial keyword sequence match and a speaker match between the first utterance and the second utterance.
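The four steps of the claim can be pictured with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Utterance` container, the use of a fixed-length embedding as the "audio features", cosine similarity as the acoustic comparison, a longest-common-subsequence ratio as the phonetic comparison, and both threshold values.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence


@dataclass
class Utterance:
    # Hypothetical container: a speaker embedding plus a decoded keyword
    # sequence. The claim does not specify either representation.
    embedding: List[float]
    keywords: List[str]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Acoustic comparison (assumed): cosine of the angle between embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def compare(enrolled: Utterance, caller: Utterance,
            speaker_threshold: float = 0.8,    # assumed value
            keyword_threshold: float = 0.5     # assumed value
            ) -> Dict[str, object]:
    """Return acoustic and phonetic scores and the two match decisions."""
    acoustic = cosine_similarity(enrolled.embedding, caller.embedding)
    # Phonetic comparison (assumed): fraction of the enrolled keyword
    # sequence that appears, in order, in the caller's utterance.
    if enrolled.keywords:
        phonetic = lcs_length(enrolled.keywords, caller.keywords) / len(enrolled.keywords)
    else:
        phonetic = 0.0
    return {
        "acoustic_score": acoustic,
        "phonetic_score": phonetic,
        "speaker_match": acoustic >= speaker_threshold,
        "partial_keyword_match": phonetic >= keyword_threshold,
    }


enrolled = Utterance([1.0, 0.0, 0.5], ["my", "voice", "is", "my", "password"])
caller = Utterance([0.9, 0.1, 0.5], ["my", "voice", "password"])
result = compare(enrolled, caller)
```

In this sketch the caller repeats three of the five enrolled keywords in order, so the keyword sequence matches only partially while the embeddings remain close enough to count as the same speaker.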
2 Assignments
0 Petitions
Abstract
Utterances of at least two speakers in a speech signal may be distinguished, and the associated speakers identified, by use of diarization together with automatic speech recognition of identifying words and phrases commonly found in the speech signal. The diarization process clusters turns of the conversation, while recognized special-form phrases and entity names identify the speakers. A trained probabilistic model deduces which entity name(s) correspond to the clusters.
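As a rough illustration of how recognized phrases could attach names to diarization clusters, the sketch below scores candidate names per cluster. It is not the patent's method: the two regular expressions, the hand-set weights, and the function `name_clusters` are hypothetical, and the simple score table stands in for the trained probabilistic model mentioned above.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical "special form" phrases. A self-introduction suggests the
# name belongs to the current speaker; a greeting or thanks suggests the
# name belongs to someone else in the conversation.
SELF_INTRO = re.compile(r"\b(?i:this is|my name is)\s+([A-Z][a-z]+)")
ADDRESS = re.compile(r"\b(?i:thank you|thanks|hello|hi),?\s+([A-Z][a-z]+)")


def name_clusters(turns: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map each diarization cluster to its best-scoring entity name.

    turns: (cluster_id, transcript) pairs, i.e. the output of diarization
    followed by speech recognition. Weights are assumptions, not learned.
    """
    scores: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    clusters = {cluster for cluster, _ in turns}
    for cluster, text in turns:
        for name in SELF_INTRO.findall(text):
            scores[cluster][name] += 1.0          # evidence for this speaker
        for name in ADDRESS.findall(text):
            for other in clusters - {cluster}:
                scores[other][name] += 0.5        # evidence for the other party
    return {c: max(scores[c], key=scores[c].get) for c in clusters if scores[c]}


turns = [
    ("A", "Hello, this is Alice from billing."),
    ("B", "Hi Alice, my name is Bob."),
    ("A", "Thanks, Bob. How can I help?"),
]
labels = name_clusters(turns)
```

With these three turns, cluster A accumulates evidence for "Alice" and cluster B for "Bob", from both the self-introductions and the way each speaker addresses the other.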
90 Citations
20 Claims
1. A computer-implemented method comprising:
extracting, by a computer, a first set of audio features from a first audio signal containing a first utterance according to a speaker diarization algorithm, the first audio signal received from an enrollee electronic device;
extracting, by the computer, a second set of audio features from a second audio signal containing a second utterance according to the speaker diarization algorithm, the second audio signal received from a caller electronic device;
performing, by the computer, a phonetic and acoustic comparison of the first utterance and the second utterance based upon the first set of audio features and the second set of audio features; and
determining, by the computer based upon the phonetic and acoustic comparison, at least a partial keyword sequence match and a speaker match between the first utterance and the second utterance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising:
a non-transitory storage medium storing a plurality of computer program instructions; and
a processor electrically coupled to the non-transitory storage medium and configured to execute the computer program instructions to:
extract a first set of audio features from a first audio signal containing a first utterance according to a speaker diarization algorithm, the first audio signal received from an enrollee electronic device;
extract a second set of audio features from a second audio signal containing a second utterance according to the speaker diarization algorithm, the second audio signal received from a caller electronic device;
perform a phonetic and acoustic comparison of the first utterance and the second utterance based upon the first set of audio features and the second set of audio features; and
determine, based upon the phonetic and acoustic comparison, at least a partial keyword sequence match and a speaker match between the first utterance and the second utterance.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification