Diarization using linguistic labeling

  • US 10,134,401 B2
  • Filed: 11/20/2013
  • Issued: 11/20/2018
  • Est. Priority Date: 11/21/2012
  • Status: Active Grant
First Claim

1. A method of diarization, the method comprising:

  • receiving a set of textual transcripts from a transcription server and a set of audio files associated with the set of textual transcripts from an audio database server;

    performing a blind diarization on the set of textual transcripts and the set of audio files to segment and cluster the textual transcripts into a plurality of textual speaker clusters, wherein the number of textual speaker clusters is at least equal to a number of speakers in the textual transcript;

    automatedly applying at least one heuristic to the textual speaker clusters with a processor to select textual speaker clusters likely to be associated with an identified group of speakers;

    analyzing the selected textual speaker clusters with the processor to create at least one linguistic model;

    applying the linguistic model to transcripted audio data with the processor to label a portion of the transcripted audio data as having been spoken by the identified group of speakers;

    determining word use frequencies for words in the selected transcripts with the processor, wherein the word use frequencies are used to create the at least one linguistic model;

    determining word use frequencies for words in the diarized portions of non-selected transcripts with the processor; and

    comparing the word use frequencies for words in the selected transcripts to the word use frequencies for words in the non-selected transcripts with the processor to identify a plurality of discriminating words for use in the at least one linguistic model, wherein the diarized textual transcripts are associated in groups of at least two, wherein the group of at least two includes a textual transcript originating from the identified group of speakers and at least one textual transcript originating from an other speaker, and wherein the non-selected transcripts are assumed to have originated from an other speaker, further wherein the at least one heuristic is a detection of a script associated with the identified group of speakers, further wherein a plurality of scripts associated with the identified group of speakers is compared to each of the diarized transcripts and a correlation score between each of the diarized transcripts and the plurality of scripts is calculated and further wherein the diarized transcript in each group with the greatest correlation score is selected as being the transcript likely to be associated with the identified group of speakers; and

    saving the at least one linguistic model to a linguistic database server and associating it with the labeled speaker;

    with the processor, applying the saved at least one linguistic model from the linguistic database server to a new audio file transcript from an audio source to perform diarization of the new audio file by blind diarizing the new audio file, comparing each new textual speaker cluster to the at least one linguistic model, and labeling each textual speaker cluster as belonging to a customer service agent or belonging to a customer.
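
The text-side steps of the claim (script-based selection, word use frequency comparison, and labeling of new clusters) can be illustrated with a minimal Python sketch. This is not the patented implementation. The claim does not define the correlation score, the frequency comparison, or the labeling rule, so the word-overlap score, the smoothed frequency ratio, the thresholds, and every function name below are assumptions made for illustration. Blind diarization of the audio is assumed to have already produced the per-speaker text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def script_correlation(transcript, scripts):
    """Assumed score: best fraction of a script's distinct words found in the transcript."""
    words = set(tokenize(transcript))
    best = 0.0
    for script in scripts:
        script_words = set(tokenize(script))
        if script_words:
            best = max(best, len(script_words & words) / len(script_words))
    return best


def select_agent_side(group, scripts):
    """Return the index of the transcript in a group with the greatest correlation score."""
    scores = [script_correlation(t, scripts) for t in group]
    return scores.index(max(scores))


def relative_frequencies(texts):
    """Word use frequencies over a collection of transcripts."""
    counts = Counter(w for t in texts for w in tokenize(t))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def discriminating_words(selected, non_selected, min_ratio=2.0, smoothing=1e-3):
    """Words markedly more frequent in selected than in non-selected transcripts.

    The smoothed ratio test and its defaults are hypothetical choices.
    """
    sel = relative_frequencies(selected)
    non = relative_frequencies(non_selected)
    return {
        w for w, f in sel.items()
        if (f + smoothing) / (non.get(w, 0.0) + smoothing) >= min_ratio
    }


def label_cluster(cluster_text, model_words, threshold=0.1):
    """Label a new textual speaker cluster using the saved word set (assumed rule)."""
    tokens = tokenize(cluster_text)
    if not tokens:
        return "customer"
    hits = sum(1 for w in tokens if w in model_words)
    return "agent" if hits / len(tokens) >= threshold else "customer"
```

In this sketch, each group holds the diarized transcripts of one call. The transcript chosen by select_agent_side goes into the selected set, the rest go into the non-selected set, and the word set returned by discriminating_words stands in for the saved linguistic model.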

  • 2 Assignments