Automatic collection of speaker name pronunciations

  • US 9,240,181 B2
  • Filed: 08/20/2013
  • Issued: 01/19/2016
  • Est. Priority Date: 08/20/2013
  • Status: Active Grant
First Claim

1. A method comprising:

    segmenting an audio stream into a plurality of time segments using speaker segmentation and recognition (SSR), each of the plurality of time segments corresponding to a name label, to produce an SSR transcript;

    transcribing the audio stream into a plurality of word regions using automatic speech recognition (ASR), each of the plurality of word regions having an associated accuracy confidence, to produce an ASR transcript;

    identifying a plurality of low confidence regions from the plurality of word regions, each of the low confidence regions having an associated accuracy confidence below a threshold;

    identifying at least one likely name region from the ASR transcript using named entity recognition (NER) rules, wherein the NER rules analyze word regions to identify the at least one likely name region, and the NER rules associate each of the at least one likely name regions with a name label from the SSR transcript corresponding to one of a current, previous, or subsequent time segment in the SSR transcript;

    filtering the at least one likely name region with the plurality of low confidence regions to determine at least one low confidence name region;

    selecting all of the low confidence name regions associated with a selected name label, the selected name label being selected from the name labels in the SSR transcript;

    decoding a phoneme transcript from the audio stream for each of the selected likely name regions using a phoneme decoder; and

    correlating the selected name label with all of the phoneme transcripts for the selected likely name regions.
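
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` and `WordRegion` types, the function names, the "my name is" cue rule (a toy stand-in for the NER rules, covering only the current-segment case), the 0.5 threshold, and the stub phoneme decoder are all assumptions made for illustration. The SSR and ASR transcripts are taken as already-computed inputs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Segment:
    """One SSR time segment with its speaker name label (hypothetical type)."""
    start: float
    end: float
    name_label: str


@dataclass(frozen=True)
class WordRegion:
    """One ASR word region with its accuracy confidence (hypothetical type)."""
    start: float
    end: float
    text: str
    confidence: float


def label_at(ssr: Sequence[Segment], t: float) -> str:
    """Return the name label of the SSR segment that contains time t."""
    for seg in ssr:
        if seg.start <= t < seg.end:
            return seg.name_label
    raise ValueError(f"no SSR segment covers t={t}")


def likely_name_regions(
    asr: Sequence[WordRegion],
    ssr: Sequence[Segment],
    cue: Tuple[str, ...] = ("my", "name", "is"),
) -> List[Tuple[WordRegion, str]]:
    """Toy NER rule (assumption): the word right after the cue phrase is a
    likely name, attributed to the speaker of the current time segment."""
    hits = []
    n = len(cue)
    for i in range(n, len(asr)):
        window = tuple(w.text.lower() for w in asr[i - n:i])
        if window == cue:
            hits.append((asr[i], label_at(ssr, asr[i].start)))
    return hits


def collect_pronunciations(
    asr: Sequence[WordRegion],
    ssr: Sequence[Segment],
    threshold: float,
    decode_phonemes: Callable[[float, float], str],
) -> Dict[str, List[str]]:
    """Filter likely name regions by low confidence, group them by name
    label, and correlate each label with the decoded phoneme transcripts."""
    low_conf = {w for w in asr if w.confidence < threshold}
    by_label: Dict[str, List[str]] = defaultdict(list)
    for region, label in likely_name_regions(asr, ssr):
        if region in low_conf:  # keep only low confidence name regions
            by_label[label].append(decode_phonemes(region.start, region.end))
    return dict(by_label)


def fake_decoder(start: float, end: float) -> str:
    """Stub standing in for a real phoneme decoder run on the audio span."""
    return f"<phonemes {start:.1f}-{end:.1f}>"


# Made-up example transcripts: the first speaker's name is misrecognized
# with low confidence, the second speaker's name is recognized confidently.
ssr = [Segment(0.0, 3.0, "Dan Tan"), Segment(3.0, 6.0, "Jane Doe")]
asr = [
    WordRegion(0.0, 0.3, "hi", 0.95),
    WordRegion(0.3, 0.5, "my", 0.97),
    WordRegion(0.5, 0.8, "name", 0.96),
    WordRegion(0.8, 1.0, "is", 0.98),
    WordRegion(1.0, 1.6, "dentin", 0.31),
    WordRegion(3.0, 3.3, "and", 0.94),
    WordRegion(3.3, 3.6, "my", 0.96),
    WordRegion(3.6, 3.9, "name", 0.97),
    WordRegion(3.9, 4.1, "is", 0.98),
    WordRegion(4.1, 4.6, "jane", 0.92),
]
result = collect_pronunciations(asr, ssr, threshold=0.5, decode_phonemes=fake_decoder)
```

In this sketch only the low confidence region following the first speaker's cue phrase is passed to the decoder, so `result` maps the label "Dan Tan" to a single phoneme transcript. A fuller version would also need rules that attach a name region to the previous or subsequent time segment, as the claim allows.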
