
System and method for transcription of spoken words using multilingual mismatched crowd unfamiliar with a spoken language

  • US 10,269,353 B2
  • Filed: 03/31/2017
  • Issued: 04/23/2019
  • Est. Priority Date: 08/30/2016
  • Status: Active Grant
First Claim

1. A computer implemented method for transcribing one or more spoken word utterances of a source language using a multilingual mismatched crowd unfamiliar with the source language, the method comprises:

    collecting, at a word transcription table, a plurality of multi-scripted noisy transcriptions of the spoken word obtained from a plurality of workers of the multilingual mismatched crowd, wherein the word transcription table is configured to store transcription responses of the spoken word segments presented to the plurality of workers, audio chunk id, each of the plurality of workers id, and the plurality of workers transcription text;

    mapping each of the collected plurality of multi-scripted transcriptions to a phoneme sequence in the source language using script specific graphemes to phoneme model;

    building worker specific insertion-deletion-substitution (IDS) channel model, multi-scripted transcription script specific IDS channel model and a global IDS channel model from the multi-scripted transcriptions;

    filtering out a set of workers of the plurality of workers based on the reputation of the workers, estimated by simulating IDS channel for worker specific on dictionary words using worker reputation module;

    allocating the transcription tasks to the set of workers such that required number of transcriptions per word are minimized; and

    decoding, at a transcription decoding module, the plurality of multi-scripted transcriptions are combined to decode the transcription in source script, wherein the decoding comprises steps of;

    finding likelihood probability of the mapped phoneme sequences of the multi-scripted mismatched crowd transcriptions with each of the dictionary words phoneme sequence using insertion-deletion-substitution channel parameters and voting the dictionary word that maximizes above likelihood; and

    determining word belief by taking ratio of the likelihood probability of the mapped phoneme sequences of transcriptions given current estimate of word to sum of the likelihood probabilities of mapped phoneme sequences of the transcriptions given the phoneme sequence of each dictionary word.
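
The two decoding sub-steps of the claim (choosing the dictionary word that maximizes the likelihood, then computing the word belief as a ratio of likelihoods) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the `IDSParams` parameterization, the uniform insertion and substitution distributions, the use of a single global channel instead of worker-specific or script-specific ones, the treatment of workers as independent given the word, and the toy dictionary entries and phoneme labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IDSParams:
    """Hypothetical global insertion-deletion-substitution channel parameters."""
    p_ins: float = 0.05        # probability of inserting a spurious phoneme
    p_del: float = 0.05        # probability of deleting a source phoneme
    p_sub: float = 0.10        # probability of substituting a source phoneme
    alphabet_size: int = 40    # assumed size of the phoneme inventory


def ids_likelihood(source, observed, params):
    """P(observed | source) under a simple IDS channel, by forward dynamic programming.

    f[i][j] is the probability that the channel has consumed i source
    phonemes and emitted the first j observed phonemes.
    """
    n, m = len(source), len(observed)
    a = params.alphabet_size
    consume = 1.0 - params.p_ins                 # channel reads the next source phoneme
    p_copy = 1.0 - params.p_del - params.p_sub   # phoneme passes through unchanged
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    f[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j > 0:  # insertion of a uniformly chosen phoneme
                total += f[i][j - 1] * params.p_ins / a
            if i > 0:  # deletion of source[i - 1]
                total += f[i - 1][j] * consume * params.p_del
            if i > 0 and j > 0:  # copy or substitution
                if source[i - 1] == observed[j - 1]:
                    emit = p_copy
                else:
                    emit = params.p_sub / (a - 1)
                total += f[i - 1][j - 1] * consume * emit
            f[i][j] = total
    return f[n][m] * consume  # channel stops without a trailing insertion


def decode_word(transcriptions, dictionary, params):
    """Return (best_word, belief) for one spoken word.

    transcriptions: phoneme sequences already mapped from the workers' scripts.
    dictionary: mapping from dictionary word to its phoneme sequence.
    """
    likelihoods = {}
    for word, phonemes in dictionary.items():
        joint = 1.0
        for obs in transcriptions:  # workers assumed independent given the word
            joint *= ids_likelihood(phonemes, obs, params)
        likelihoods[word] = joint
    best = max(likelihoods, key=likelihoods.get)
    total = sum(likelihoods.values())
    belief = likelihoods[best] / total if total > 0 else 0.0
    return best, belief


# Toy data: three noisy worker transcriptions of the same spoken word.
toy_dictionary = {
    "kitab": ("k", "i", "t", "a", "b"),
    "katib": ("k", "a", "t", "i", "b"),
    "kabab": ("k", "a", "b", "a", "b"),
}
toy_transcriptions = [
    ("k", "i", "t", "a", "b"),
    ("k", "i", "t", "a", "p"),
    ("g", "i", "t", "a", "b"),
]
best_word, word_belief = decode_word(toy_transcriptions, toy_dictionary, IDSParams())
```

In this sketch the belief is the likelihood of the transcriptions given the chosen word divided by the sum of the same likelihood over every dictionary word, so it always lies between 0 and 1. A system could compare it against a threshold to decide whether more transcriptions of the word are needed, although that stopping rule is an assumption here and is not stated in the claim.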
