Training acoustic models using connectionist temporal classification

  • US 10,229,672 B1
  • Filed: 01/03/2017
  • Issued: 03/12/2019
  • Est. Priority Date: 12/31/2015
  • Status: Active Grant
First Claim

1. A method performed by one or more computers of a speech recognition system, the method comprising:

    training, by the one or more computers of the speech recognition system, a first connectionist temporal classification (CTC) acoustic model on first training data to generate, as unmodified outputs, second training data of context-dependent state inventory from approximate phonetic alignments, the first training data comprising context-independent phones generated without using any previously determined phonetic alignments;

    training, by the one or more computers of the speech recognition system, a second CTC acoustic model on the second training data to generate outputs corresponding to one or more context-dependent states;

    accessing, by the one or more computers of the speech recognition system, the second CTC acoustic model;

    receiving, by the one or more computers of the speech recognition system, audio data for a portion of an utterance;

    providing, by the one or more computers of the speech recognition system, input data corresponding to the received audio data as input to the accessed second CTC acoustic model that has been trained on the second training data;

    generating, by the one or more computers of the speech recognition system, data indicating a transcription for the utterance based on output that the accessed second CTC acoustic model produced in response to the input data corresponding to the received audio data; and

    providing, by the one or more computers of the speech recognition system, the data indicating the transcription as output of the automated speech recognition system.
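
The data flow recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: no model is actually trained, the per-frame scores are made-up numbers standing in for the first CTC model's outputs, and the names `greedy_frame_labels`, `ctc_collapse`, `context_dependent_targets`, and `center_phones` are hypothetical. It also assumes greedy best-path decoding and triphone-style labels (`left-center+right`) as the context-dependent inventory; the claim itself specifies neither.

```python
from itertools import groupby

BLANK = "_"  # assumed symbol for the CTC blank label


def greedy_frame_labels(scores, symbols):
    """Pick the highest-scoring symbol for every frame (best-path decoding)."""
    return [symbols[max(range(len(row)), key=row.__getitem__)] for row in scores]


def ctc_collapse(frame_labels, blank=BLANK):
    """Standard CTC post-processing: merge repeated labels, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def context_dependent_targets(phones, boundary="sil"):
    """Turn a context-independent phone sequence into triphone-style labels.

    This stands in for building the second training data; the label format
    is an assumption made for illustration only.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def center_phones(cd_labels):
    """Recover the center phone from each 'left-center+right' label."""
    return [label.split("-", 1)[1].split("+", 1)[0] for label in cd_labels]


# Made-up per-frame scores, standing in for the first (context-independent)
# CTC model's outputs over the symbols below.
symbols = [BLANK, "k", "ae", "t"]
scores = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.20, 0.70, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.10],
    [0.60, 0.10, 0.20, 0.10],
    [0.10, 0.10, 0.10, 0.70],
]

frames = greedy_frame_labels(scores, symbols)    # approximate frame-level alignment
phones = ctc_collapse(frames)                    # context-independent phone sequence
cd_targets = context_dependent_targets(phones)   # targets for the second CTC model

print(frames)      # ['_', 'k', 'k', 'ae', '_', 't']
print(phones)      # ['k', 'ae', 't']
print(cd_targets)  # ['sil-k+ae', 'k-ae+t', 'ae-t+sil']
```

In a real system the second CTC model would be trained on targets such as `cd_targets`, and at recognition time its per-frame outputs would go through the same collapse step before being mapped to words by a decoder.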
