Processing multi-channel audio waveforms

  • US 9,697,826 B2
  • Filed: 07/08/2016
  • Issued: 07/04/2017
  • Est. Priority Date: 03/27/2015
  • Status: Active Grant
First Claim

1. A system comprising:

  • one or more computers and one or more data storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:

    receiving multiple channels of audio data corresponding to an utterance;

    convolving each of multiple filters, in a time domain, with each of the multiple channels of audio waveform data to generate convolution outputs, wherein the multiple filters have parameters that have been learned during a training process that jointly trains the multiple filters and trains a deep neural network as an acoustic model;

    combining, for each of the multiple filters, the convolution outputs for the filter for the multiple channels of audio waveform data;

    inputting the combined convolution outputs to the deep neural network trained jointly with the multiple filters; and

    providing a transcription for the utterance that is determined based at least on output that the deep neural network provides in response to receiving the combined convolution outputs.
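
The convolving and combining steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that each filter holds a separate set of taps for each channel and that "combining" means summing a filter's per-channel outputs; the claim itself only says "combining". The names `convolve_valid` and `filter_and_combine` are hypothetical, the filter taps are hand-picked rather than learned, and the jointly trained deep neural network is omitted.

```python
from typing import List, Sequence


def convolve_valid(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Time-domain convolution, keeping only positions where the taps fully overlap the signal."""
    n = len(taps)
    if len(signal) < n:
        raise ValueError("signal must be at least as long as the filter")
    return [
        sum(signal[t + n - 1 - k] * taps[k] for k in range(n))
        for t in range(len(signal) - n + 1)
    ]


def filter_and_combine(
    channels: Sequence[Sequence[float]],
    filters: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Convolve every filter with every channel, then combine per filter.

    channels[c]    : waveform samples for channel c
    filters[p][c]  : taps of filter p for channel c (assumed layout)
    Returns one combined output sequence per filter.
    """
    combined = []
    for per_channel_taps in filters:
        if len(per_channel_taps) != len(channels):
            raise ValueError("each filter needs one set of taps per channel")
        # Convolve this filter with each channel separately.
        outputs = [
            convolve_valid(channel, taps)
            for channel, taps in zip(channels, per_channel_taps)
        ]
        # Assumption: combine by summing across channels, sample by sample.
        combined.append([sum(samples) for samples in zip(*outputs)])
    return combined


# Two channels of four samples each, and one filter with two taps per channel.
channels = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
filters = [[[1.0, 0.0], [0.0, 1.0]]]
features = filter_and_combine(channels, filters)
```

In a full system of the kind the claim describes, the combined outputs would be passed to the acoustic model, and the filter taps would be learned together with that model rather than fixed by hand.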
