
USER SPECIFIED KEYWORD SPOTTING USING LONG SHORT TERM MEMORY NEURAL NETWORK FEATURE EXTRACTOR

  • US 20160180838A1
  • Filed: 12/22/2014
  • Published: 06/23/2016
  • Est. Priority Date: 12/22/2014
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving, by a device for each of multiple variable length enrollment audio signals each encoding a respective spoken utterance of an enrollment phrase, a respective plurality of enrollment feature vectors that represent features of the respective variable length enrollment audio signal, wherein when the device determines that another audio signal encodes another spoken utterance of the enrollment phrase, the device performs a particular action assigned to the enrollment phrase; and

    for each of the multiple variable length enrollment audio signals:

    processing each of the plurality of enrollment feature vectors for the respective variable length enrollment audio signal using a long short term memory (LSTM) neural network to generate a respective enrollment LSTM output vector for each enrollment feature vector; and

    generating, for the respective variable length enrollment audio signal, a template fixed length representation for use in determining whether the other audio signal encodes another spoken utterance of the enrollment phrase by combining at most a quantity k of the enrollment LSTM output vectors for the enrollment audio signal, wherein a predetermined length of each of the template fixed length representations is the same.
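The enrollment steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the `TinyLSTM` class, its random untrained weights, the 40-dimensional feature vectors, the choice of the last k output vectors, concatenation as the combining step, and zero-padding for utterances shorter than k frames are all assumptions made for illustration. The claim itself only requires combining at most k LSTM output vectors into a representation of one predetermined length.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTM:
    """Hypothetical single-layer LSTM with random, untrained weights."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix covering the input, forget, output,
        # and candidate gates.
        self.W = rng.standard_normal((4 * hidden_dim, input_dim + hidden_dim)) * 0.1
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def run(self, features):
        """Return one output vector per feature vector, shape (T, hidden_dim)."""
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        outputs = []
        for x in features:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, o, g = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            outputs.append(h)
        return np.array(outputs).reshape(-1, self.hidden_dim)


def fixed_length_template(lstm_outputs, k):
    """Combine at most k output vectors into a vector of length k * hidden_dim.

    Assumption: the last k vectors are concatenated, and shorter utterances
    are zero-padded at the front so every template has the same length.
    """
    hidden_dim = lstm_outputs.shape[1]
    selected = lstm_outputs[-k:]
    padded = np.zeros((k, hidden_dim))
    padded[k - len(selected):] = selected
    return padded.reshape(-1)


# Three enrollment utterances of different lengths (3, 12, and 25 frames),
# each represented by made-up 40-dimensional feature vectors.
lstm = TinyLSTM(input_dim=40, hidden_dim=8)
rng = np.random.default_rng(1)
enrollments = [rng.standard_normal((T, 40)) for T in (3, 12, 25)]
templates = [fixed_length_template(lstm.run(e), k=5) for e in enrollments]
```

Under these assumptions every template has length k × hidden_dim regardless of how many frames the utterance contained, which is the property the final "wherein" clause describes. A later audio signal could then be reduced to the same fixed length and compared against the templates with any vector similarity measure.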
