Applying neural network language models to weighted finite state transducers for automatic speech recognition

  • US 10,354,652 B2
  • Filed: 07/13/2018
  • Issued: 07/16/2019
  • Est. Priority Date: 12/02/2015
  • Status: Active Grant
First Claim

1. A non-transitory computer-readable medium having instructions stored thereon,

  • the instructions, when executed by one or more processors, cause the one or more processors to:

    receive speech input;

    determine, based on the speech input and a weighted finite state transducer (WFST), a first probability of a candidate word given one or more history candidate words;

    negate, using a negating finite state transducer (FST), the first probability of the candidate word given the one or more history candidate words;

    compose a virtual FST using a neural network language model and based on the WFST, wherein one or more virtual states of the virtual FST represent the candidate word;

    determine, using the virtual FST, a second probability of the candidate word given the one or more history candidate words;

    determine, based on the WFST and the second probability of the candidate word given the one or more history candidate words, text corresponding to the speech input;

    based on the determined text, perform one or more tasks to obtain a result; and

    cause the result to be presented in spoken or visual form.
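
One way to read the steps above is as language model rescoring on costs (negative log probabilities), where composing transducers adds their weights: the first probability from the WFST is cancelled by its negation, and the second probability from the neural network language model takes its place. The Python sketch below illustrates only that arithmetic and is not the patent's implementation. The probability tables, the `VirtualFST` class, `toy_nnlm_prob`, `rescore`, and the acoustic costs are all hypothetical stand-ins, and a lookup table plays the role of the neural network.

```python
import math

# Toy first-pass bigram probabilities P(word | previous word), standing in for
# the language model scores carried by the WFST. All values are made up.
NGRAM = {
    ("<s>", "recognize"): 0.6,
    ("<s>", "wreck"): 0.4,
    ("recognize", "speech"): 0.5,
    ("wreck", "a"): 0.9,
    ("a", "nice"): 0.5,
    ("nice", "beach"): 0.5,
}

# Toy probabilities P(word | full history), standing in for a neural network
# language model. A real model would compute these; here they are a table.
NNLM = {
    (("<s>",), "recognize"): 0.7,
    (("<s>", "recognize"), "speech"): 0.8,
    (("<s>",), "wreck"): 0.3,
    (("<s>", "wreck"), "a"): 0.5,
    (("<s>", "wreck", "a"), "nice"): 0.4,
    (("<s>", "wreck", "a", "nice"), "beach"): 0.3,
}


def ngram_cost(history, word):
    """First probability of the candidate word, expressed as a cost."""
    return -math.log(NGRAM[(history[-1], word)])


def negate(cost):
    """Negating step: flip the sign so adding it cancels the original cost."""
    return -cost


def toy_nnlm_prob(history, word):
    """Hypothetical neural LM interface: P(word | history)."""
    return NNLM[(history, word)]


class VirtualFST:
    """Creates states lazily, one per (history, candidate word) visited."""

    def __init__(self, lm_prob):
        self.lm_prob = lm_prob
        self.states = {}  # (history, word) -> cost

    def cost(self, history, word):
        key = (tuple(history), word)
        if key not in self.states:
            self.states[key] = -math.log(self.lm_prob(key[0], word))
        return self.states[key]


def rescore(hypotheses, virtual_fst):
    """Return the lowest-cost (words, cost) after swapping in the second LM."""
    best_words, best_cost = None, math.inf
    for words, acoustic_cost in hypotheses:
        history = ["<s>"]
        total = acoustic_cost
        for word in words:
            first = ngram_cost(history, word)
            total += first                            # first probability (WFST)
            total += negate(first)                    # cancelled by the negation
            total += virtual_fst.cost(history, word)  # second probability
            history.append(word)
        if total < best_cost:
            best_words, best_cost = words, total
    return best_words, best_cost


# Two candidate transcriptions with made-up acoustic costs.
hypotheses = [
    (["recognize", "speech"], 5.0),
    (["wreck", "a", "nice", "beach"], 3.5),
]
vfst = VirtualFST(toy_nnlm_prob)
best_words, best_cost = rescore(hypotheses, vfst)
print(" ".join(best_words), round(best_cost, 3))
```

Because the virtual states are only built for histories that actually occur in the candidate hypotheses, the sketch never enumerates the full space of word histories, which is the usual motivation for building such a transducer on demand.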
