Speech and noise models for speech recognition

  • US 8,666,740 B2
  • Filed: 06/22/2012
  • Issued: 03/04/2014
  • Est. Priority Date: 06/14/2010
  • Status: Active Grant
First Claim

1. A system comprising:

  • one or more processing devices; and

    one or more storage devices storing instructions that, when executed by the one or more processing devices, cause the system to:

    receive an audio signal generated by a device based on audio input from a user, the audio signal including background audio and one or more user utterances recorded by the device, the audio signal comprising a portion that includes background audio without utterances of the user;

    identify the user or the device based on an identifier for the user or the device;

    determine a location of the user when the one or more utterances are recorded;

    determine that a set of stored noise models does not include an adapted noise model that is adapted for the user or the device and that is associated with both (i) the determined location and (ii) the identifier for the user or the device;

    select a surrogate noise model in response to determining that the set of stored noise models does not include an adapted noise model that is adapted for the user or the device and that is associated with both (i) the determined location and (ii) the identifier for the user or the device;

    generate a filtered audio signal with reduced background audio compared to the received audio signal using the selected surrogate noise model;

    adapt the surrogate noise model based on the received audio signal to generate a first adapted noise model that models characteristics of background audio surrounding the user at the location; and

    store the first adapted noise model as one of a plurality of adapted noise models that are specific to the identified user or device, each of the plurality of adapted noise models being associated with a different corresponding location, each of the plurality of adapted noise models having been adapted based on audio recorded by the device at the corresponding location with which the adapted noise model is associated, each of the plurality of adapted noise models being stored in association with the identifier for the user or the device and with its corresponding location such that when utterances of the user are recorded by the device at the different corresponding locations, the adapted noise model associated with the location where the device recorded the utterances is used to recognize the utterances of the user.
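The control flow recited in claim 1 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `NoiseModel`, `NoiseModelStore`, `process`, the representation of a noise model as mean power per frequency bin, the use of simple spectral subtraction for filtering, and the interpolation `rate` used for adaptation. The claim itself does not specify any of these techniques; the sketch only shows the lookup by (identifier, location), the fallback to a surrogate model, and the storing of the adapted model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class NoiseModel:
    # Assumed representation: mean noise power per frequency bin.
    mean_power: List[float]


class NoiseModelStore:
    """Adapted noise models keyed by (user/device identifier, location)."""

    def __init__(self, surrogate: NoiseModel) -> None:
        self.surrogate = surrogate
        self._adapted: Dict[Tuple[str, str], NoiseModel] = {}

    def lookup(self, identifier: str, location: str) -> Optional[NoiseModel]:
        return self._adapted.get((identifier, location))

    def save(self, identifier: str, location: str, model: NoiseModel) -> None:
        self._adapted[(identifier, location)] = model


def adapt(model: NoiseModel, noise_frames: List[List[float]],
          rate: float = 0.5) -> NoiseModel:
    # Assumed adaptation rule: move each bin toward the average power
    # observed in the portion of the signal that contains no utterances.
    n = len(noise_frames)
    bins = len(model.mean_power)
    observed = [sum(frame[i] for frame in noise_frames) / n for i in range(bins)]
    return NoiseModel([(1 - rate) * m + rate * o
                       for m, o in zip(model.mean_power, observed)])


def filter_frame(frame: List[float], model: NoiseModel) -> List[float]:
    # Assumed filter: spectral subtraction, floored at zero.
    return [max(p - m, 0.0) for p, m in zip(frame, model.mean_power)]


def process(store: NoiseModelStore, identifier: str, location: str,
            speech_frames: List[List[float]],
            noise_frames: List[List[float]]) -> List[List[float]]:
    model = store.lookup(identifier, location)
    if model is None:
        # No adapted model for this identifier and location: use the
        # surrogate, then adapt it and store the result under that key.
        model = store.surrogate
        store.save(identifier, location, adapt(model, noise_frames))
    return [filter_frame(frame, model) for frame in speech_frames]
```

In this sketch, the first request from a given identifier at a given location is filtered with the surrogate model, and later requests with the same identifier and location are filtered with the stored adapted model.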
