Integrated local and cloud based speech recognition

  • US 8,660,847 B2
  • Filed: 09/02/2011
  • Issued: 02/25/2014
  • Est. Priority Date: 09/02/2011
  • Status: Active Grant
First Claim

1. A method for performing speech recognition, comprising:

    acquiring a plurality of audio signals from a plurality of microphones, each of the plurality of audio signals is associated with a different microphone of the plurality of microphones, the plurality of audio signals is associated with a first environment;

    determining one or more directions within the first environment, the first environment includes one or more persons, each of the one or more directions is associated with a different person of the one or more persons;

    acquiring one or more images of the first environment using a capture device, the plurality of audio signals are associated with the first environment during a first period of time, the one or more images are associated with the first environment during the first period of time, the one or more images include one or more depth images, the determining one or more directions includes performing skeletal tracking based on the one or more images for each of the one or more persons;

    generating one or more audio recordings based on the plurality of audio signals, a first audio recording of the one or more audio recordings is generated by applying audio signal processing techniques to the plurality of audio signals such that sounds originating from a first direction of the one or more directions are amplified while other sounds originating from one or more other directions are attenuated;

    performing local speech recognition on each of the one or more audio recordings, the performing local speech recognition includes detecting a first utterance and detecting one or more keywords within the first utterance, the first utterance is detected by applying one or more speech detection techniques to the first audio recording of the one or more audio recordings;

    transmitting the first utterance and the one or more keywords to a second computing device, the second computing device performs a speech recognition technique on the first utterance, the speech recognition technique detects one or more words within the first utterance; and

    receiving a first response from the second computing device based on the first utterance.
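
The "generating one or more audio recordings" step of claim 1 amplifies sound arriving from one direction and attenuates sound from other directions. The claim does not name a specific signal processing technique. One common way to get this effect is delay-and-sum beamforming, sketched below under stated assumptions: a linear microphone array, a far-field source, whole-sample delays, and a fixed speed of sound. The function names, the array geometry, and the constants are hypothetical illustrations and are not taken from the patent.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: speed of sound in room-temperature air


def steering_delays(mic_x_m, angle_deg, sample_rate_hz):
    """Return per-microphone arrival delays, in whole samples.

    mic_x_m: microphone positions along a line, in metres (hypothetical layout).
    angle_deg: source direction measured from broadside (0 = straight ahead,
               positive angles toward increasing x).
    """
    sin_a = math.sin(math.radians(angle_deg))
    # A far-field wavefront reaches microphones nearer the source first,
    # so the arrival offset falls as x * sin(angle) grows.
    raw = [-(x * sin_a / SPEED_OF_SOUND_M_S) * sample_rate_hz for x in mic_x_m]
    earliest = min(raw)
    # Shift so the earliest microphone has zero delay, then round to samples.
    return [round(r - earliest) for r in raw]


def delay_and_sum(signals, delays):
    """Align each channel by its delay and average the aligned samples.

    Sound from the steered direction adds coherently; sound from other
    directions is misaligned across channels and is attenuated by averaging.
    """
    usable = min(len(s) for s in signals) - max(delays)
    return [
        sum(sig[n + d] for sig, d in zip(signals, delays)) / len(signals)
        for n in range(usable)
    ]
```

In a pipeline like the one claimed, the steering angle would come from the direction determined for each tracked person, and each beamformed output would serve as one audio recording passed on to local speech recognition. A production system would likely use fractional delays or frequency-domain weights rather than whole-sample shifts.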
