Hybridized client-server speech recognition

  • US 9,674,328 B2
  • Filed: 02/22/2012
  • Issued: 06/06/2017
  • Est. Priority Date: 02/22/2011
  • Status: Active Grant
First Claim

1. A computer program product comprising a non-transitory computer-readable storage medium storing instructions that, when executed by a computing system comprising at least one programmable processor, cause the computing system to perform operations comprising:

    receiving, at a recipient computing device, a speech utterance to be processed by speech recognition;

    determining an amount of an available bandwidth between the recipient computing device and a separate computing device;

    segmenting, upon determining that the available bandwidth is sufficient, the speech utterance into two or more speech utterance segments, the segmenting comprising initially analyzing the speech utterance by identifying features of the speech utterance that can be more efficiently processed by the separate computing device than the recipient computing device, wherein initially analyzing comprises applying a dynamically adaptable acoustic model implemented at the recipient computing device, with the dynamically adaptable acoustic model adjusted based on locally available data at the recipient computing device including a user location and time, to determine a confidence score, and an audio quality metric for the two or more speech utterance segments;

    dynamically determining a confidence threshold value and an audio quality threshold value based on environmental conditions at which the recipient computing device is located, the environmental conditions comprising one or more of:

    a type of environment in which the recipient computing device is located, availability of noise cancelling devices at the recipient computing device, and number of microphones used by the recipient computing device;

    assigning each of the two or more speech utterance segments to one or more of a plurality of available speech recognizers, the assigning comprising:

    designating a first segment of the two or more speech utterance segments for processing by a first speech recognizer of the plurality of available speech recognizers that is implemented on the separate computing device than the recipient computing device, wherein designating the first segment is performed when at least one of the confidence score and the audio quality metric for the first segment, determined using the dynamically adaptable acoustic model adjusted based on the locally available data including the user location and the time, are below the respective confidence threshold value and the audio quality threshold value, and

    designating a second segment of the two or more speech utterance segments for processing by a second speech recognizer of the plurality of available speech recognizers that is implemented on the recipient computing device when another confidence score and another audio quality metric for the second segment, determined using the dynamically adaptable acoustic model adjusted based on the locally available data including the user location and the time, are above the respective confidence threshold value and the audio quality threshold value,

    wherein the identifying of the features of the speech utterance comprising determining processing speeds associated with the separate computing device and the recipient computing device, the available bandwidth, and a presence of a word or phrase capable of being efficiently modeled by a context-free grammar at the recipient computing device;

    sending the first segment from the recipient computing device to the separate computing device for processing;

    receiving first segment processing results back from the separate computing device, the sending and the receiving occurring via a data network;

    processing the second segment at the recipient computing device to generate second segment processing results; and

    returning a completed speech recognition result assembled from the first segment processing results and the second segment processing results.
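
The threshold and assignment steps recited above can be illustrated with a minimal Python sketch. It is not taken from the patent: every name (`Environment`, `Segment`, `thresholds_for`, `route`, `recognize`), the environment labels, and all numeric values are hypothetical. Treating a score that exactly equals its threshold as "not above" is also an assumption, since the claim does not address ties. Bandwidth measurement, segmentation, the adaptable acoustic model, and the network transfer are omitted; the two recognizers are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Environment:
    kind: str               # hypothetical labels, e.g. "quiet_room", "car", "street"
    noise_cancelling: bool  # whether noise-cancelling hardware is available
    microphone_count: int   # number of microphones in use


@dataclass
class Segment:
    audio: bytes
    confidence: float       # assumed output of the local acoustic model
    audio_quality: float    # assumed output of the local acoustic model


Recognizer = Callable[[bytes], str]


def thresholds_for(env: Environment) -> Tuple[float, float]:
    """Return (confidence_threshold, quality_threshold).

    The base values and adjustments are illustrative placeholders only.
    """
    confidence, quality = 0.70, 0.60
    if env.kind in {"car", "street"}:   # noisier setting: demand more
        confidence += 0.10
        quality += 0.10
    if env.noise_cancelling:            # cleaner capture: relax quality bar
        quality -= 0.05
    if env.microphone_count > 1:
        quality -= 0.05
    return confidence, quality


def route(segment: Segment, conf_thr: float, qual_thr: float) -> str:
    """Pick "local" only when both scores are strictly above their thresholds."""
    if segment.confidence > conf_thr and segment.audio_quality > qual_thr:
        return "local"
    return "remote"  # at least one score is not above its threshold


def recognize(segments: List[Segment], env: Environment,
              local: Recognizer, remote: Recognizer) -> str:
    """Route each segment, run the chosen recognizer, join results in order."""
    conf_thr, qual_thr = thresholds_for(env)
    parts = []
    for seg in segments:
        recognizer = local if route(seg, conf_thr, qual_thr) == "local" else remote
        parts.append(recognizer(seg.audio))
    return " ".join(parts)
```

In this sketch the same segment can be routed differently depending on the environment, which is the effect of deriving the thresholds dynamically rather than fixing them in advance.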
