Providing an indication of the suitability of speech recognition

US 10,453,443 B2
Filed: 08/22/2018
Issued: 10/22/2019
Est. Priority Date: 09/30/2014
Status: Active Grant

Abstract

This relates to providing an indication of the suitability of an acoustic environment for performing speech recognition. One process can include receiving an audio input and determining a speech recognition suitability based on the audio input. The speech recognition suitability can include a numerical, textual, graphical, or other representation of the suitability of an acoustic environment for performing speech recognition. The process can further include displaying a visual representation of the speech recognition suitability to indicate the likelihood that a spoken user input will be interpreted correctly. This allows a user to determine whether to proceed with the performance of a speech recognition process, or to move to a different location having a better acoustic environment before performing the speech recognition process. In some examples, the user device can disable operation of a speech recognition process in response to determining that the speech recognition suitability is below a threshold suitability.
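
The decision described in the abstract can be illustrated with a minimal sketch. It assumes the suitability is a number in [0, 1]; the threshold of 0.4, the 0.7 cut-off, the label names, and the `SuitabilityIndication` type are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

# Hypothetical threshold; the patent text does not give a value.
THRESHOLD = 0.4


@dataclass
class SuitabilityIndication:
    value: float               # numerical representation, assumed to lie in [0, 1]
    label: str                 # textual representation
    recognition_enabled: bool  # False when the device disables speech recognition


def indicate_suitability(value: float, threshold: float = THRESHOLD) -> SuitabilityIndication:
    """Map a suitability value to a textual label and an enable/disable decision."""
    if value < threshold:
        label = "poor"
    elif value < 0.7:  # assumed cut-off between "fair" and "good"
        label = "fair"
    else:
        label = "good"
    return SuitabilityIndication(value, label, recognition_enabled=value >= threshold)
```

A caller could show `label` to the user and consult `recognition_enabled` before starting a speech recognition process.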

54 Claims

1. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to:
- receive speech input from a user;
  
  determine whether the speech input includes a spoken trigger;
  
  in response to determining that the speech input includes a spoken trigger, obtain audio input from an acoustic environment;
  
  while producing speech recognition results by performing speech recognition on the audio input, determine a speech recognition suitability value based on the audio input; and
  
  in accordance with a determination that the speech recognition suitability value does not satisfy a predetermined criterion, provide an output to indicate that the acoustic environment is not suitable for performing speech recognition.
  - 2. The storage medium of claim 1, wherein the output includes a haptic output.
  - 3. The storage medium of claim 1, wherein the output includes an audio output indicating that the acoustic environment is not suitable for performing speech recognition.
  - 4. The storage medium of claim 1, wherein the instructions further cause the electronic device to:
    - determine whether an option for determining speech recognition suitability is selected; and
      
      in accordance with a determination that the option for determining speech recognition suitability is selected, determine the speech recognition suitability value based on the audio input.
  - 5. The storage medium of claim 1, wherein determining the speech recognition suitability value based on the audio input comprises:
    - determining one or more characteristics of the acoustic environment based on the audio input; and
      
      determining the speech recognition suitability value based on the one or more characteristics of the acoustic environment.
  - 6. The storage medium of claim 5, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a first frequency band of the acoustic environment.
  - 7. The storage medium of claim 6, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the first frequency band.
  - 8. The storage medium of claim 6, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a second frequency band of the acoustic environment.
  - 9. The storage medium of claim 8, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the second frequency band.
  - 10. The storage medium of claim 5, wherein the one or more characteristics of the acoustic environment comprise a number of transient noises detected in a buffer comprising previously recorded audio of the acoustic environment.
  - 11. The storage medium of claim 5, wherein determining the speech recognition suitability value comprises:
    - determining a speech recognition suitability vector based on the audio input, wherein the speech recognition suitability vector comprises one or more elements that represent the one or more characteristics of the acoustic environment; and
      
      using a neural network to determine the speech recognition suitability value based on the speech recognition suitability vector.
  - 12. The storage medium of claim 1, wherein the instructions further cause the electronic device to:
    - display an icon associated with speech recognition;
      
      receive a user selection of the icon associated with speech recognition;
      
      in accordance with a determination that the speech recognition suitability value is not less than a threshold value, perform speech recognition on an audio input received subsequent to receiving the user selection; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, forgo the performance of speech recognition on the audio input received subsequent to receiving the user selection.
  - 13. The storage medium of claim 1, wherein the output includes a visual representation of the speech recognition suitability value.
  - 14. The storage medium of claim 13, wherein the visual representation comprises one or more bars, and wherein a value of the speech recognition suitability value is represented by a number of the one or more bars.
  - 15. The storage medium of claim 13, wherein the visual representation comprises an icon, and wherein the speech recognition suitability value is represented by a color of the icon.
  - 16. The storage medium of claim 15, wherein the instructions further cause the electronic device to:
    - determine whether the speech recognition suitability value is less than a threshold value;
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, display the icon in a grayed out state; and
      
      in accordance with a determination that the speech recognition suitability value is not less than the threshold value, display the icon in a non-grayed out state.
  - 17. The storage medium of claim 1, wherein the visual representation comprises a textual representation of the speech recognition suitability value.
  - 18. The storage medium of claim 1, wherein the instructions further cause the electronic device to:
    - determine whether the speech recognition suitability value is less than a threshold value; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, output a message indicating a low suitability of the acoustic environment for speech recognition.
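
Claims 5 through 11 describe deriving characteristics of the acoustic environment (per-band signal-to-noise ratios, a count of transient noises), collecting them into a suitability vector, and scoring that vector with a neural network. The following is a rough sketch under stated assumptions, not the patented implementation: all function names, the energy-jump transient test, and the weights are hypothetical, and a single sigmoid unit stands in for the neural network. Noise-type detection (claims 7 and 9) is omitted.

```python
import math


def band_snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels for one frequency band."""
    return 10.0 * math.log10(signal_power / noise_power)


def count_transients(frame_energies, ratio=4.0):
    """Count frames whose energy exceeds `ratio` times the previous frame's energy."""
    return sum(
        1
        for prev, cur in zip(frame_energies, frame_energies[1:])
        if prev > 0 and cur / prev > ratio
    )


def suitability_vector(band_powers, frame_energies):
    """Build the vector: one SNR per band, then the transient count.

    band_powers is a list of (signal_power, noise_power) pairs, one per band.
    frame_energies is a buffer of per-frame energies from previously recorded audio.
    """
    snrs = [band_snr_db(s, n) for s, n in band_powers]
    return snrs + [float(count_transients(frame_energies))]


# Illustrative, untrained weights: higher SNR raises the score, transients lower it.
WEIGHTS = [0.1, 0.1, -0.5]
BIAS = -1.0


def suitability_value(vector, weights=WEIGHTS, bias=BIAS):
    """Single sigmoid unit standing in for the claimed neural network."""
    z = bias + sum(w * x for w, x in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))
```

With two bands, a vector built from a quiet environment scores higher than one built from a noisy environment with the same number of transients, because only the SNR terms differ.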

19. An electronic device, comprising:
- one or more processors;
  
  a memory; and
  
  one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
  
  receiving speech input from a user;
  
  determining whether the speech input includes a spoken trigger;
  
  in response to determining that the speech input includes a spoken trigger, obtaining audio input from an acoustic environment;
  
  while producing speech recognition results by performing speech recognition on the audio input, determining a speech recognition suitability value based on the audio input; and
  
  in accordance with a determination that the speech recognition suitability value does not satisfy a predetermined criterion, providing an output to indicate that the acoustic environment is not suitable for performing speech recognition.
  - 21. The electronic device of claim 19, wherein the output includes a haptic output.
  - 22. The electronic device of claim 19, wherein the output includes an audio output indicating that the acoustic environment is not suitable for performing speech recognition.
  - 23. The electronic device of claim 19, wherein the programs further include instructions for:
    - determining whether an option for determining speech recognition suitability is selected; and
      
      in accordance with a determination that the option for determining speech recognition suitability is selected, determining the speech recognition suitability value based on the audio input.
  - 24. The electronic device of claim 19, wherein determining the speech recognition suitability value based on the audio input comprises:
    - determining one or more characteristics of the acoustic environment based on the audio input; and
      
      determining the speech recognition suitability value based on the one or more characteristics of the acoustic environment.
  - 25. The electronic device of claim 24, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a first frequency band of the acoustic environment.
  - 26. The electronic device of claim 25, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the first frequency band.
  - 27. The electronic device of claim 25, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a second frequency band of the acoustic environment.
  - 28. The electronic device of claim 27, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the second frequency band.
  - 29. The electronic device of claim 24, wherein the one or more characteristics of the acoustic environment comprise a number of transient noises detected in a buffer comprising previously recorded audio of the acoustic environment.
  - 30. The electronic device of claim 24, wherein determining the speech recognition suitability value comprises:
    - determining a speech recognition suitability vector based on the audio input, wherein the speech recognition suitability vector comprises one or more elements that represent the one or more characteristics of the acoustic environment; and
      
      using a neural network to determine the speech recognition suitability value based on the speech recognition suitability vector.
  - 31. The electronic device of claim 19, wherein the programs further include instructions for:
    - displaying an icon associated with speech recognition;
      
      receiving a user selection of the icon associated with speech recognition;
      
      in accordance with a determination that the speech recognition suitability value is not less than a threshold value, performing speech recognition on an audio input received subsequent to receiving the user selection; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, forgoing the performance of speech recognition on the audio input received subsequent to receiving the user selection.
  - 32. The electronic device of claim 19, wherein the output includes a visual representation of the speech recognition suitability value.
  - 33. The electronic device of claim 32, wherein the visual representation comprises one or more bars, and wherein a value of the speech recognition suitability value is represented by a number of the one or more bars.
  - 34. The electronic device of claim 32, wherein the visual representation comprises an icon, and wherein the speech recognition suitability value is represented by a color of the icon.
  - 35. The electronic device of claim 34, wherein the programs further include instructions for:
    - determining whether the speech recognition suitability value is less than a threshold value;
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, displaying the icon in a grayed out state; and
      
      in accordance with a determination that the speech recognition suitability value is not less than the threshold value, displaying the icon in a non-grayed out state.
  - 36. The electronic device of claim 19, wherein the visual representation comprises a textual representation of the speech recognition suitability value.
  - 37. The electronic device of claim 19, wherein the programs further include instructions for:
    - determining whether the speech recognition suitability value is less than a threshold value; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, outputting a message indicating a low suitability of the acoustic environment for speech recognition.
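
The visual outputs and gating behavior in claims 31 through 35 can be sketched as follows. This is an illustration only: it assumes a suitability value in [0, 1], and the five-bar scale, the color names, the 0.7 cut-off, the threshold of 0.4, and all function names are hypothetical.

```python
def bar_count(value: float, max_bars: int = 5) -> int:
    """Number of bars to draw, rounding half up after clamping to [0, 1]."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * max_bars + 0.5)


def icon_state(value: float, threshold: float = 0.4) -> dict:
    """Icon color and grayed-out flag for a given suitability value."""
    if value < threshold:
        return {"color": "gray", "grayed_out": True}
    color = "green" if value >= 0.7 else "yellow"  # assumed color scale
    return {"color": color, "grayed_out": False}


def on_icon_selected(value: float, audio, recognize, threshold: float = 0.4):
    """Perform recognition only when the value is not less than the threshold."""
    if value < threshold:
        return None  # forgo speech recognition on the subsequent audio input
    return recognize(audio)
```

Here `recognize` is any callable supplied by the caller; the sketch does not model a real speech recognizer.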

20. A method, comprising:
- at an electronic device with one or more processors and memory:
  
  receiving speech input from a user;
  
  determining whether the speech input includes a spoken trigger;
  
  in response to determining that the speech input includes a spoken trigger, obtaining audio input from an acoustic environment;
  
  while producing speech recognition results by performing speech recognition on the audio input, determining a speech recognition suitability value based on the audio input; and
  
  in accordance with a determination that the speech recognition suitability value does not satisfy a predetermined criterion, providing an output to indicate that the acoustic environment is not suitable for performing speech recognition.
  - 38. The method of claim 20, wherein the output includes a haptic output.
  - 39. The method of claim 20, wherein the output includes an audio output indicating that the acoustic environment is not suitable for performing speech recognition.
  - 40. The method of claim 20, comprising:
    - determining whether an option for determining speech recognition suitability is selected; and
      
      in accordance with a determination that the option for determining speech recognition suitability is selected, determining the speech recognition suitability value based on the audio input.
  - 41. The method of claim 20, wherein determining the speech recognition suitability value based on the audio input comprises:
    - determining one or more characteristics of the acoustic environment based on the audio input; and
      
      determining the speech recognition suitability value based on the one or more characteristics of the acoustic environment.
  - 42. The method of claim 41, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a first frequency band of the acoustic environment.
  - 43. The method of claim 42, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the first frequency band.
  - 44. The method of claim 42, wherein the one or more characteristics of the acoustic environment comprise a signal to noise ratio for a second frequency band of the acoustic environment.
  - 45. The method of claim 44, wherein the one or more characteristics of the acoustic environment comprise a type of noise detected in the second frequency band.
  - 46. The method of claim 41, wherein the one or more characteristics of the acoustic environment comprise a number of transient noises detected in a buffer comprising previously recorded audio of the acoustic environment.
  - 47. The method of claim 41, wherein determining the speech recognition suitability value comprises:
    - determining a speech recognition suitability vector based on the audio input, wherein the speech recognition suitability vector comprises one or more elements that represent the one or more characteristics of the acoustic environment; and
      
      using a neural network to determine the speech recognition suitability value based on the speech recognition suitability vector.
  - 48. The method of claim 20, comprising:
    - displaying an icon associated with speech recognition;
      
      receiving a user selection of the icon associated with speech recognition;
      
      in accordance with a determination that the speech recognition suitability value is not less than a threshold value, performing speech recognition on an audio input received subsequent to receiving the user selection; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, forgoing the performance of speech recognition on the audio input received subsequent to receiving the user selection.
  - 49. The method of claim 20, wherein the output includes a visual representation of the speech recognition suitability value.
  - 50. The method of claim 49, wherein the visual representation comprises one or more bars, and wherein a value of the speech recognition suitability value is represented by a number of the one or more bars.
  - 51. The method of claim 49, wherein the visual representation comprises an icon, and wherein the speech recognition suitability value is represented by a color of the icon.
  - 52. The method of claim 51, wherein the programs further include instructions for:
    - determining whether the speech recognition suitability value is less than a threshold value;
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, displaying the icon in a grayed out state; and
      
      in accordance with a determination that the speech recognition suitability value is not less than the threshold value, displaying the icon in a non-grayed out state.
  - 53. The method of claim 20, wherein the visual representation comprises a textual representation of the speech recognition suitability value.
  - 54. The method of claim 20, comprising:
    - determining whether the speech recognition suitability value is less than a threshold value; and
      
      in accordance with a determination that the speech recognition suitability value is less than the threshold value, outputting a message indicating a low suitability of the acoustic environment for speech recognition.
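
The ordering of steps in the method of claim 20 can be sketched as below. Several simplifications are assumed and are not taken from the patent: the trigger check runs on an already transcribed string rather than on raw speech, the trigger phrase and the criterion of 0.4 are hypothetical, and `obtain_audio`, `recognize`, and `score` are caller-supplied placeholders. A thread pool is used only to show the suitability value being determined while recognition is in progress.

```python
from concurrent.futures import ThreadPoolExecutor

TRIGGER = "hey assistant"  # hypothetical trigger phrase


def includes_trigger(transcript: str) -> bool:
    """Simplified trigger check on text; a real device would analyze audio."""
    return TRIGGER in transcript.lower()


def run_pipeline(speech_transcript, obtain_audio, recognize, score, criterion=0.4):
    """Return (recognition_result, warning), or None if no trigger was spoken."""
    if not includes_trigger(speech_transcript):
        return None
    audio = obtain_audio()  # audio input from the acoustic environment
    with ThreadPoolExecutor(max_workers=2) as pool:
        recognition = pool.submit(recognize, audio)  # recognition in progress
        suitability = pool.submit(score, audio)      # scored concurrently
        value = suitability.result()
        warning = None
        if value < criterion:  # value does not satisfy the criterion
            warning = "Acoustic environment is not suitable for speech recognition"
        return recognition.result(), warning
```

A caller could present `warning` as a visual, audio, or haptic output, matching the output types named in the dependent claims.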


Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Kim, Yoon
Primary Examiner(s): Yu, Norman

Application Number: US16/109,546
Publication Number: US 20180366105A1
Time in Patent Office: 426 Days
Field of Search: 381/56, 381/57, 381/58, 381/306, 381/95, 381/110, 381/108, 381/122, 381/388, 704/231, 704/233, 704/275, 704/226, 704/E15.039, 704/E15.001, 704/E15.009, 704/E11.003
CPC Class Codes

G10L 15/01   Assessment or evaluation of...

G10L 15/22   Procedures used during a sp...

G10L 25/60   for measuring the quality o...

H04R 29/008   Visual indication of indivi...
