Transitioning an electronic device between device states
First Claim
1. An apparatus comprising:
a microphone;
a processor; and
computer-readable media storing computer-executable instructions that, when executed by the processor, cause the processor to perform acts comprising:
receiving, from the microphone, a first audio signal including a representation of a first utterance;
determining a first similarity score for the first utterance, wherein the first similarity score indicates a similarity between the representation of the first utterance and a representation of a defined word or phrase;
determining that the first similarity score is less than a first similarity threshold and greater than a second similarity threshold;
setting a temporary third similarity threshold for a defined amount of time, wherein the third similarity threshold is less than the first similarity threshold and greater than the second similarity threshold;
receiving, from the microphone and within the defined amount of time, a second audio signal including a representation of a second utterance;
causing speech-recognition to be performed on the second audio signal;
determining a second similarity score for the second utterance, wherein the second similarity score indicates a similarity between the representation of the second utterance and the representation of the defined word or phrase;
determining that the second similarity score is greater than the third similarity threshold; and
in response to the determining that the second similarity score is greater than the third similarity threshold, changing a state of the apparatus from a first state to a second state.
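The acts recited in claim 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class name WakeWordGate, the numeric threshold values, the five-second window, and the choice not to restart the window on a further near miss are all hypothetical. Similarity scores are assumed to be floats produced by some upstream scorer, and the speech-recognition step performed on the second audio signal is omitted.

```python
from dataclasses import dataclass


@dataclass
class WakeWordGate:
    """Hypothetical gate that relaxes its threshold briefly after a near miss."""

    first_threshold: float = 0.80   # normal acceptance threshold (assumed value)
    second_threshold: float = 0.50  # lower bound of a "near miss" (assumed value)
    third_threshold: float = 0.65   # temporary relaxed threshold (assumed value)
    window_seconds: float = 5.0     # the "defined amount of time" (assumed value)
    state: str = "first"
    relaxed_until: float = float("-inf")

    def __post_init__(self) -> None:
        # Claim 1 requires: second < third < first.
        if not self.second_threshold < self.third_threshold < self.first_threshold:
            raise ValueError("need second_threshold < third_threshold < first_threshold")

    def observe(self, score: float, now: float) -> bool:
        """Process one utterance score; return True if the state changed."""
        relaxed = now < self.relaxed_until
        active = self.third_threshold if relaxed else self.first_threshold
        if score > active:
            self.state = "second"
            self.relaxed_until = float("-inf")
            return True
        # Near miss: below the first threshold but above the second one.
        # Assumption: a near miss inside an open window does not restart it.
        if not relaxed and self.second_threshold < score < self.first_threshold:
            self.relaxed_until = now + self.window_seconds
        return False
```

In this sketch, a score of 0.70 is rejected on its own, but the same score is accepted if it arrives again within the window opened by the first attempt.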
Abstract
This disclosure describes techniques for transitioning an electronic device between device states. In one example, a voice-controlled device is configured to transition from a low power state to an interactive state in response to identifying a user speaking a defined utterance. If, however, the device determines that the user has spoken an utterance that is close, but not equivalent to, the defined utterance, then the device may lower a threshold for subsequent speech such that the device is more likely to determine that the subsequent speech is equivalent to the defined utterance.
38 Citations
27 Claims
1. An apparatus as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5
6. A non-transitory computer-readable storage medium storing instructions that when executed by a processor cause the processor to:
receive a first audio signal including a representation of a first utterance;
determine a first similarity score for the first utterance, wherein the first similarity score indicates a similarity between the representation of the first utterance and a representation of a defined word or phrase;
determine that the first similarity score does not satisfy a first similarity acceptance criterion and does satisfy a second similarity acceptance criterion;
modify the first similarity acceptance criterion for a period of time;
receive a second audio signal including a representation of a second utterance within the period of time;
determine a second similarity score for the second utterance, wherein the second similarity score indicates a similarity between the representation of the second utterance and the representation of the defined word or phrase; and
change a state of an electronic device based at least in part on a determination that the second similarity score satisfies the modified first similarity acceptance criterion.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14
15. A method implemented at least in part by an electronic device that is configured to transition from a first state to a second state in response to a received audio signal having a similarity to a representation of a defined word or phrase, the method comprising:
receiving, at the electronic device, the audio signal;
determining an occurrence of an event while the electronic device is in the first state;
modifying a similarity acceptance criterion for a period of time, based at least in part on the occurrence of the event;
determining a similarity score between the audio signal and a representation of a defined word or phrase; and
transitioning the electronic device from the first state to the second state based at least in part on a determination that the similarity score satisfies the modified similarity acceptance criterion.
Dependent claims: 16, 17, 18, 19, 20, 21
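The event-driven variant in claim 15 can be sketched in Python as follows. This is a hedged illustration only: the class EventRelaxedDetector, its method names, the threshold values, and the five-second period are hypothetical. The claim does not say here what the event is, so the sketch treats it as an opaque notification. "Satisfies" is assumed to mean that the score is greater than or equal to the active threshold.

```python
class EventRelaxedDetector:
    """Hypothetical detector whose acceptance threshold drops after an event."""

    def __init__(self, base_threshold: float = 0.80,
                 relaxed_threshold: float = 0.65,
                 period_seconds: float = 5.0) -> None:
        # All three default values are assumptions chosen for illustration.
        self.base_threshold = base_threshold
        self.relaxed_threshold = relaxed_threshold
        self.period_seconds = period_seconds
        self.state = "first"
        self.relaxed_until = float("-inf")

    def notify_event(self, now: float) -> None:
        # Only events that occur while the device is in the first state count.
        if self.state == "first":
            self.relaxed_until = now + self.period_seconds

    def threshold_at(self, now: float) -> float:
        # The modified criterion applies only during the period after the event.
        if now < self.relaxed_until:
            return self.relaxed_threshold
        return self.base_threshold

    def on_audio(self, score: float, now: float) -> bool:
        """Return True if this score moves the device to the second state."""
        if self.state == "first" and score >= self.threshold_at(now):
            self.state = "second"
            return True
        return False
```

Separating notify_event from on_audio keeps the sketch agnostic about the source of the event. Any trigger can open the relaxed period without changing the scoring path.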
22. A non-transitory computer-readable media storing computer-executable instructions that, when executed by a processor, cause the processor to perform acts comprising:
receiving an audio signal comprising a representation of an utterance;
monitoring for an occurrence of an event;
at least partly in response to identifying the occurrence of the event, determining a similarity score for the utterance based on a comparison between the representation of the utterance and a representation of a defined word or phrase;
modifying the similarity score;
modifying a similarity acceptance criterion for a period of time; and
transitioning an electronic device from a first state to a second state based at least in part on a determination that the modified similarity score satisfies the modified similarity acceptance criterion.
Dependent claims: 23, 24, 25, 26, 27
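Claim 22 differs in that both the similarity score and the acceptance criterion are modified. A minimal Python sketch of that decision is shown below. The claim does not state how either quantity is modified, so the additive boost, the cap at 1.0, the threshold values, and the function name should_transition are all assumptions. When no event is active, the sketch falls back to an unmodified comparison, which is also an assumption.

```python
def should_transition(raw_score: float, event_active: bool,
                      base_threshold: float = 0.80,
                      relaxed_threshold: float = 0.75,
                      score_boost: float = 0.05) -> bool:
    """Decide whether to leave the first state (hypothetical helper)."""
    if not event_active:
        # Assumed fallback: unmodified score against the unmodified criterion.
        return raw_score >= base_threshold
    # Event active: modify the score (assumed additive boost, capped at 1.0)...
    modified_score = min(1.0, raw_score + score_boost)
    # ...and compare it against the modified (relaxed) criterion.
    return modified_score >= relaxed_threshold
```

Here a raw score of 0.72 is rejected without an event. With an event active, it is boosted to about 0.77 and accepted against the relaxed threshold of 0.75.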
Specification