Methods and devices for selectively ignoring captured audio data
First Claim
1. A method for selectively ignoring a set of temporally related sounds that is represented by data stored in memory on an electronic device, the method comprising:
receiving, by the electronic device, audio data representing a word;
receiving a word identifier with the audio data, the word identifier being unique to the word;
receiving a data tag with the audio data, the data tag indicating a start time and an end time for the word within the audio data;
determining that the word identifier is associated with a wakeword that is a series of temporally-related sounds that, when received by a microphone of the electronic device, causes functionality of the electronic device to be activated;
determining a time window during which the word is to be outputted by a speaker of the electronic device by calculating an amount of time between the start time and the end time;
outputting the audio data using the speaker;
determining a hardware delay time associated with processing the audio data for playback, wherein determining the hardware delay time comprises:
determining an output time that the audio data begins to be outputted by the speaker; and
calculating a time difference between a processing time that the audio data begins to be processed for audio playback and the output time;
receiving audio input data using the microphone;
determining an echoing offset time for echoes subsequent to the audio data outputted by the speaker also being detected by the microphone, wherein determining the echoing offset time comprises:
determining an audio receipt time that audio input data is captured by the microphone; and
calculating another time difference between the output time and the audio receipt time;
determining a modified time window by applying the hardware delay time and the echoing offset time to the time window;
determining that a portion of the audio input data represents the wakeword;
determining that a detected time that the portion is detected by the microphone is within the modified time window; and
ignoring the portion such that functionality triggered by the wakeword remains inactive.
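The timing arithmetic recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how the hardware delay time and the echoing offset time are "applied" to the time window, so the sketch assumes one plausible reading: the whole window is shifted later by the hardware delay, and its end is extended by the echoing offset. All names (TimeWindow, hardware_delay, echoing_offset, modified_window, should_ignore) are hypothetical, and every time is assumed to be in seconds on a single shared clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeWindow:
    """Interval, in seconds, during which the tagged word is expected."""
    start: float
    end: float

    def contains(self, t: float) -> bool:
        return self.start <= t <= self.end


def hardware_delay(processing_time: float, output_time: float) -> float:
    """Time between the start of playback processing and the start of speaker output."""
    return output_time - processing_time


def echoing_offset(output_time: float, audio_receipt_time: float) -> float:
    """Time between speaker output and capture of that audio by the microphone."""
    return audio_receipt_time - output_time


def modified_window(window: TimeWindow, hw_delay: float, echo_offset: float) -> TimeWindow:
    # Assumption (not stated in the claim): shift both edges by the hardware
    # delay, then lengthen the end by the echoing offset.
    return TimeWindow(window.start + hw_delay, window.end + hw_delay + echo_offset)


def should_ignore(detected_time: float, window: TimeWindow) -> bool:
    """True when a detected wakeword falls inside the modified time window."""
    return window.contains(detected_time)


# Data tag: the word spans 3.0 s to 3.5 s within the audio data.
tagged = TimeWindow(start=3.0, end=3.5)
hw = hardware_delay(processing_time=0.0, output_time=0.05)
echo = echoing_offset(output_time=0.05, audio_receipt_time=0.08)
window = modified_window(tagged, hw, echo)

print(should_ignore(3.2, window))  # detection inside the modified window
print(should_ignore(5.0, window))  # detection well after the modified window
```

Under these assumed values the modified window runs from roughly 3.05 s to 3.58 s, so a wakeword detected at 3.2 s would be ignored while one detected at 5.0 s would still activate the device.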
Abstract
Systems and methods for selectively ignoring an occurrence of a wakeword within audio input data are provided herein. In some embodiments, a wakeword may be detected to have been uttered by an individual within a modified time window, which may account for hardware delays and echoing offsets. The detected wakeword that occurs during this modified time window may, in some embodiments, correspond to a word included within audio that is outputted by a voice-activated electronic device. This may cause the voice-activated electronic device to activate itself, stopping the audio from being outputted. By identifying when these occurrences of the wakeword within outputted audio are going to happen, the voice-activated electronic device may selectively determine when to ignore the wakeword, and furthermore, when not to ignore the wakeword.
20 Claims
1. A method for selectively ignoring a set of temporally related sounds that is represented by data stored in memory on an electronic device, the method comprising:
receiving, by the electronic device, audio data representing a word;
receiving a word identifier with the audio data, the word identifier being unique to the word;
receiving a data tag with the audio data, the data tag indicating a start time and an end time for the word within the audio data;
determining that the word identifier is associated with a wakeword that is a series of temporally-related sounds that, when received by a microphone of the electronic device, causes functionality of the electronic device to be activated;
determining a time window during which the word is to be outputted by a speaker of the electronic device by calculating an amount of time between the start time and the end time;
outputting the audio data using the speaker;
determining a hardware delay time associated with processing the audio data for playback, wherein determining the hardware delay time comprises:
determining an output time that the audio data begins to be outputted by the speaker; and
calculating a time difference between a processing time that the audio data begins to be processed for audio playback and the output time;
receiving audio input data using the microphone;
determining an echoing offset time for echoes subsequent to the audio data outputted by the speaker also being detected by the microphone, wherein determining the echoing offset time comprises:
determining an audio receipt time that audio input data is captured by the microphone; and
calculating another time difference between the output time and the audio receipt time;
determining a modified time window by applying the hardware delay time and the echoing offset time to the time window;
determining that a portion of the audio input data represents the wakeword;
determining that a detected time that the portion is detected by the microphone is within the modified time window; and
ignoring the portion such that functionality triggered by the wakeword remains inactive.
Dependent claims: 2, 3, 4
5. A method for selectively ignoring a portion of captured audio, the method comprising:
receiving, by an electronic device, audio data;
receiving, by the electronic device, a data tag associated with a sound to be output based, at least in part, on the audio data;
determining that the sound is a trigger for the electronic device;
determining, based at least in part on the data tag, a time window that the trigger is to be outputted by the audio data;
generating a modified time window based at least in part on at least one offset and the time window;
causing the audio data to be outputted from at least one speaker;
receiving audio input data;
determining that the audio input data includes an occurrence of the trigger;
determining that a time of the occurrence is during the modified time window; and
ignoring a portion of the audio input data received during the modified time window.
Dependent claims: 6, 7, 8, 9, 10, 11, 12
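Claim 5 generalizes the method to any trigger sound and to "at least one offset". As a rough sketch only, the Python below assumes that the audio data arrives with several data tags, that each offset contributes a separate shift to the start and to the end of a window, and that the trigger is matched by a plain string label. DataTag, Offset, modified_windows, and filter_detections are hypothetical names chosen for illustration, and the string "wakeword" stands in for whatever trigger the device stores.

```python
from typing import Iterable, List, NamedTuple, Tuple


class DataTag(NamedTuple):
    """Hypothetical data tag: a sound label with its start and end time in seconds."""
    sound: str
    start: float
    end: float


class Offset(NamedTuple):
    """Hypothetical offset: how far to move the start and end of a window, in seconds."""
    start_shift: float
    end_shift: float


def modified_windows(
    tags: Iterable[DataTag], trigger: str, offsets: Iterable[Offset]
) -> List[Tuple[float, float]]:
    """Build one modified time window per tagged occurrence of the trigger."""
    offsets = list(offsets)
    # Assumption: multiple offsets simply add up.
    start_shift = sum(o.start_shift for o in offsets)
    end_shift = sum(o.end_shift for o in offsets)
    return [
        (tag.start + start_shift, tag.end + end_shift)
        for tag in tags
        if tag.sound == trigger
    ]


def filter_detections(
    detections: Iterable[float], windows: List[Tuple[float, float]]
) -> List[float]:
    """Keep only detection times that fall outside every modified window."""
    return [
        t for t in detections
        if not any(start <= t <= end for start, end in windows)
    ]


tags = [
    DataTag("hello", 0.0, 0.4),
    DataTag("wakeword", 1.0, 1.6),
    DataTag("wakeword", 4.0, 4.6),
]
offsets = [Offset(0.05, 0.05), Offset(0.0, 0.03)]  # e.g. playback delay, echo tail
windows = modified_windows(tags, "wakeword", offsets)

# Detections at 1.2 s and 4.66 s coincide with playback and are dropped;
# the detection at 2.5 s falls outside both windows and is kept.
print(filter_detections([1.2, 2.5, 4.66], windows))
```

Representing each offset as a start shift and an end shift is a design choice of this sketch: it lets a playback delay move the whole window while an echo tail only lengthens it.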
13. An electronic device, comprising:
communications circuitry that receives audio data and a data tag associated with a sound to be output based, at least in part, on the audio data;
at least one speaker that outputs the audio data;
at least one audio input device that receives audio input data;
memory that stores a trigger that activates the device; and
at least one processor operable to:
determine that the sound is the trigger;
determine, based at least in part on the data tag, a time window that the trigger is to be outputted by the audio data;
generate a modified time window based at least in part on at least one offset and the time window;
determine that the audio input data received by the at least one audio input device includes an occurrence of the trigger;
determine that a time of the occurrence is during the modified time window; and
ignore a portion of the audio input data received during the modified time window.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification