Hotword suppression
First Claim
1. A computer-implemented method comprising:
receiving, by a computing device, audio data corresponding to playback of an utterance;
providing, by the computing device, the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample;
receiving, by the computing device and from the model (i) that is configured to determine whether the given audio data sample includes the audio watermark and (ii) that was trained using the watermarked audio data samples that include the audio watermark and the non-watermarked audio data samples that do not include the audio watermark, data indicating whether the audio data includes the audio watermark; and
based on the data indicating whether the audio data includes the audio watermark, determining, by the computing device, to continue or cease processing of the audio data.
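The four steps of the claim can be read as a small control flow: receive audio, pass it to a model, read back the model's watermark indication, and decide whether to keep processing. The following is a minimal Python sketch of that flow, not the patented implementation. The names `Decision`, `decide_processing`, and `toy_model` are hypothetical, and the mapping "watermark present means cease" is an assumption drawn from the title (hotword suppression); the claim itself only says the decision is based on the model's output.

```python
from enum import Enum
from typing import Callable, Sequence

# Audio is represented here as a plain sequence of float samples (an assumption).
AudioData = Sequence[float]
# The model is any callable that reports whether the audio carries a watermark.
WatermarkModel = Callable[[AudioData], bool]


class Decision(Enum):
    CONTINUE = "continue"
    CEASE = "cease"


def decide_processing(audio: AudioData, model: WatermarkModel) -> Decision:
    """Provide the audio to the model, then decide based on its answer."""
    has_watermark = model(audio)  # data indicating whether a watermark is present
    # Assumed policy: watermarked playback is suppressed, everything else proceeds.
    return Decision.CEASE if has_watermark else Decision.CONTINUE


def toy_model(audio: AudioData) -> bool:
    """Hypothetical stand-in for a trained model: flags any sample at or above 0.9."""
    return any(abs(sample) >= 0.9 for sample in audio)


if __name__ == "__main__":
    print(decide_processing([0.1, 0.95, -0.2], toy_model))
    print(decide_processing([0.1, 0.2, -0.2], toy_model))
```

Passing the model in as a callable keeps the decision logic separate from how the model was trained, which mirrors the way the claim treats the model as a component that is only queried.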
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotwords are disclosed. In one aspect, a method includes the actions of receiving audio data corresponding to playback of an utterance. The actions further include providing the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample. The actions further include receiving, from the model, data indicating whether the audio data includes the audio watermark. The actions further include, based on the data indicating whether the audio data includes the audio watermark, determining to continue or cease processing of the audio data.
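The abstract says the model was trained on both watermarked and non-watermarked audio samples. As an illustration of what such two-class training can look like, here is a minimal Python sketch that builds a labeled toy dataset and fits a mean-difference linear classifier. Everything in it is an assumption made for the example: the 16-sample clips, the alternating `WATERMARK` pattern, the noise range, and the classifier itself, which stands in for whatever model the specification actually describes.

```python
import random
from typing import Callable, List, Sequence

CLIP_LEN = 16
# Hypothetical watermark: an alternating +0.5 / -0.5 pattern added to the clip.
WATERMARK = [0.5 if i % 2 == 0 else -0.5 for i in range(CLIP_LEN)]


def make_clip(rng: random.Random, watermarked: bool) -> List[float]:
    """Create a toy clip of low-level noise, optionally with the watermark added."""
    clip = [rng.uniform(-0.1, 0.1) for _ in range(CLIP_LEN)]
    if watermarked:
        clip = [s + w for s, w in zip(clip, WATERMARK)]
    return clip


def mean_clip(clips: Sequence[Sequence[float]]) -> List[float]:
    """Per-position average over a set of clips."""
    return [sum(c[i] for c in clips) / len(clips) for i in range(CLIP_LEN)]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def train(
    watermarked: Sequence[Sequence[float]],
    clean: Sequence[Sequence[float]],
) -> Callable[[Sequence[float]], bool]:
    """Fit a linear classifier from the difference of the two class means."""
    mu_w, mu_c = mean_clip(watermarked), mean_clip(clean)
    weights = [a - b for a, b in zip(mu_w, mu_c)]
    # Decision threshold: midpoint between the projected class means.
    threshold = (dot(weights, mu_w) + dot(weights, mu_c)) / 2

    def model(audio: Sequence[float]) -> bool:
        return dot(weights, audio) > threshold

    return model


rng = random.Random(0)
watermarked_train = [make_clip(rng, True) for _ in range(50)]
clean_train = [make_clip(rng, False) for _ in range(50)]
model = train(watermarked_train, clean_train)
```

The returned `model` takes a clip and reports whether it appears watermarked, so it could be queried in the same way the method queries its model. Real audio would need far richer features and a more capable classifier than this toy setup.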
108 Citations
20 Claims
1. A computer-implemented method comprising:
receiving, by a computing device, audio data corresponding to playback of an utterance;
providing, by the computing device, the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample;
receiving, by the computing device and from the model (i) that is configured to determine whether the given audio data sample includes the audio watermark and (ii) that was trained using the watermarked audio data samples that include the audio watermark and the non-watermarked audio data samples that do not include the audio watermark, data indicating whether the audio data includes the audio watermark; and
based on the data indicating whether the audio data includes the audio watermark, determining, by the computing device, to continue or cease processing of the audio data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system comprising:
one or more computers; and
one or more non-transitory storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by a computing device, audio data corresponding to playback of an utterance;
providing, by the computing device, the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample;
receiving, by the computing device and from the model (i) that is configured to determine whether the given audio data sample includes the audio watermark and (ii) that was trained using the watermarked audio data samples that include the audio watermark and the non-watermarked audio data samples that do not include the audio watermark, data indicating whether the audio data includes the audio watermark; and
based on the data indicating whether the audio data includes the audio watermark, determining, by the computing device, to continue or cease processing of the audio data.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
20. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving, by a computing device, audio data corresponding to playback of an utterance;
providing, by the computing device, the audio data as an input to a model (i) that is configured to determine whether a given audio data sample includes an audio watermark and (ii) that was trained using watermarked audio data samples that each include an audio watermark sample and non-watermarked audio data samples that do not each include an audio watermark sample;
receiving, by the computing device and from the model (i) that is configured to determine whether the given audio data sample includes the audio watermark and (ii) that was trained using the watermarked audio data samples that include the audio watermark and the non-watermarked audio data samples that do not include the audio watermark, data indicating whether the audio data includes the audio watermark; and
based on the data indicating whether the audio data includes the audio watermark, determining, by the computing device, to continue or cease processing of the audio data.
Specification