Key phrase detection with audio watermarking

US 10,276,175 B1
Filed: 11/28/2017
Issued: 04/30/2019
Est. Priority Date: 11/28/2017
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using audio watermarks with key phrases. One of the methods includes receiving, by a playback device, an audio data stream; determining, before the audio data stream is output by the playback device, whether a portion of the audio data stream encodes a particular key phrase by analyzing the portion using an automated speech recognizer; in response to determining that the portion of the audio data stream encodes the particular key phrase, modifying the audio data stream to include an audio watermark; and providing the modified audio data stream for output.
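
The flow in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: `Portion`, `watermark_before_output`, `toy_recognizer`, and the `key_phrase_present` field are hypothetical names, the recognizer is a placeholder for an automated speech recognizer, and the watermark is carried next to the samples instead of being embedded in the audio.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Portion:
    """One portion of the audio data stream (hypothetical container)."""
    samples: bytes
    # Stand-in for data embedded in the audio itself; a real device would
    # encode this into the samples rather than carry it alongside them.
    watermark: Optional[dict] = None


def watermark_before_output(
    stream: Iterable[Portion],
    encodes_key_phrase: Callable[[Portion], bool],
) -> Iterator[Portion]:
    """Yield portions in order, watermarking those that encode the key phrase."""
    for portion in stream:
        # The check runs before the portion is handed on for output.
        if encodes_key_phrase(portion):
            portion = replace(portion, watermark={"key_phrase_present": True})
        yield portion


def toy_recognizer(portion: Portion) -> bool:
    """Toy placeholder: treats the bytes b"ok" as the key phrase."""
    return b"ok" in portion.samples


stream = [Portion(b"music"), Portion(b"ok device"), Portion(b"more music")]
output = list(watermark_before_output(stream, toy_recognizer))
for portion in output:
    print(portion.samples, portion.watermark)
```

Only the middle portion gains a watermark; the others pass through unchanged, and every portion is checked before it is yielded for output.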

19 Claims

1. A playback device comprising a speaker and one or more storage devices on which are stored instructions that are operable, when executed by the playback device, to cause the playback device to perform operations comprising:
- receiving an audio data stream;
  
  determining, before the audio data stream is output by the playback device, whether a portion of the audio data stream encodes a particular key phrase by analyzing the portion using an automated speech recognizer;
  
  in response to determining that the portion of the audio data stream encodes the particular key phrase, modifying the audio data stream to include an audio watermark, where the audio watermark includes data specifying that a key phrase is encoded in the portion of the audio data stream; and
  
  outputting the modified audio data stream through the speaker of the playback device.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
  - 2. The playback device of claim 1, wherein modifying the audio data stream to include the audio watermark comprises:
    - determining whether the received audio data stream includes a watermark for the particular key phrase; and
      
      in response to determining that the received audio data stream does not include a watermark for the particular key phrase, modifying the audio data stream to include an audio watermark.
  - 3. The playback device of claim 1, wherein modifying the audio data stream to include the audio watermark comprises:
    - determining whether the received audio data stream includes a watermark for the particular key phrase;
      
      in response to determining that the received audio data stream includes a watermark for the particular key phrase, determining whether specific data is encoded in the watermark by analyzing data encoded in the watermark; and
      
      in response to determining that specific data is not encoded in the watermark, modifying the audio data stream to include the audio watermark that encodes the specific data.
  - 4. The playback device of claim 3, wherein modifying the audio data stream to include the audio watermark that encodes the specific data comprises modifying the watermark from the received audio data stream to encode the specific data.
  - 5. The playback device of claim 3, wherein the specific data comprises data for the particular key phrase.
  - 6. The playback device of claim 3, wherein the specific data comprises data for a source of the audio data stream.
  - 7. The playback device of claim 3, wherein the specific data comprises data about content encoded in the audio data stream.
  - 8. The playback device of claim 1, the operations comprising:
    - receiving another portion of the audio data stream concurrently with determining, before the audio data stream is played by the playback device, whether the portion of the audio data stream encodes the particular key phrase by analyzing the portion using the automated speech recognizer.
  - 9. The playback device of claim 1, wherein the particular key phrase is fixed.
  - 10. The playback device of claim 1, the operations comprising:
    - receiving input defining the particular key phrase prior to determining, before the audio data stream is played by the playback device, whether the portion of the audio data stream encodes the particular key phrase by analyzing the portion using the automated speech recognizer.
  - 11. The playback device of claim 1, wherein receiving the audio data stream comprises receiving the audio data stream through a wired or wireless input connection other than a microphone prior to providing the portion of the modified audio data stream for output.
  - 12. The playback device of claim 1, wherein modifying the audio data stream to include the audio watermark comprises modifying the audio data stream to include the audio watermark that identifies a source of the audio data stream.
  - 13. The playback device of claim 1, wherein modifying the audio data stream to include the audio watermark comprises modifying the audio data stream to include the audio watermark that includes data specifying that the particular key phrase is encoded in the portion of the audio data stream.
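
The watermark handling in claims 2 through 7 can be read as a small reconciliation step. The sketch below is one possible reading, with several assumptions: `reconcile_watermark` is a hypothetical helper, a watermark is modeled as a plain dictionary, and the `key_phrase` and `source` fields and their values are invented for illustration. The `"keep"` branch (the watermark already encodes the specific data) is not recited in these claims and is included only to make the function total.

```python
from typing import Optional, Tuple


def reconcile_watermark(existing: Optional[dict], specific: dict) -> Tuple[str, dict]:
    """Return (action, watermark) for a portion that encodes the key phrase."""
    if existing is None:
        # Claim 2: the received stream has no watermark, so add one.
        return "add", dict(specific)
    # Collect the specific data that the received watermark does not encode.
    missing = {k: v for k, v in specific.items() if existing.get(k) != v}
    if missing:
        # Claims 3 and 4: modify the received watermark to encode the data.
        return "update", {**existing, **missing}
    # Assumption: nothing to change when the data is already present.
    return "keep", existing


# Hypothetical specific data: the key phrase (claim 5) and a source (claim 6).
specific = {"key_phrase": "ok device", "source": "tv-broadcast"}
print(reconcile_watermark(None, specific))
print(reconcile_watermark({"key_phrase": "ok device"}, specific))
```

The first call takes the claim 2 path and adds a watermark; the second takes the claim 3 and 4 path and extends the received watermark with the missing source field.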

14. A non-transitory computer storage medium encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
- receiving an audio data stream;
  
  determining, before the audio data stream is output by the one or more computers, whether a portion of the audio data stream encodes a particular key phrase by analyzing the portion using an automated speech recognizer;
  
  in response to determining that the portion of the audio data stream does not encode the particular key phrase, determining to skip modifying the audio data stream to include an audio watermark based on the portion of the audio data stream that does not encode the particular key phrase, where the audio watermark includes data specifying that a key phrase is encoded in the portion of the audio data stream; and
  
  after determining to skip modifying the audio data stream to include the audio watermark based on the portion of the audio data stream that does not encode the particular key phrase, outputting the audio data stream through a speaker of a playback device.
- Dependent claims (15, 16)
  - 15. The computer storage medium of claim 14, the operations comprising:
    - determining, before the audio data stream is output by the one or more computers, whether a second portion of the audio data stream encodes an occurrence of the particular key phrase by analyzing the second portion using the automated speech recognizer;
      
      in response to determining that the second portion of the audio data stream encodes the particular key phrase, determining whether the received audio data stream includes a watermark for the occurrence of the particular key phrase;
      
      in response to determining that the received audio data stream includes a watermark for the occurrence of the particular key phrase, determining whether specific data is encoded in the watermark by analyzing data encoded in the watermark; and
      
      in response to determining that specific data is not encoded in the watermark, modifying the audio data stream to include the audio watermark that encodes the specific data.
  - 16. The computer storage medium of claim 14, the operations comprising:
    - determining, before the audio data stream is output by the one or more computers, whether a second portion of the audio data stream encodes an occurrence of the particular key phrase by analyzing the second portion using the automated speech recognizer;
      
      in response to determining that the second portion of the audio data stream encodes the particular key phrase, determining whether the received audio data stream includes a watermark for the occurrence of the particular key phrase; and
      
      in response to determining that the received audio data stream includes a watermark for the occurrence of the particular key phrase, determining to skip modifying the audio data stream to include the audio watermark based on the occurrence of the particular key phrase.
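
Claims 14 through 16 mostly describe when the stream is left unmodified. A hedged way to picture those branches is a per-portion decision function. `Decision`, `decide`, and `required_keys` are hypothetical names, and a watermark is again modeled as a dictionary. The `ADD` branch is not recited in claims 14 through 16; it is an assumption borrowed from claim 2 so that every input has an outcome.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    SKIP_NO_KEY_PHRASE = "skip: portion does not encode the key phrase"
    SKIP_ALREADY_MARKED = "skip: occurrence already has a watermark"
    MODIFY = "modify: watermark lacks the specific data"
    ADD = "add: no watermark for this occurrence"


def decide(has_key_phrase: bool,
           existing: Optional[dict],
           required_keys: Optional[set] = None) -> Decision:
    """Pick an action for one portion of the stream."""
    if not has_key_phrase:
        # Claim 14: no key phrase, so skip watermarking and output as received.
        return Decision.SKIP_NO_KEY_PHRASE
    if existing is None:
        # Assumption (not in claims 14-16): behave as in claim 2.
        return Decision.ADD
    if required_keys and not required_keys.issubset(existing):
        # Claim 15: a watermark exists but does not encode the specific data.
        return Decision.MODIFY
    # Claim 16: a watermark for this occurrence exists, so skip modifying.
    return Decision.SKIP_ALREADY_MARKED


print(decide(False, None))
print(decide(True, {"key_phrase": "ok device"}))
print(decide(True, {"key_phrase": "ok device"}, {"key_phrase", "source"}))
```

Passing `required_keys` selects the claim 15 reading, where the content of the watermark is inspected; omitting it selects the claim 16 reading, where the presence of a watermark is enough to skip.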

17. A computer-implemented method comprising:
- receiving, by a playback device, an audio data stream;
  
  determining, before the audio data stream is output by the playback device, whether a portion of the audio data stream encodes a particular key phrase by analyzing the portion using an automated speech recognizer;
  
  in response to determining that the portion of the audio data stream encodes the particular key phrase, modifying the audio data stream to include an audio watermark, where the audio watermark includes data specifying that a key phrase is encoded in the portion of the audio data stream; and
  
  outputting, by the playback device, the modified audio data stream through a speaker of the playback device.
- Dependent claims (18, 19)
  - 18. The method of claim 17, wherein modifying the audio data stream to include the audio watermark comprises:
    - determining whether the received audio data stream includes a watermark for the particular key phrase; and
      
      in response to determining that the received audio data stream does not include a watermark for the particular key phrase, modifying the audio data stream to include an audio watermark.
  - 19. The method of claim 17, wherein modifying the audio data stream to include the audio watermark comprises:
    - determining whether the received audio data stream includes a watermark for the particular key phrase;
      
      in response to determining that the received audio data stream includes a watermark for the particular key phrase, determining whether specific data is encoded in the watermark by analyzing data encoded in the watermark; and
      
      in response to determining that specific data is not encoded in the watermark, modifying the audio data stream to include the audio watermark that encodes the specific data.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Garcia, Ricardo Antonio
Primary Examiner(s): Elbin, Jesse A
Application Number: US15/824,183
Time in Patent Office: 518 Days

CPC Class Codes

G06F 21/31   User authentication

G06F 3/165   Management of the audio str...

G10L 15/08   Speech classification or se...

G10L 15/22   Procedures used during a sp...

G10L 19/018   Audio watermarking, i.e. em...

G10L 2015/088   Word spotting

G10L 2015/223   Execution procedure of a sp...

G10L 21/00   Speech or voice signal proc...

H04N 21/233   Processing of audio element...

H04N 21/8358   involving watermark protect...
