Hotword detection on multiple devices

US 10,347,253 B2
Filed: 04/23/2018
Issued: 07/09/2019
Est. Priority Date: 10/09/2014
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio data that corresponds to an utterance. The actions further include determining a likelihood that the utterance includes a hotword. The actions further include determining a loudness score for the audio data. The actions further include determining, based on the loudness score, an amount of delay time. The actions further include, after the amount of delay time has elapsed, transmitting a signal that indicates that the computing device will initiate speech recognition processing on the audio data.
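
The loudness-to-delay step in the abstract can be illustrated with a minimal sketch. The abstract does not say how the loudness score is computed or how it maps to a delay, so the RMS-in-dBFS metric, the linear mapping, the "louder waits less" direction, and the names `loudness_score`, `delay_ms`, `floor_db`, and `max_delay_ms` are all assumptions made for illustration.

```python
import math


def loudness_score(samples):
    """Return the RMS level in dBFS of float samples in [-1.0, 1.0].

    RMS in dBFS is an assumed stand-in for the abstract's "loudness score".
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")


def delay_ms(score_db, floor_db=-60.0, max_delay_ms=500.0):
    """Map a loudness score to a delay (assumed: louder audio waits less)."""
    # Clamp the score into [floor_db, 0.0] so the mapping stays bounded.
    clamped = min(0.0, max(floor_db, score_db))
    # fraction is 0.0 for the quietest input and 1.0 for the loudest.
    fraction = (clamped - floor_db) / (0.0 - floor_db)
    return max_delay_ms * (1.0 - fraction)
```

Under these assumptions, a device that hears the utterance at full scale would transmit its signal immediately, while a device that hears it at or below the floor would wait the full `max_delay_ms` before transmitting.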

20 Claims

1. A computer-implemented method comprising:
- receiving, by a computing device that is configured to process voice commands that are preceded by a predefined hotword, first audio data of an utterance of a voice command that is preceded by the predefined hotword;
  
  receiving, by the computing device, second audio data;
  
  determining, by the computing device, that the second audio data includes a frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword;
  
  in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, generating, by the computing device, a command for the computing device to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword; and
  
  in response to the command to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (i) placing the computing device into a sleep mode, (ii) bypassing, by the computing device, further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (iii) bypassing, by the computing device, emitting third audio data that includes the frequency pattern, and (iv) bypassing, by the computing device, outputting a visual indication that the computing device is processing the first audio data.
  - 2. The method of claim 1, wherein:
    - determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, and
      
      bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, bypassing performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 3. The method of claim 1, comprising:
    - determining, by the computing device, that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword,
      
      wherein determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword, and
      
      wherein bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword.
  - 4. The method of claim 1, comprising:
    - before receiving the first audio data of the utterance of the voice command that is preceded by the predefined hotword and the second audio data, receiving an initial broadcast of the second audio data and data indicating to bypass processing of received audio data upon receipt of the second audio data.
  - 5. The method of claim 1, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a radio of the computing device, a bit stream, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the bit stream includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 6. The method of claim 1, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a microphone of the computing device, an ultrasonic signal, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the ultrasonic signal includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 7. The method of claim 1, comprising:
    - while receiving the first audio data of the utterance of the voice command that is preceded by the predefined hotword and the second audio data, performing, by the computing device, an operation; and
      
      in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, continuing, by the computing device, performance of the operation.
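
The control flow recited in claim 1 can be sketched as a small state holder. This is an illustrative reading, not the patented implementation: the `Device` and `Command` names, the `detect_pattern` callback, and the `actions` log are hypothetical, and the branch taken when no pattern is heard is an assumption that claim 1 does not recite.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Optional


class Command(Enum):
    BYPASS = auto()


@dataclass
class Device:
    """Hypothetical hotword-enabled device; every name here is illustrative."""
    detect_pattern: Callable[[bytes], bool]  # assumed frequency-pattern detector
    asleep: bool = False
    actions: List[str] = field(default_factory=list)

    def on_audio(self, first_audio: bytes, second_audio: bytes) -> Optional[Command]:
        # Determine whether the second audio data includes the frequency pattern.
        if self.detect_pattern(second_audio):
            command = Command.BYPASS  # generate the bypass command
            self._execute(command)
            return command
        # Assumed behaviour when no pattern is heard (not part of claim 1):
        # this device handles the utterance itself.
        self.actions += ["show_indicator", "emit_pattern", "recognize_speech"]
        return None

    def _execute(self, command: Command) -> None:
        if command is Command.BYPASS:
            # (i) enter sleep mode; (ii)-(iv) are satisfied by doing nothing:
            # no further processing, no emitted pattern, no visual indication.
            self.asleep = True
```

For example, `Device(detect_pattern=lambda audio: audio.startswith(b"\x01"))` would go to sleep and record no actions when its second audio data begins with the marker byte, which here stands in for a real detector.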

8. A system comprising:
- one or more computers; and
  
  one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by a computing device that is configured to process voice commands that are preceded by a predefined hotword, first audio data of an utterance of a voice command that is preceded by the predefined hotword;
  
  receiving, by the computing device, second audio data;
  
  determining, by the computing device, that the second audio data includes a frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword;
  
  in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, generating, by the computing device, a command for the computing device to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword; and
  
  in response to the command to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (i) placing the computing device into a sleep mode, (ii) bypassing, by the computing device, further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (iii) bypassing, by the computing device, emitting third audio data that includes the frequency pattern, and (iv) bypassing, by the computing device, outputting a visual indication that the computing device is processing the first audio data.
  - 9. The system of claim 8, wherein:
    - determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, and
      
      bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, bypassing performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 10. The system of claim 8, wherein:
    - determining, by the computing device, that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword,
      
      wherein determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword, and
      
      wherein bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword.
  - 11. The system of claim 8, wherein the operations comprise:
    - before receiving the first audio data of the utterance of the voice command that is preceded by the predefined hotword and the second audio data, receiving an initial broadcast of the second audio data and data indicating to bypass processing of received audio data upon receipt of the second audio data.
  - 12. The system of claim 8, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a radio of the computing device, a bit stream, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the bit stream includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 13. The system of claim 8, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a microphone of the computing device, an ultrasonic signal, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the ultrasonic signal includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 14. The system of claim 8, wherein the operations comprise:
    - while receiving the first audio data of the utterance of the voice command that is preceded by the predefined hotword and the second audio data, performing, by the computing device, an operation; and
      
      in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, continuing, by the computing device, performance of the operation.
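
Claims 6 and 13 recite detecting the frequency pattern in an ultrasonic signal received by a microphone. One way this might be done is with the Goertzel algorithm, which measures signal power at a single frequency. The sketch below is an assumption-laden illustration: the claims do not specify the tones, the sample rate, the frame length, or the detector, so `PATTERN_HZ`, `SAMPLE_RATE`, `FRAME_LEN`, the 0.5 energy ratio, and frame-aligned tones are all hypothetical choices.

```python
import math

SAMPLE_RATE = 48_000                    # assumed microphone sample rate (Hz)
FRAME_LEN = 480                         # 10 ms frames, so DFT bins fall every 100 Hz
PATTERN_HZ = (19_000, 20_000, 21_000)   # hypothetical tone sequence


def goertzel_power(frame, freq_hz, sample_rate=SAMPLE_RATE):
    """Squared DFT magnitude of `frame` at `freq_hz` via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def dominant_tone(frame, candidates=PATTERN_HZ, min_ratio=0.5):
    """Return the candidate tone carrying most of the frame energy, else None."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None
    best = max(candidates, key=lambda f: goertzel_power(frame, f))
    # For a pure tone on a DFT bin this ratio is 1.0; noise or speech scores lower.
    ratio = goertzel_power(frame, best) / (0.5 * len(frame) * energy)
    return best if ratio >= min_ratio else None


def includes_pattern(samples, pattern=PATTERN_HZ, frame_len=FRAME_LEN):
    """True if consecutive frames carry the tones of `pattern` in order."""
    tones = [dominant_tone(samples[i:i + frame_len], candidates=pattern)
             for i in range(0, len(samples) - frame_len + 1, frame_len)]
    n = len(pattern)
    return any(tuple(tones[i:i + n]) == tuple(pattern)
               for i in range(len(tones) - n + 1))


def tone(freq_hz, n=FRAME_LEN, sample_rate=SAMPLE_RATE):
    """Generate n samples of a unit-amplitude sine, for trying the detector."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

With these assumed parameters, `includes_pattern(tone(19_000) + tone(20_000) + tone(21_000))` reports a match, while the same tones in reverse order or a silent buffer do not. A practical detector would also need to handle tones that straddle frame boundaries and background noise, which this sketch ignores.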

15. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a computing device that is configured to process voice commands that are preceded by a predefined hotword, first audio data of an utterance of a voice command that is preceded by the predefined hotword;
  
  receiving, by the computing device, second audio data;
  
  determining, by the computing device, that the second audio data includes a frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword;
  
  in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, generating, by the computing device, a command for the computing device to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword; and
  
  in response to the command to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (i) placing the computing device into a sleep mode, (ii) bypassing, by the computing device, further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword, (iii) bypassing, by the computing device, emitting third audio data that includes the frequency pattern, and (iv) bypassing, by the computing device, outputting a visual indication that the computing device is processing the first audio data.
  - 16. The computer-readable medium of claim 15, wherein:
    - determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, and
      
      bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      in response to determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword, bypassing performing speech recognition on the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 17. The computer-readable medium of claim 15, wherein the operations comprise:
    - determining, by the computing device, that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword,
      
      wherein determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword, and
      
      wherein bypassing further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword is based on determining that the first audio data of the utterance of the voice command that is preceded by the predefined hotword includes the predefined hotword.
  - 18. The computer-readable medium of claim 15, wherein the operations comprise:
    - before receiving the first audio data of the utterance of the voice command that is preceded by the predefined hotword and the second audio data, receiving an initial broadcast of the second audio data and data indicating to bypass processing of received audio data upon receipt of the second audio data.
  - 19. The computer-readable medium of claim 15, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a radio of the computing device, a bit stream, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the bit stream includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
  - 20. The computer-readable medium of claim 15, wherein:
    - receiving the second audio data comprises:
      
      receiving, by a microphone of the computing device, an ultrasonic signal, and
      
      determining that the second audio data includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword comprises:
      
      determining that the ultrasonic signal includes the frequency pattern indicating that the computing device is to bypass further processing of the first audio data of the utterance of the voice command that is preceded by the predefined hotword.
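
Claims 18 and 19 together suggest a two-phase arrangement: a device first receives an initial broadcast that tells it which pattern means "bypass", then later checks a bit stream received by its radio for that pattern. The sketch below shows one possible shape for this. The claims do not define how the pattern is encoded in the bit stream, so treating it as a literal bit sequence, along with the `BypassRegistry` class and its method names, is an assumption for illustration only.

```python
from collections import deque
from typing import Iterable, Optional, Tuple


class BypassRegistry:
    """Hypothetical store for the pattern learned from an initial broadcast."""

    def __init__(self) -> None:
        self._pattern: Optional[Tuple[int, ...]] = None

    def register(self, pattern_bits: Iterable[int]) -> None:
        # Called on the initial broadcast: remember which bits signal "bypass".
        self._pattern = tuple(pattern_bits)

    def stream_includes_pattern(self, bits: Iterable[int]) -> bool:
        # Without a registered (non-empty) pattern, nothing can match.
        if not self._pattern:
            return False
        # Slide a fixed-size window over the stream and compare at each step.
        window = deque(maxlen=len(self._pattern))
        for bit in bits:
            window.append(bit)
            if tuple(window) == self._pattern:
                return True
        return False
```

After `registry.register([1, 0, 1, 1])`, the stream `[0, 0, 1, 0, 1, 1, 0]` would be reported as containing the pattern, at which point a device following claim 15 would generate its bypass command.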

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google LLC (Alphabet Inc.)
Inventors
Foerster, Jakob Nicolaus; Gruenstein, Alexander H.
Primary Examiner(s)
Hang, Vu B

Application Number
US15/959,508
Publication Number
US 20180315424A1
Time in Patent Office
442 Days
CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/08   Speech classification or se...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 2015/088   Word spotting

G10L 2015/223   Execution procedure of a sp...

G10L 2025/783   based on threshold decision

G10L 25/03   characterised by the type o...

G10L 25/78   Detection of presence or ab...

G10L 25/87   Detection of discrete point...
