DISTINGUISHABLE OPEN SOUNDS

US 20180122372A1
Filed: 10/31/2016
Published: 05/03/2018
Est. Priority Date: 10/31/2016
Status: Abandoned Application

Abstract

Systems for speech enabling devices perform methods of configuring distinct open sounds for different devices to indicate to users when each device is recognizing speech. Open sounds are stored both on computer-readable media within a device and on server systems to which devices interface over networks. Open sounds are a parameter of device personalities, and can be configured by system designers, users, or service providers. Devices detect the presence of others by spotting known open phrases, and provide distinctiveness by changing their selected open phrase. Server system providers analyze non-verbal and spoken phrase open sounds from developers using audio fingerprinting and speech recognition.
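
The abstract's idea of a device changing its open phrase after hearing another device use the same one can be sketched as follows. This is a minimal illustration only: the function name `choose_open_phrase`, the candidate list, and the "first unused candidate" policy are assumptions made for the example, not details taken from the application.

```python
def choose_open_phrase(current, candidates, heard):
    """Pick an open phrase that no nearby device has been heard using.

    current:    the phrase this device is configured with now
    candidates: ordered list of phrases this device may use (hypothetical)
    heard:      set of phrases spotted coming from other devices
    """
    # Keep the current phrase while it is still distinctive.
    if current not in heard:
        return current
    # Otherwise switch to the first candidate nobody else is using.
    for phrase in candidates:
        if phrase not in heard:
            return phrase
    # Every candidate is taken; fall back to the current phrase.
    return current


phrases = ["ready", "go ahead", "listening"]
print(choose_open_phrase("ready", phrases, {"ready"}))  # prints "go ahead"
```

How the set of heard phrases is populated (the spotting step itself) is outside this sketch.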

97 Citations

14 Claims

1. A non-transitory computer readable medium storing code that, when executed by one or more processors would cause the one or more processors to:
- receive input indicative of a selection of one of a plurality of distinguishable open sounds to be used for indicating that a system is receptive to a user query;
  
  capture audio through a microphone;
  
  digitize the audio into audio samples;
  
  perform sound spotting using a neural network algorithm on the audio samples, the neural network trained for a specific wake-up phrase;
  
  in response to the neural network spotting the specific wake-up phrase, receive speech input through the microphone, the speech input including an audible user query;
  
  further in response to spotting the specific wake-up phrase, read an open sound audio segment, corresponding to the selection, from a storage device; and
  
  output, through a speaker, the open sound audio segment indicating that a system is receptive to capturing the user's speech, wherein the user is able to distinguish between at least two speech enabled devices within a shared audible environment.
- Dependent Claims (2, 3, 4, 5, 6, 7)
  - 2. The non-transitory computer readable medium of claim 1, wherein the code, when executed by one or more processors, would cause the one or more processors to:
    - receive an end of utterance input indicating an end of utterance; and
      
      responsive to receiving the end of utterance input, read a close sound audio segment corresponding to the selection.
  - 3. The non-transitory computer readable medium of claim 1, wherein the input indicative of the selection is an input from the user.
  - 4. The non-transitory computer readable medium of claim 1, wherein the input indicative of a selection is also indicative of a selection of at least one of a plurality of wake-up phrases.
  - 5. The non-transitory computer readable medium of claim 1, wherein the code, when executed by one or more processors, would cause the one or more processors to:
    - receive an audio signal; and
      
      compare the audio signal to at least one alternative open sound audio segment, wherein the input indicative of the selection is conditioned upon not matching the audio signal to the at least one alternative open sound audio segments.
  - 6. The non-transitory computer readable medium of claim 1, wherein the code, when executed by one or more processors, would cause the one or more processors to:
    - receive ambient sound;
      
      compute loudness of the ambient sound; and
      
      adjust volume of the open sound audio segment output in response to the loudness of the ambient sound.
  - 7. The non-transitory computer readable medium of claim 1, wherein the code, when executed by one or more processors, would cause the one or more processors to:
    - provide, to the user, a menu of names corresponding to open sounds selected from the plurality of open sounds, wherein the input indicative of a selection of one of a plurality of open sounds is the user's selection from the menu.
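
Claim 6 recites computing the loudness of ambient sound and adjusting the open sound's output volume in response. One way this could look is sketched below. The RMS-in-dBFS loudness measure, the linear gain mapping, and every function name and default threshold here are illustrative assumptions; the claim does not specify any of them.

```python
import math


def rms_dbfs(samples):
    """RMS loudness of float samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def open_sound_gain(ambient_db, quiet_db=-60.0, loud_db=-20.0,
                    min_gain=0.2, max_gain=1.0):
    """Map ambient loudness to a playback gain (linear interpolation, clamped).

    The threshold and gain defaults are hypothetical tuning values.
    """
    if ambient_db <= quiet_db:
        return min_gain
    if ambient_db >= loud_db:
        return max_gain
    fraction = (ambient_db - quiet_db) / (loud_db - quiet_db)
    return min_gain + fraction * (max_gain - min_gain)


def scale_segment(segment, gain):
    """Apply the gain to an open sound segment, clipping to the valid range."""
    return [max(-1.0, min(1.0, s * gain)) for s in segment]


ambient = [0.01, -0.01, 0.01, -0.01]           # a quiet room, assumed samples
gain = open_sound_gain(rms_dbfs(ambient))
louder_or_softer = scale_segment([0.5, -0.5, 0.25], gain)
```

A louder room yields a larger gain, so the open sound stays audible without being needlessly loud in a quiet one.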

8. A non-transitory computer readable medium storing code that, when executed by one or more processors would cause the one or more processors to:
- receive a client request for an open sound selected from a plurality of distinguishable open sounds, the open sound to be used as an indication that the client is receptive to a user's query;
  
  according to an indication of which of the plurality of open sounds, read a corresponding open sound audio segment; and
  
  transmit the open sound audio segment to the client;
  
  capture audio through a microphone;
  
  digitize the audio into audio samples;
  
  perform sound spotting on the audio samples to detect a specific wake-up phrase;
  
  in response to detecting the specific wake-up phrase, output the open sound audio segment, through a speaker, indicating that the client is receptive to capturing the user's speech, wherein the user is able to distinguish between at least two speech enabled devices within a shared audible environment.
- Dependent Claims (9, 10, 11, 12, 13)
  - 9. The non-transitory computer readable medium of claim 8, wherein the code, when executed by the one or more processors, would also cause the one or more processors to determine the indication from the client request.
  - 10. The non-transitory computer readable medium of claim 8, wherein the code, when executed by the one or more processors, would also cause the one or more processors to:
    - store the indication; and
      
      read the indication.
  - 11. The non-transitory computer readable medium of claim 8, wherein the code, when executed by the one or more processors, would also cause the one or more processors to ensure that a plurality of types of device, each has a unique open sound audio segment.
  - 12. The non-transitory computer readable medium of claim 11, wherein the code, when executed by the one or more processors, would also cause the one or more processors to:
    - compare each of a plurality of sound audio segments;
      
      compute a difference score for each comparison; and
      
      provide a notification to a system operator responsive to the difference score being below a threshold.
  - 13. The non-transitory computer readable medium of claim 12, wherein the code, when executed by the one or more processors, would also cause the one or more processors to:
    - transcribe speech from a plurality of sound audio segments; and
      
      include the transcription in the comparison.
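
The pairwise comparison in claim 12 (compare segments, compute a difference score, notify an operator when the score falls below a threshold) can be sketched as below. The claim does not say how the score is computed; this example substitutes a deliberately naive stand-in, one minus the absolute cosine similarity of raw samples, where a real system would more plausibly compare audio fingerprints. The function names and the 0.1 threshold are assumptions for illustration.

```python
import math
from itertools import combinations


def difference_score(a, b):
    """Naive stand-in score: 1 - |cosine similarity| over the shared prefix.

    0.0 means the two segments have the same shape; 1.0 means unrelated.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 1.0
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    # Clamp so floating-point rounding cannot produce a negative score.
    return max(0.0, 1.0 - abs(dot / (norm_a * norm_b)))


def find_similar_pairs(segments, threshold=0.1):
    """Return (name_a, name_b, score) for every pair scoring below threshold."""
    flagged = []
    for (name_a, seg_a), (name_b, seg_b) in combinations(sorted(segments.items()), 2):
        score = difference_score(seg_a, seg_b)
        if score < threshold:
            flagged.append((name_a, name_b, score))
    return flagged


segments = {
    "chime": [0.0, 1.0, 0.0, -1.0],
    "chime_copy": [0.0, 2.0, 0.0, -2.0],   # same shape, different volume
    "beep": [1.0, 0.0, -1.0, 0.0],
}
for name_a, name_b, score in find_similar_pairs(segments):
    # Stand-in for the operator notification in the claim.
    print(f"too similar: {name_a} vs {name_b}")
```

Claim 13 adds speech transcription to the comparison; that would be an additional term in the score and is not modeled here.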

14. A natural language virtual assistant server system enabled to:
- receive and store at least one domain-specific natural language grammar from a first developer;
  
  receive and store at least one open sound selected from a plurality of distinguishable open sounds from the first developer;
  
  receive and store at least one domain-specific natural language grammar from a second developer;
  
  receive and store at least one open sound selected from the plurality of distinguishable open sounds from the second developer, the at least one open sound of the first developer being distinguishably different from the at least one open sound of the second developer;
  
  read and transmit the first open sound to a first device, the first device having a first wake-up phrase; and
  
  read and transmit the second open sound to a second device;
  
  capture audio through a first microphone of the first device and through a second microphone of the second device;
  
  digitize the audio into an audio sample;
  
  perform sound spotting on the audio sample at the first device and the second device, to determine if there is a match between the audio sample and at least one of the first wake-up phrase and the second wake-up phrase; and
  
  in response to determining a match between the audio sample and at least one of the first wake-up phrase and the second wake-up phrase, activate one of the first device and the second device to output through that device's speaker the corresponding open sound indicating that the corresponding device is receptive to capturing speech, wherein a user is able to distinguish between the first device and the second device within a shared audible environment based on the device's corresponding open sound.
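
The server-side bookkeeping in claim 14, where each developer's open sound must differ from every other developer's, might be modeled roughly as follows. The class name `OpenSoundRegistry`, its methods, and the use of a plain identifier match as the distinctness test are all assumptions for this sketch; the claim requires the sounds to be audibly distinguishable, which would need an audio comparison rather than an ID check.

```python
class OpenSoundRegistry:
    """Hypothetical store mapping each developer to one open sound ID."""

    def __init__(self):
        self._sound_by_developer = {}

    def register(self, developer, sound_id):
        """Store a developer's open sound, rejecting one another developer uses."""
        for other, taken in self._sound_by_developer.items():
            if other != developer and taken == sound_id:
                raise ValueError(f"{sound_id!r} is already used by {other!r}")
        self._sound_by_developer[developer] = sound_id

    def sound_for(self, developer):
        """Read back the open sound to transmit to that developer's devices."""
        return self._sound_by_developer[developer]


registry = OpenSoundRegistry()
registry.register("first_developer", "chime")
registry.register("second_developer", "beep")
print(registry.sound_for("first_developer"))  # prints "chime"
```

Grammar storage, wake-up phrase spotting, and device activation from the claim are left out to keep the example focused on the distinctness constraint.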

Current Assignee: Soundhound AI IP LLC
Original Assignee: SoundHound Incorporated
Inventors: Wanderlust, Moxie
Application Number: US15/339,291
Publication Number: US 20180122372A1
CPC Class Codes

G06F 3/167 Audio in a user interface, ...

G10L 15/22 Procedures used during a sp...
