Application focus in speech-based systems
First Claim
1. A system, comprising:
a command service configured to:
communicate with multiple applications, communicate with an audio device, and send a command to the audio device to perform an activity for an audio application that provides audio content to be played by the audio device, wherein the command specifies an application identifier corresponding to the audio application;
control logic configured to perform acts comprising:
receiving an event message from the audio device regarding sound played by the audio device, wherein the event message specifies the application identifier corresponding to the audio application;
if the event message indicates that the sound played by the audio device is part of a speech interaction with a user, designating the audio application as being primarily active;
if the event message indicates that the sound played by the audio device is not part of a speech interaction with a user, designating the audio application as being secondarily active;
a speech recognition service configured to receive an audio signal from the audio device and to recognize user speech in the audio signal;
a language understanding service configured to determine a meaning of the user speech;
the control logic being configured to perform further actions comprising:
if there is a primarily active application among the multiple applications, requesting that the primarily active application respond to the user speech by (a) performing a first action that is indicated at least in part by the meaning of the user speech or (b) generating a first speech response to the user speech; and
if there is no primarily active application among the multiple applications and if there is a secondarily active application among the multiple applications, requesting that the secondarily active application respond to the user speech by (a) performing a second action that is indicated at least in part by the meaning of the user speech or (b) generating a second speech response to the user speech.
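The designation and selection steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the names `EventMessage`, `FocusTracker`, `on_event`, and `select_responder` are hypothetical, and the sketch assumes an application stays primarily or secondarily active until a later event replaces it, which the claim does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventMessage:
    """Hypothetical event message sent by the audio device."""
    app_id: str               # application identifier echoed back by the device
    speech_interaction: bool  # True if the sound is part of a speech interaction


class FocusTracker:
    """Tracks which application is primarily or secondarily active."""

    def __init__(self) -> None:
        self.primary: Optional[str] = None
        self.secondary: Optional[str] = None

    def on_event(self, event: EventMessage) -> None:
        # Sound that is part of a speech interaction makes the application
        # primarily active; any other sound makes it secondarily active.
        if event.speech_interaction:
            self.primary = event.app_id
        else:
            self.secondary = event.app_id

    def select_responder(self) -> Optional[str]:
        # Prefer the primarily active application; use the secondarily
        # active one only when no application is primarily active.
        if self.primary is not None:
            return self.primary
        return self.secondary
```

For example, after a music application reports ordinary playback, `select_responder()` returns that application; once a second application reports speech output, it takes precedence.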
Abstract
A speech-based system includes an audio device in a user premises and a network-based service that supports use of the audio device by multiple applications. The audio device may be directed to play audio content such as music, audio books, etc. The audio device may also be directed to interact with a user through speech. The network-based service monitors event messages received from the audio device to determine which of the multiple applications currently has speech focus. When receiving speech from a user, the service first offers the corresponding meaning to the application, if any, that currently has primary speech focus. If there is no application that currently has primary speech focus, or if the application having primary speech focus is not able to respond to the meaning, the service then offers the user meaning to the application that currently has secondary speech focus.
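The offer order described in the abstract, primary speech focus first and secondary speech focus as a fallback, might be sketched as follows. The function `offer_meaning`, the `Handler` type, and the convention that a handler returns `True` when it can respond are assumptions made for illustration, not details taken from the patent.

```python
from typing import Callable, Dict, Optional

# Assumed convention: a handler returns True if its application can respond
# to the meaning, and False if it declines.
Handler = Callable[[str], bool]


def offer_meaning(
    meaning: str,
    primary: Optional[str],
    secondary: Optional[str],
    handlers: Dict[str, Handler],
) -> Optional[str]:
    """Return the ID of the application that accepted the meaning, if any."""
    for app_id in (primary, secondary):
        if app_id is None:
            continue  # no application currently holds this level of focus
        handler = handlers.get(app_id)
        if handler is not None and handler(meaning):
            return app_id
    return None
```

Under these assumptions, a meaning such as "pause" that the application with primary focus declines is passed on to the application with secondary focus.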
20 Claims
1. A system, comprising:
a command service configured to:
communicate with multiple applications, communicate with an audio device, and send a command to the audio device to perform an activity for an audio application that provides audio content to be played by the audio device, wherein the command specifies an application identifier corresponding to the audio application;
control logic configured to perform acts comprising:
receiving an event message from the audio device regarding sound played by the audio device, wherein the event message specifies the application identifier corresponding to the audio application;
if the event message indicates that the sound played by the audio device is part of a speech interaction with a user, designating the audio application as being primarily active;
if the event message indicates that the sound played by the audio device is not part of a speech interaction with a user, designating the audio application as being secondarily active;
a speech recognition service configured to receive an audio signal from the audio device and to recognize user speech in the audio signal;
a language understanding service configured to determine a meaning of the user speech;
the control logic being configured to perform further actions comprising:
if there is a primarily active application among the multiple applications, requesting that the primarily active application respond to the user speech by (a) performing a first action that is indicated at least in part by the meaning of the user speech or (b) generating a first speech response to the user speech; and
if there is no primarily active application among the multiple applications and if there is a secondarily active application among the multiple applications, requesting that the secondarily active application respond to the user speech by (a) performing a second action that is indicated at least in part by the meaning of the user speech or (b) generating a second speech response to the user speech.
Dependent claims: 2, 3, 4
5. A method, comprising:
providing a command to an audio device to perform an activity, wherein the command identifies a responsible application from among multiple applications;
receiving an event message from the audio device regarding sound presented by the audio device, the event message identifying the responsible application;
if the event message indicates that the sound is part of a user interaction, designating the responsible application as being primarily active;
receiving speech captured by the audio device;
determining a meaning of the speech; and
if there is a primarily active application among the multiple applications that can respond to the meaning, requesting the primarily active application to respond to the meaning.
Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13
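The method of claim 5 depends on the application identifier travelling from the command to the event message, and on checking whether the primarily active application can respond to a meaning. The sketch below illustrates that round trip under stated assumptions: `Command`, `EventMessage`, `simulate_device`, `responder_for`, and the activity name "speak" are all hypothetical, and the device is replaced by a stand-in function.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Command:
    app_id: str    # identifies the responsible application
    activity: str  # hypothetical activity name, e.g. "speak" or "play_music"


@dataclass
class EventMessage:
    app_id: str             # copied from the command that caused the sound
    user_interaction: bool  # True if the sound is part of a user interaction


def simulate_device(command: Command) -> EventMessage:
    """Stand-in for the audio device: reports the sound it presented."""
    # Assumption: only the "speak" activity counts as a user interaction.
    return EventMessage(command.app_id, user_interaction=(command.activity == "speak"))


def responder_for(
    meaning: str,
    primary: Optional[str],
    capabilities: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the primarily active application if it can respond to the meaning."""
    if primary is not None and meaning in capabilities.get(primary, set()):
        return primary
    return None


# Usage: a command for a hypothetical timer application produces speech, so
# the resulting event message makes that application primarily active.
event = simulate_device(Command("timer_app", "speak"))
primary = event.app_id if event.user_interaction else None
chosen = responder_for("cancel_timer", primary, {"timer_app": {"cancel_timer"}})
```

Because the identifier is carried in both messages, the service does not need to infer which application caused a given sound.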
14. A method, comprising:
receiving a first event message from a device regarding a first action performed by the device, the event message identifying a first responsible application from among multiple applications, wherein each of the multiple applications can respond to one or more meanings expressed by user speech;
determining that the first action is part of a user interaction;
designating the first responsible application as being primarily active;
identifying a first meaning of first user speech; and
determining that there is a primarily active application among the multiple applications that can respond to the first meaning; and
selecting the primarily active application to respond to the first meaning.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification