Speech-responsive portable speaker
First Claim
1. A portable music device comprising:
a microphone;
a speaker;
a talk button;
a wireless communications interface configured to communicate with a speech support service server over a wide-area network;
the portable music device being configured to operate in a first mode when the portable music device is not receiving external power;
the portable music device being configured to operate in a second mode when the portable music device is receiving external power;
wherein operating in the first mode comprises:
detecting actuation of the talk button;
receiving first speech input, the first speech input including information about a first song to be played;
generating first audio data using the microphone, the first audio data corresponding to the first speech input;
sending the first audio data to the speech support service server;
receiving second audio data from the speech support service server, wherein the second audio data corresponds to the first song; and
playing the first song using the speaker;
wherein operating in the second mode comprises:
receiving second speech input, the second speech input corresponding to a trigger expression;
receiving third speech input, the third speech input including information about a second song to be played;
identifying, by the portable music device, the second song using the third speech input; and
playing the second song using the speaker.
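The two modes recited in claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the class `PortableMusicDevice`, the `FakeSupportService` stub, the trigger word "wake", the "play ..." phrasing, and the use of text in place of real audio data are all hypothetical and introduced here for clarity.

```python
from dataclasses import dataclass, field
from typing import List


def song_from(request: str) -> str:
    """Extract a song name from a request such as 'play Blue Moon' (assumed phrasing)."""
    prefix = "play "
    return request[len(prefix):].strip() if request.startswith(prefix) else request.strip()


class FakeSupportService:
    """Hypothetical stand-in for the remote speech support service server."""

    def handle_audio(self, audio_data: bytes) -> bytes:
        # Pretend to recognize the request and return audio for the song.
        song = song_from(audio_data.decode("utf-8"))
        return f"<audio:{song}>".encode("utf-8")


@dataclass
class PortableMusicDevice:
    service: FakeSupportService
    trigger: str = "wake"            # assumed trigger expression
    external_power: bool = False     # False selects the first mode
    played: List[str] = field(default_factory=list)

    def press_talk_and_speak(self, speech: str) -> None:
        """First mode: talk button pressed, speech handled by the server."""
        if self.external_power:
            return  # simplification: this sketch uses the button only on battery
        first_audio = speech.encode("utf-8")                  # first audio data
        second_audio = self.service.handle_audio(first_audio)  # second audio data
        self.played.append(second_audio.decode("utf-8"))

    def hear(self, speech: str) -> None:
        """Second mode: trigger expression, then on-device song identification."""
        if not self.external_power:
            return  # simplification: no always-on listening on battery
        words = speech.split()
        if not words or words[0] != self.trigger:
            return  # no trigger expression, so the speech is ignored
        song = song_from(" ".join(words[1:]))  # identified by the device itself
        self.played.append(f"<local:{song}>")
```

In this sketch, a device on battery power responds only to `press_talk_and_speak`, while a device with `external_power=True` responds only to `hear` calls that begin with the trigger word.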
Abstract
A portable music device may operate in response to user speech. In situations in which the music device is operating primarily from battery power, a push-to-talk (PTT) button may be used to indicate when the user is directing speech to the device. When the music device is receiving external power, the music device may continuously monitor a microphone signal to detect a user utterance of a wakeword, which may be used to indicate that subsequent speech is directed to the device. When operating from battery power, the device may send audio to a network-based support service for speech recognition and natural language understanding. When operating from external power, the speech recognition and/or natural language understanding may be performed by the music device itself.
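The wakeword behavior described in the abstract, where speech that follows the wakeword is treated as directed to the device, might look something like the sketch below. It assumes a stream of already-separated word tokens in place of a raw microphone signal; the wakeword "wake", the `<pause>` end-of-utterance marker, and the function name `directed_utterances` are hypothetical. Real wakeword detection operates on audio and is considerably more involved.

```python
from typing import Iterable, Iterator, List

WAKEWORD = "wake"   # assumed wakeword
PAUSE = "<pause>"   # assumed marker for the end of an utterance


def directed_utterances(tokens: Iterable[str]) -> Iterator[List[str]]:
    """Yield the words spoken after each wakeword, up to the next pause."""
    capturing = False
    current: List[str] = []
    for token in tokens:
        if not capturing:
            # Speech before a wakeword is not directed to the device.
            capturing = token == WAKEWORD
        elif token == PAUSE:
            # The pause closes the utterance that followed the wakeword.
            if current:
                yield current
            capturing, current = False, []
        else:
            current.append(token)
    # Flush an utterance that was still open when the stream ended.
    if capturing and current:
        yield current
```

For example, given the tokens `nice weather <pause> wake play jazz <pause> louder`, only `play jazz` is yielded; the words before the wakeword and after the closing pause are ignored.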
20 Claims
1. A portable music device, recited in full under First Claim above.
Dependent claims: 2
3. A portable device comprising:
a microphone;
a talk actuator;
a power detector configured to detect a first power state and a second power state of the portable device;
the portable device being configured to operate in a first mode when in the first power state and a second mode when in the second power state;
wherein operating in the first mode comprises:
detecting actuation of the talk actuator;
generating, based at least in part on the actuation of the talk actuator, first audio data corresponding to first speech input;
sending the first audio data to a speech support service server that is external to the portable device;
receiving second audio data from the speech support service server, wherein the second audio data is based at least in part on the first audio data; and
outputting audible content corresponding to the second audio data; and
wherein operating in the second mode comprises:
receiving second speech input;
generating third audio data corresponding to the second speech input; and
analyzing the third audio data.
Dependent claims: 4, 5, 6, 7, 8, 9, 10, 11, 12
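The relationship between the power detector and the two modes in claim 3 can be illustrated with a small state sketch. The names `PowerState`, `Mode`, and `ModeController` are hypothetical, and mapping the first power state to battery operation is an assumption drawn from the abstract rather than something claim 3 itself requires.

```python
from enum import Enum
from typing import List, Tuple


class PowerState(Enum):
    BATTERY = "battery"      # assumed to be the first power state
    EXTERNAL = "external"    # assumed to be the second power state


class Mode(Enum):
    PUSH_TO_TALK = 1   # first mode: talk actuator, audio sent to the server
    LISTENING = 2      # second mode: speech input captured and analyzed


def detect_power_state(external_power_present: bool) -> PowerState:
    """Hypothetical power detector: classify a single power reading."""
    return PowerState.EXTERNAL if external_power_present else PowerState.BATTERY


def mode_for(state: PowerState) -> Mode:
    """Select the operating mode that corresponds to a power state."""
    return Mode.LISTENING if state is PowerState.EXTERNAL else Mode.PUSH_TO_TALK


class ModeController:
    """Tracks the current mode and records each change of mode."""

    def __init__(self) -> None:
        self.mode = Mode.PUSH_TO_TALK
        self.transitions: List[Tuple[Mode, Mode]] = []

    def on_power_reading(self, external_power_present: bool) -> Mode:
        new_mode = mode_for(detect_power_state(external_power_present))
        if new_mode is not self.mode:
            # Record the switch only when the mode actually changes.
            self.transitions.append((self.mode, new_mode))
            self.mode = new_mode
        return self.mode
```

Recording a transition only on an actual change keeps repeated identical power readings from restarting whatever the device does on entering a mode.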
13. A method, comprising:
operating a device in a first mode; and
operating the device in a second mode;
wherein operating in the first mode comprises:
detecting actuation of a physical talk actuator;
generating, based at least in part on the actuation of the physical talk actuator, first audio data corresponding to first speech input; and
sending the first audio data to a network-accessible speech support service server, wherein the network-accessible speech support service server is configured to analyze the first audio data to recognize words of the first speech input; and
wherein operating in the second mode comprises:
receiving second speech input;
generating second audio data corresponding to the second speech input; and
analyzing the second audio data.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification