Personal speech assistant supporting a dialog manager
Abstract
A Personal Speech Assistant (PSA) is a computing apparatus which provides a spoken language interface to another apparatus to which it is attached by supporting execution of a conversational dialog manager and its supporting service engines. In operation, a PSA is connected to a device which provides some service to a user. Any “appliance” is a candidate for enhancement with the PSA. Devices such as, for example, video cassette recorders (VCRs) or Personal Digital Assistants (PDAs), which offer rich, but frequently difficult interfaces, may be made more useful by the integration of a PSA according to the invention. It is a preferred feature of a dialog manager used by the PSA that the user interface properties, in terms of the vocabulary the device understands, the informative prompts it provides, and other aspects of its conversational behavior, are all easily modified to correspond to the preferences or limitations of the user.
15 Claims
1. Apparatus for providing a portable spoken language interface for a user to a device in communication with the apparatus, the device having at least one application associated therewith, the spoken language interface apparatus comprising:
an audio input system for receiving speech data provided by the user;
an audio output system for outputting speech data to the user;
a speech decoding engine for generating a decoded output in response to spoken utterances;
a speech synthesizing engine for generating a synthesized speech output in response to text data;
a dialog manager operatively coupled to the device, the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, the dialog manager being configurable for supporting a plurality of applications residing concurrently in the device and allocating one or more resources associated with the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine between the plurality of applications; and
at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application of the device;
wherein:
(i) the dialog manager enables connection between the input audio system and the speech decoding engine such that the spoken utterance provided by the user is provided from the input audio system to the speech decoding engine;
(ii) the speech decoding engine decodes the spoken utterance to generate a decoded output which is returned to the dialog manager;
(iii) the dialog manager uses the decoded output to search the user interface data set for a corresponding spoken language interface element and data which is returned to the dialog manager when found;
(iv) the dialog manager provides the spoken language interface element associated data to the application of the device for processing in accordance therewith;
(v) the application of the device, on processing that element, provides a reference to an interface element to be spoken;
(vi) the dialog manager enables connection between the audio output system and the speech synthesizing engine such that the speech synthesizing engine which, accepting data from that element, generates a synthesized output that expresses that element; and
(vii) the audio output system audibly presenting the synthesized output to the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15
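As an illustration only, the sequence recited in steps (i) through (vii) of claim 1 can be sketched as a single dialog turn. Nothing below is taken from the specification: the class and function names (`DialogManager`, `InterfaceElement`, `fake_decode`, `fake_synthesize`, `vcr_application`), the dictionary layout of the user interface data set, and the sample phrase and prompt are all hypothetical, and the speech engines are replaced by trivial stand-ins so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class InterfaceElement:
    """One entry of the user interface data set (hypothetical layout)."""
    phrase: str    # spoken language interface element
    app_data: str  # associated data recognizable by the device application


class DialogManager:
    """Routes one utterance through decoding, lookup, application, synthesis."""

    def __init__(
        self,
        decode: Callable[[bytes], str],
        synthesize: Callable[[str], bytes],
        application: Callable[[str], str],
        ui_data_set: Dict[str, InterfaceElement],
        prompts: Dict[str, str],
    ) -> None:
        self.decode = decode
        self.synthesize = synthesize
        self.application = application
        self.ui_data_set = ui_data_set
        self.prompts = prompts

    def handle_utterance(self, audio: bytes) -> Optional[bytes]:
        # (i)-(ii): pass the captured audio to the decoding engine.
        decoded = self.decode(audio)
        # (iii): search the user interface data set for a matching element.
        element = self.ui_data_set.get(decoded)
        if element is None:
            return None  # nothing recognized; a real system might reprompt
        # (iv)-(v): the application processes the data and returns a
        # reference to the interface element that should be spoken.
        prompt_ref = self.application(element.app_data)
        # (vi): the synthesizing engine renders the referenced element.
        return self.synthesize(self.prompts[prompt_ref])


# Stand-ins for the engines and the device application (assumptions).
def fake_decode(audio: bytes) -> str:
    return audio.decode("utf-8").strip().lower()


def fake_synthesize(text: str) -> bytes:
    return ("<speech>" + text + "</speech>").encode("utf-8")


def vcr_application(command: str) -> str:
    return "ack_" + command.lower()


manager = DialogManager(
    decode=fake_decode,
    synthesize=fake_synthesize,
    application=vcr_application,
    ui_data_set={"start recording": InterfaceElement("start recording", "RECORD")},
    prompts={"ack_record": "Recording started."},
)

# (vii): the returned bytes would be handed to the audio output system.
output = manager.handle_utterance(b"Start recording")
```

In this sketch the vocabulary and prompts live in plain dictionaries, which is one simple way to keep them editable without touching the engines or the application; the claim itself does not prescribe any particular storage format.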
13. Apparatus for providing a portable spoken language interface for a user to a device in communication with the apparatus, the device having at least one application associated therewith, the apparatus comprising:
an audio input system for receiving speech data provided by the user;
an audio output system for outputting speech data to the user;
a speech decoding engine for generating a decoded output in response to spoken utterances;
a speech synthesizing engine for generating a synthesized speech output in response to text data;
a dialog manager operatively coupled to the device, the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, the dialog manager being configurable for supporting a plurality of applications residing concurrently in the device and allocating one or more resources associated with the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine between the plurality of applications; and
at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application of the device.
Dependent claims: 14
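The claim requires the dialog manager to allocate the audio and engine resources between several concurrently resident applications, but it does not say how. The following is a minimal sketch of one possible policy, assuming exclusive ownership of each resource with a first-come, first-served waiting queue. The `ResourceAllocator` class, the resource names, and the application names are hypothetical and are not drawn from the patent.

```python
from collections import deque
from typing import Deque, Dict, Optional

# Hypothetical names for the four resources recited in the claim.
RESOURCES = ("audio_input", "audio_output", "speech_decoder", "speech_synthesizer")


class ResourceAllocator:
    """Grants each resource to one application at a time (assumed policy)."""

    def __init__(self) -> None:
        self.owner: Dict[str, Optional[str]] = {r: None for r in RESOURCES}
        self.waiting: Dict[str, Deque[str]] = {r: deque() for r in RESOURCES}

    def request(self, app: str, resource: str) -> bool:
        """Return True if the app now holds the resource, else queue it."""
        if self.owner[resource] in (None, app):
            self.owner[resource] = app
            return True
        if app not in self.waiting[resource]:
            self.waiting[resource].append(app)
        return False

    def release(self, app: str, resource: str) -> Optional[str]:
        """Release the resource and return its new owner, if any."""
        if self.owner[resource] != app:
            return self.owner[resource]  # only the owner may release
        queue = self.waiting[resource]
        self.owner[resource] = queue.popleft() if queue else None
        return self.owner[resource]


allocator = ResourceAllocator()
first = allocator.request("calendar", "speech_decoder")       # granted
second = allocator.request("address_book", "speech_decoder")  # queued
next_owner = allocator.release("calendar", "speech_decoder")  # handed over
```

Other policies, such as priority-based preemption or sharing the audio output between applications, would satisfy the same claim language; the queue is used here only because it is short to express.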
Specification