Personal speech assistant supporting a dialog manager

US 6,748,361 B1
Filed: 12/14/1999
Issued: 06/08/2004
Est. Priority Date: 12/14/1999
Status: Expired due to Fees

Abstract

A Personal Speech Assistant (PSA) is a computing apparatus which provides a spoken language interface to another apparatus to which it is attached by supporting execution of a conversational dialog manager and its supporting service engines. In operation, a PSA is connected to a device which provides some service to a user. Any “appliance” is a candidate for enhancement with the PSA. Devices such as, for example, video cassette recorders (VCRs) or Personal Digital Assistants (PDAs), which offer rich, but frequently difficult interfaces, may be made more useful by the integration of a PSA according to the invention. It is a preferred feature of a dialog manager used by the PSA that the user interface properties, in terms of the vocabulary the device understands, the informative prompts it provides, and other aspects of its conversational behavior, are all easily modified to correspond to the preferences or limitations of the user.
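
The customization described in the abstract can be pictured as a per-user layer over a default user interface data set. The following is a minimal sketch under stated assumptions, not the patent's implementation: the class name `UserInterfaceDataSet`, its fields, the `customized` method, and the VCR command strings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UserInterfaceDataSet:
    """Hypothetical data set: spoken phrases mapped to device commands, plus prompts."""

    vocabulary: Dict[str, str] = field(default_factory=dict)  # phrase -> command
    prompts: Dict[str, str] = field(default_factory=dict)     # command -> spoken reply

    def customized(
        self,
        vocabulary: Optional[Dict[str, str]] = None,
        prompts: Optional[Dict[str, str]] = None,
    ) -> "UserInterfaceDataSet":
        """Return a new data set with user overrides layered on these defaults."""
        return UserInterfaceDataSet(
            vocabulary={**self.vocabulary, **(vocabulary or {})},
            prompts={**self.prompts, **(prompts or {})},
        )


# Assumed default interface for a VCR-like device.
defaults = UserInterfaceDataSet(
    vocabulary={"record": "VCR_RECORD", "stop": "VCR_STOP"},
    prompts={"VCR_RECORD": "Recording started.", "VCR_STOP": "Stopped."},
)

# A user adds a preferred phrase and rewords one prompt; defaults are untouched.
personal = defaults.customized(
    vocabulary={"tape this": "VCR_RECORD"},
    prompts={"VCR_RECORD": "OK, taping now."},
)
```

Because the overrides produce a new object, the device's default vocabulary and prompts remain available to other users.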

119 Citations

15 Claims

1. Apparatus for providing a portable spoken language interface for a user to a device in communication with the apparatus, the device having at least one application associated therewith, the spoken language interface apparatus comprising:
- an audio input system for receiving speech data provided by the user;
  
  an audio output system for outputting speech data to the user;
  
  a speech decoding engine for generating a decoded output in response to spoken utterances;
  
  a speech synthesizing engine for generating a synthesized speech output in response to text data;
  
  a dialog manager operatively coupled to the device, the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, the dialog manager being configurable for supporting a plurality of applications residing concurrently in the device and allocating one or more resources associated with the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine between the plurality of applications; and
  
  at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application of the device;
  
  wherein;
  
  (i) the dialog manager enables connection between the input audio system and the speech decoding engine such that the spoken utterance provided by the user is provided from the input audio system to the speech decoding engine;
  
  (ii) the speech decoding engine decodes the spoken utterance to generate a decoded output which is returned to the dialog manager;
  
  (iii) the dialog manager uses the decoded output to search the user interface data set for a corresponding spoken language interface element and data which is returned to the dialog manager when found;
  
  (iv) the dialog manager provides the spoken language interface element associated data to the application of the device for processing in accordance therewith;
  
  (v) the application of the device, on processing that element, provides a reference to an interface element to be spoken;
  
  (vi) the dialog manager enables connection between the audio output system and the speech synthesizing engine such that the speech synthesizing engine which, accepting data from that element, generates a synthesized output that expresses that element; and
  
  (vii) the audio output system audibly presenting the synthesized output to the user.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15)
  - 2. The apparatus of claim 1, wherein the device is a personal digital assistant.
  - 3. The apparatus of claim 1, further comprising a communications interface between the apparatus and the device.
  - 4. The apparatus of claim 3, wherein the communications interface comprises at least one of a bus connection, a serial connection, an infrared connection and a radio frequency connection.
  - 5. The apparatus of claim 1, further comprising a power management engine for affecting control of power associated with components of the apparatus.
  - 6. The apparatus of claim 1, wherein the audio input system comprises a microphone.
  - 7. The apparatus of claim 1, wherein the audio output system comprises a speaker.
  - 8. The apparatus of claim 1, wherein the audio output system comprises an earphone.
  - 9. The apparatus of claim 1, wherein the apparatus further comprises at least one state indicator for conveying a status of the apparatus to the user.
  - 10. The apparatus of claim 1, wherein the apparatus further comprises one or more additional engines for providing speech related functions, the one or more engines also being operatively coupled to the dialog manager.
  - 11. The apparatus of claim 10, wherein the one or more engines include at least one of a user verification engine, and an audio recording engine.
  - 12. The apparatus of claim 1, wherein the dialog manager, in conjunction with at least one of the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, is operative to provide a conversational spoken language interface for supporting spoken conversations between the user and the device.
  - 15. The apparatus of claim 12, wherein the dialog manager, in conjunction with at least one of the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, is operative to provide a conversational spoken language interface for supporting spoken conversations between the user and the device.
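
Steps (i) through (vii) of claim 1 describe a round trip from spoken input to spoken output. The sketch below traces that flow under loud assumptions: the engines are stand-in callables (the "decoder" merely lowercases ASCII bytes and the "synthesizer" wraps text in a tag), and every name, including `DialogManager`, `handle_utterance`, and the command strings, is hypothetical rather than taken from the patent.

```python
from typing import Callable, Dict, List, Optional


class DialogManager:
    """Illustrative dialog manager wiring stub engines to one application."""

    def __init__(
        self,
        decode: Callable[[bytes], str],      # stand-in speech decoding engine
        synthesize: Callable[[str], str],    # stand-in speech synthesizing engine
        play: Callable[[str], None],         # stand-in audio output system
        ui_data: Dict[str, str],             # decoded phrase -> application data
        prompts: Dict[str, str],             # prompt reference -> text to speak
        application: Callable[[str], str],   # returns a prompt reference
    ) -> None:
        self.decode = decode
        self.synthesize = synthesize
        self.play = play
        self.ui_data = ui_data
        self.prompts = prompts
        self.application = application

    def handle_utterance(self, audio: bytes) -> Optional[str]:
        decoded = self.decode(audio)              # (i), (ii): route audio, decode it
        element = self.ui_data.get(decoded)       # (iii): search the UI data set
        if element is None:
            return None                           # nothing matched; stay silent
        prompt_ref = self.application(element)    # (iv), (v): app returns a reference
        synthesized = self.synthesize(self.prompts[prompt_ref])  # (vi)
        self.play(synthesized)                    # (vii): present to the user
        return synthesized


played: List[str] = []
manager = DialogManager(
    decode=lambda audio: audio.decode("ascii").strip().lower(),
    synthesize=lambda text: f"<speech:{text}>",
    play=played.append,
    ui_data={"record": "VCR_RECORD"},
    prompts={"RECORD_STARTED": "Recording started."},
    application=lambda command: "RECORD_STARTED" if command == "VCR_RECORD" else "UNKNOWN",
)
result = manager.handle_utterance(b"Record")
```

Returning `None` for an unmatched utterance is a design choice of this sketch; the claim itself only covers the case where a matching interface element is found.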

13. Apparatus for providing a portable spoken language interface for a user to a device in communication with the apparatus, the device having at least one application associated therewith, the apparatus comprising:
- an audio input system for receiving speech data provided by the user;
  
  an audio output system for outputting speech data to the user;
  
  a speech decoding engine for generating a decoded output in response to spoken utterances;
  
  a speech synthesizing engine for generating a synthesized speech output in response to text data;
  
  a dialog manager operatively coupled to the device, the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine, the dialog manager being configurable for supporting a plurality of applications residing concurrently in the device and allocating one or more resources associated with the audio input system, the audio output system, the speech decoding engine and the speech synthesizing engine between the plurality of applications; and
  
  at least one user interface data set operatively coupled to the dialog manager, the user interface data set representing spoken language interface elements and data recognizable by the application of the device.
- View Dependent Claims (14)
  - 14. The apparatus of claim 13, wherein the dialog manager is configurable for launching one or more applications in response to a decoded output recognized by the speech decoding engine and indicating which of the launched applications is a current application.
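
Claims 13 and 14 have the dialog manager support several concurrently resident applications, launch them from decoded speech, and indicate which one is current. One way to picture this is a small registry; the sketch assumes a simple policy in which the most recently requested application becomes current (and would therefore be the one granted the speech engines). The claims do not specify an allocation policy, and the names `ApplicationRegistry` and `on_decoded` and the launch phrases are hypothetical.

```python
from typing import Dict, List, Optional


class ApplicationRegistry:
    """Illustrative tracker of launched applications and the current one."""

    def __init__(self, launch_phrases: Dict[str, str]) -> None:
        self.launch_phrases = launch_phrases  # decoded phrase -> application name
        self.launched: List[str] = []         # applications launched so far
        self.current: Optional[str] = None    # application currently in focus

    def on_decoded(self, decoded: str) -> Optional[str]:
        """Launch or refocus an application if the decoded output names one."""
        app = self.launch_phrases.get(decoded)
        if app is None:
            return self.current               # not a launch phrase; focus unchanged
        if app not in self.launched:
            self.launched.append(app)         # first request launches the application
        self.current = app                    # assumed policy: latest request wins
        return self.current


registry = ApplicationRegistry(
    {"open calendar": "calendar", "open mail": "mail"}
)
registry.on_decoded("open calendar")
registry.on_decoded("open mail")
registry.on_decoded("open calendar")  # already launched; only focus changes
```

After these three calls, both applications remain launched and `registry.current` names the calendar, which is the indication of the current application that claim 14 refers to.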

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Frank, David Carl; Nahamoo, David; Comerford, Liam David
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Storm, Donald L.
Application Number: US09/460,077
Time in Patent Office: 1,638 Days
Field of Search: 704/275, 704/270, 704/258, 704/251, 704/231
US Class Current: 704/275
CPC Class Codes: G10L 15/28 Constructional details of s...
