Conversation management in speech recognition interfaces
Abstract
Conversation management is provided in a computer speech recognition system by generating a number of graphical and audio interfaces. The system allocates video and speech for different purposes to cue a user when to speak. The system also makes use of video and speech to cue a user regarding how to speak, thereby alleviating the vocabulary problems common to many speech recognition interfaces. These techniques permit a user to be informed when and how to speak during a fairly complex situation such as an interview.
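The allocation described in the abstract can be illustrated with a minimal sketch. It assumes, following the dependent claims, that the video channel establishes context and the synthesized channel shows example phrasing before prompting the user. The names `Channel`, `Cue`, and `cues_for_question` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    VIDEO = "video"              # recorded or live human actor
    SYNTHESIZED = "synthesized"  # computer-synthesized actor


@dataclass
class Cue:
    channel: Channel
    message: str
    opens_microphone: bool = False  # True when the user is expected to speak next


def cues_for_question(context: str, question: str, sample_answers: list[str]) -> list[Cue]:
    """Build one question's cue sequence: context, vocabulary examples, then the prompt."""
    # Assumption: the video actor sets the scene (cues *when* a topic begins).
    cues = [Cue(Channel.VIDEO, context)]
    # Assumption: the synthesized actor models acceptable phrasing (cues *how* to speak).
    for answer in sample_answers:
        cues.append(Cue(Channel.SYNTHESIZED, f'You can say: "{answer}"'))
    # Only the final prompt opens the microphone for speech recognition.
    cues.append(Cue(Channel.SYNTHESIZED, question, opens_microphone=True))
    return cues
```

In this sketch the microphone opens only on the last cue, so the user hears the context and the sample vocabulary before being asked to speak.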
11 Claims
1. A computer apparatus programmed with a routine set of instructions for managing conversation in a speech recognition interface, said instructions being stored in a fixed medium, said computer comprising:
means for capturing video of a non-synthesized human video actor;
means for concurrently generating a first graphical user interface for displaying said captured video in a video environment display and a second graphical user interface for displaying a computer-synthesized actor in a synthesized environment display;
means for generating an audio output interface for audibly transmitting audio information associated with said first and second graphical user interfaces; and
means for generating an audio input interface for receiving audible information as an input for said speech recognition interface.
means for originating an information content in at least one of captured video and live video transfer; and
means for originating an information content for said synthesized environment in an acted performance and text-to-speech conversion of speech from said performance.
5. The computer apparatus of claim 1, further comprising:
means for establishing a context for said speech recognition interface with said video environment; and
means for providing examples of how to speak and examples of a proper vocabulary with said synthesized environment.
6. The computer apparatus of claim 1, further comprising:
means for providing predetermined instructions for using said speech recognition interface with said video environment; and
means for answering questions and supplying information in response to said received audible information with said synthesized environment.
7. The computer apparatus of claim 1, further comprising:
means for providing audible information from said video environment in accordance with rules of human-to-human conversation in a lecture format; and
means for providing audible information from said synthesized environment in accordance with rules of human-to-computer conversation.
8. The computer apparatus of claim 1, further comprising:
means for initiating new topics, taking turns from said synthesized acting performance and giving turns to said synthesized acting performance with said video environment; and
means for taking turns from said video display, taking turns from said audio input interface, giving turns to said video display and giving turns to said audio input interface with said synthesized environment.
9. The computer apparatus of claim 1, further comprising means for administering an interview with said speech recognition interface.
10. The computer apparatus of claim 9, further comprising means for managing navigation through said interview with said synthesized environment.
11. The computer apparatus according to claim 1, wherein said captured video is one of recorded video of a non-synthesized human video actor and live transfer video of a non-synthesized human video actor.
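The turn-taking arrangement recited in claim 8 can be read as a small set of permitted hand-offs. The sketch below is one possible interpretation rather than the patented implementation: it assumes the video actor exchanges turns only with the synthesized actor, and that the synthesized actor exchanges turns with both the video actor and the user's audio input. `Party`, `ALLOWED_HANDOFFS`, and `TurnManager` are hypothetical names.

```python
from enum import Enum


class Party(Enum):
    VIDEO = "video actor"
    SYNTHESIZED = "synthesized actor"
    USER = "user (audio input)"


# Assumed reading of claim 8: (current holder, next holder) pairs that are permitted.
ALLOWED_HANDOFFS = {
    (Party.VIDEO, Party.SYNTHESIZED),
    (Party.SYNTHESIZED, Party.VIDEO),
    (Party.SYNTHESIZED, Party.USER),
    (Party.USER, Party.SYNTHESIZED),
}


class TurnManager:
    """Tracks which party currently holds the conversational turn."""

    def __init__(self, first_speaker: Party = Party.VIDEO):
        # Assumption: the video actor initiates a new topic by default.
        self.holder = first_speaker
        self.history = [first_speaker]

    def give_turn(self, to: Party) -> None:
        """Pass the turn to another party, rejecting hand-offs outside the assumed set."""
        if (self.holder, to) not in ALLOWED_HANDOFFS:
            raise ValueError(f"{self.holder.value} cannot hand the turn to {to.value}")
        self.holder = to
        self.history.append(to)

    def microphone_open(self) -> bool:
        """The audio input interface listens only while the user holds the turn."""
        return self.holder is Party.USER
```

Under these assumptions the user never receives a turn directly from the video actor, so every user utterance is preceded by the synthesized actor, which is the channel that demonstrates how to speak.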
Specification