Streamlined architecture for embodied conversational characters with reduced message traffic
Abstract
An architecture including a speech manager that identifies input occurrence, input content, and the location of the input (speech, for example), an action scheduler, a dialog manager, and an animation system provides reduced message traffic and streamlined processing in support of animated characters (conversational characters, for example). Speech recognition results are provided, along with location information, to the action scheduler, which determines appropriate expressions for interactive behavior (looking, turn taking, etc.). Speech (or input) content is provided to the dialog manager, which determines a substantive response (including speech or other content-related responses); any facial expressions or gestures that are related to content, but do not themselves contain content, are identified and placed in a communication to the animation system.
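The dual message path the abstract describes can be sketched as a minimal simulation. All class names, message fields, and command strings below are illustrative assumptions, not taken from the patent: a lightweight "speech on" notification goes straight to the action scheduler for reactive behavior, while recognized content goes to the dialog manager for a deliberative response.

```python
# Hypothetical sketch of the dual-path message flow: reactive path via the
# action scheduler, deliberative path via the dialog manager.

class ActionScheduler:
    def __init__(self):
        self.commands = []          # commands queued for the animation system

    def on_speech_on(self, speech_on):
        # Reactive path: respond to the mere occurrence of speech
        # (e.g., have the character look toward the speaker).
        if speech_on["type"] == "speech_on":
            self.commands.append("LOOK_AT_USER")

    def on_deliberative(self, response):
        # Deliberative path: queue the content-bearing response.
        self.commands.append(response)

class DialogManager:
    def __init__(self, scheduler):
        self.scheduler = scheduler

    def on_speech_record(self, record):
        # Determine a substantive (content) response from recognized text.
        reply = "SAY:Hello!" if "hello" in record["text"].lower() else "SAY:Pardon?"
        self.scheduler.on_deliberative(reply)

class SpeechManager:
    def __init__(self, scheduler, dialog):
        self.scheduler = scheduler
        self.dialog = dialog

    def receive_speech(self, text):
        # 1. Immediately notify the scheduler that speech is occurring.
        self.scheduler.on_speech_on({"type": "speech_on"})
        # 2. Convert speech to a content record for the dialog manager.
        self.dialog.on_speech_record({"text": text})

scheduler = ActionScheduler()
speech = SpeechManager(scheduler, DialogManager(scheduler))
speech.receive_speech("Hello there")
print(scheduler.commands)   # reactive command first, then the deliberative reply
```

Note how the reactive command is queued before the deliberative reply even though both originate from the same utterance; that ordering is the point of splitting the message paths.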
27 Claims
1. An apparatus for implementing an autonomous animated character, comprising:
an animation system configured to control said animated character based on commands;
an action scheduler configured to, receive inputs related to at least one of said animated character and a user of said animated character, and send commands based on said inputs to said animation system to control said animated character;
a vision mechanism configured to send a location of said user to said action scheduler as one part of said inputs;
a dialogue manager configured to, receive speech input records and determine speech, actions, and gesture responses to be performed by said animated character, and provide said speech, actions, and gesture responses to said action scheduler as a second part of said inputs; and
a speech manager configured to, receive speech inputs from said user, prepare and send a speech on message to said action scheduler indicating speech inputs are being received, and convert the received speech to a speech input record and send the speech input record to said dialogue manager.
2. The apparatus according to claim 1, wherein:

said speech on message is prepared immediately upon receipt of said speech inputs; and
said speech on message comprises a lightweight message only indicating that speech is being received.
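The lightweight speech on/off message of claims 2 and 9 might look like the following sketch. The field names and the JSON encoding are assumptions for illustration; the essential property from the claims is that the message signals only that speech has started or stopped and carries no recognized content.

```python
# Illustrative lightweight speech on/off notification: no text payload,
# only the event type and a timestamp (field names are assumptions).
import json
import time

def make_speech_event(on, timestamp=None):
    """Build a minimal speech on/off notification for the action scheduler."""
    return {
        "type": "speech_on" if on else "speech_off",
        "t": timestamp if timestamp is not None else time.time(),
        # Deliberately no text field: recognition results travel separately
        # to the dialogue manager as a speech input record.
    }

on_msg = make_speech_event(True, timestamp=0.0)
off_msg = make_speech_event(False, timestamp=1.5)
print(json.dumps(on_msg))   # {"type": "speech_on", "t": 0.0}
```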
3. The apparatus according to claim 2, wherein said speech manager further comprises a directional microphone for determining that the speech being received is directed toward said animated character.
4. The apparatus according to claim 1, wherein:
said vision mechanism updates a memory device with said user location, said memory device accessible by said action scheduler; and
said action scheduler utilizes the location stored in said memory device to determine an appropriate direction in 3D space of gestures to be commanded of said animated character.
5. The apparatus according to claim 1, wherein:
said speech manager prepares the speech input message sent to said action scheduler by performing a speech recognition, converting the recognized speech to text, and placing the recognized speech converted to text in the speech input message.
6. The apparatus according to claim 5, wherein:
said dialogue manager performs deliberative processing based on said speech input record to determine said speech, actions, and gesture responses to be performed by said animated character.
7. The apparatus according to claim 6, wherein said speech, actions, and gesture responses sent to said action scheduler are formatted in a text markup language.
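Claim 7 (and claim 24, which elaborates that text indicates speech while escape sequences indicate gestures and actions) describes a text markup language for responses. A toy parser is sketched below; the backslash-delimited escape syntax is purely an assumption for illustration, as the patent does not fix a concrete syntax here.

```python
# Toy parser splitting a marked-up response into speech and gesture tokens.
# The \gesture\ escape syntax is an illustrative assumption.
import re

def parse_response(markup):
    """Split a marked-up response into (kind, value) tokens."""
    tokens = []
    # The capturing group makes re.split keep the escape sequences.
    for part in re.split(r"(\\[A-Za-z_]+\\)", markup):
        part = part.strip()
        if not part:
            continue
        if part.startswith("\\") and part.endswith("\\"):
            tokens.append(("gesture", part.strip("\\")))
        else:
            tokens.append(("speech", part))
    return tokens

print(parse_response(r"\wave\ Hello there \nod\ how can I help?"))
```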
8. The apparatus according to claim 1, wherein said action scheduler further comprises a reactive processing mechanism configured to determine reactive responses to user speech identified by said speech on message.
9. The apparatus according to claim 1, wherein:
said speech manager is further configured to prepare and send a speech off message to said action scheduler indicating that said speech inputs have ceased; and
said action scheduler utilizes said speech off message to at least one of terminate a current gesture and determine a reactive response to the cessation of speech from said user.
10. The apparatus according to claim 8, wherein:
said action scheduler is further configured to perform scheduling between each of said reactive responses and said speech, actions, and gesture responses determined by said dialogue manager; and
present commands in said scheduled order or simultaneously to said animation system implementing each of said deliberative responses and said speech, actions, and gesture responses.
11. The apparatus according to claim 8, wherein said reactive responses are prepared in parallel with said deliberative responses prepared by said dialogue manager, and are scheduled and presented to said animation system in said commands in one of tandem and with a priority given to said reactive responses.
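The scheduling policy of claims 10 and 11, where reactive and deliberative responses are produced in parallel and drained to the animation system with priority given to reactive responses, can be sketched with a priority queue. The priority levels and command strings are illustrative assumptions:

```python
# Sketch of priority scheduling between reactive and deliberative responses.
import heapq
import itertools

REACTIVE, DELIBERATIVE = 0, 1     # lower number = higher priority
_counter = itertools.count()      # tie-breaker preserving arrival order

def schedule(queue, priority, command):
    heapq.heappush(queue, (priority, next(_counter), command))

def drain(queue):
    """Return commands in the order they would be sent to the animation system."""
    out = []
    while queue:
        _, _, cmd = heapq.heappop(queue)
        out.append(cmd)
    return out

q = []
schedule(q, DELIBERATIVE, "SAY:The answer is 42")
schedule(q, REACTIVE, "GLANCE_AT_USER")      # arrives later but plays first
print(drain(q))   # ['GLANCE_AT_USER', 'SAY:The answer is 42']
```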
12. The apparatus according to claim 8, wherein said action scheduler utilizes a rule based system for determining said reactive responses to be performed by said animated character.
13. The apparatus according to claim 12, wherein said reactive responses determined by said rule based system include gestures of all types, including any of introductory, recognition, and turn-taking gestures to be performed during conversation between said animated character and said user.
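The rule based system of claims 12 and 13 maps conversational events to interactional gestures. A minimal sketch follows; the specific events, gesture names, and rule table are illustrative assumptions, not rules from the patent:

```python
# Minimal rule-based reactive gesture selection (all rules illustrative).
RULES = {
    "user_appears": "WAVE",            # introductory gesture
    "speech_on": "RAISE_EYEBROWS",     # recognition / feedback gesture
    "speech_off": "TAKE_TURN",         # turn-taking gesture
}

def reactive_response(event, rules=RULES):
    """Look up the gesture a reactive rule fires for this event, if any."""
    return rules.get(event)

print(reactive_response("speech_off"))   # TAKE_TURN
```

A table like this keeps reactive behavior cheap to evaluate, which is what lets it run in parallel with the slower deliberative path.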
14. The apparatus according to claim 5, wherein:
processing of said speech manager and said vision system are embodied in at least one first computer program intended to run on a first network connected computer;
said animation system is embodied in a second computer program intended to run on said first network connected computer; and
each of said action scheduler and said dialog manager are embodied in at least one third computer program intended to run on a second network connected computer.
15. The apparatus according to claim 14, wherein:
said first and second network connected computers are Internet connected computers;
said speech manager and said vision system are Internet browser compatible applets;
said animation system is an Internet browser compatible applet; and
said second network connected computer is configured to host an Internet server configured to upload said speech manager, said vision system, and said animation system applets to Internet connected requesting computers, and execute said action scheduler and dialog manager computer program(s).
16. The apparatus according to claim 15, wherein:
one of said Internet connected computers is said first network connected computer, and said user inputs are received from devices connected to said first network connected computer and directed to said first computer program.
17. A method of controlling an animated character, comprising the steps of:
identifying occurrence of an input to said animated character;
preparing a lightweight record identifying said input occurrence;
transferring said lightweight record to an action scheduler;
preparing a reactive response for said animated character in response to the input occurrence identified in said lightweight record;
transferring said reactive response to an animation system that controls said animated character; and
playing said reactive response by said animation system.
18. The method according to claim 17, wherein said step of preparing a lightweight record comprises the step of:

preparing only an indication of a type of the input that has occurred in said lightweight record.
19. The method according to claim 18, wherein said type of input indication is an indication of at least one of a start and a stop of speech, motion, or other inputs received from input devices.
20. The method according to claim 17, wherein said lightweight record indicates one of a start and a stop of speech inputs directed at said animated character.
21. The method according to claim 17, further comprising the steps of:
preparing a content record of said input occurrence identifying the substance (content) of said input occurrence;
transferring said content record to a dialog manager;
preparing a detailed response based on said content record;
transferring said detailed response to said animation system; and
playing said detailed response.
22. The method according to claim 21, wherein said step of preparing a content record comprises the steps of:
recognizing speech directed toward said animated character, converting said recognized speech to text, and placing the recognized text in said content record.
23. The method according to claim 22, wherein said step of preparing a detailed response comprises the steps of:
performing deliberative processing based on said content record to determine appropriate speech, gesture, and action responses to said input occurrence; and
preparing said detailed response record identifying each of the speech, gestures, and actions determined appropriate by said deliberative processing.
24. The method according to claim 23, wherein said detailed response record comprises a markup text string where text indicates speech and escape sequences indicate any of gestures, actions, and environment commands to be played on said animation system.
25. The method according to claim 21, wherein:
said step of transferring said detailed response comprises the steps of, transferring said detailed response to said action scheduler, scheduling said detailed response along with reactive responses for play on said animation system, and transferring each of said detailed and reactive responses to said animation system according to said schedule.
26. A method comprising the steps of:
receiving an animated character request at a host computer from a remote computer;
uploading an animation system and a speech manager from said host computer to said remote computer;
receiving lightweight and content records from said speech manager on said remote computer;
preparing fast and detailed responses based on said lightweight and content records; and
uploading said fast and detailed responses to said animation system on said remote computer.
27. The method according to claim 26, wherein:

said remote computer is connected to said host computer via an Internet connection;
said animation system and said speech manager are each contained in an Internet browser compatible applet or other Internet transferable program;
said uploading of said speech manager and said animation system applets is performed by a server program on said host computer in response to an http request from an Internet compatible browser on said remote computer;
said lightweight and content records are received by said server program via Internet protocol communications sent by said speech manager applet; and
said uploading said fast and detailed responses is performed via Internet protocol communication between said server program and said animation system applet.
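The host-side method of claim 26 can be simulated end to end in a few functions. Everything below is a hypothetical sketch: the request string, record fields, and response commands are assumptions, and a real deployment would carry these exchanges over HTTP between the server program and the applets, as claim 27 describes.

```python
# Hypothetical simulation of the claim 26 host-side method.
def handle_character_request(request):
    # Steps 1-2: receive the animated character request and "upload"
    # (deliver) the client-side components to the remote computer.
    assert request == "GET /character"
    return {"animation_system": "<applet>", "speech_manager": "<applet>"}

def handle_records(lightweight, content):
    # Steps 3-5: prepare a fast response from the lightweight record and a
    # detailed response from the content record, then return both for
    # upload to the animation system on the remote computer.
    fast = "LOOK_AT_USER" if lightweight["type"] == "speech_on" else "IDLE"
    detailed = f"SAY:You said {content['text']!r}"
    return fast, detailed

upload = handle_character_request("GET /character")
fast, detailed = handle_records({"type": "speech_on"}, {"text": "hi"})
print(fast, "|", detailed)
```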
Specification