Integrated learning for interactive synthetic characters
First Claim
1. A method for training a mechanism to perform desired actions comprising, in combination, storing state data specifying the attributes of each of a plurality of different environmental states in which said mechanism can exist, storing action data specifying the attributes of each of a plurality of different actions that said mechanism may perform, storing tuple data comprising a plurality of tuples each of which specifies a given one of said environmental states, a given one of said actions, and at least one utility value indicating the likelihood of achieving a desired outcome as a result of performing said given action when said given state exists, storing current state condition data defining the attributes of the current environmental state of said mechanism;
- accepting input stimulus data and modifying said current state condition data in response to said input stimulus data, comparing said current state condition data with said tuple data to identify matching tuples which specify an environmental state corresponding to said current state condition, selecting from said matching tuples the particular tuple having the highest utility value, performing the action specified in said particular tuple if said highest utility value is greater than a specified threshold, altering said utility value in said particular tuple to record the performance of said action, and modifying said current state condition to reflect the performance of said action.
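The claim describes a match-select-act-update loop over stored (state, action, utility) tuples. A minimal sketch of one pass of that loop, in Python; the names (`Rule`, `step`, `lr`, `reward`) and the moving-average update rule are assumptions for illustration, since the claim only requires that the utility value be altered, not how:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    state: frozenset   # attribute set that must hold for this tuple to match
    action: str        # action the mechanism may perform
    utility: float     # likelihood of a desired outcome for (state, action)

def step(rules, current_state, threshold=0.4, reward=0.0, lr=0.2):
    """One pass of the claimed method: match, select, act, update."""
    # compare current state condition data with tuple data to find matching tuples
    matching = [r for r in rules if r.state <= current_state]
    if not matching:
        return None
    # select the matching tuple with the highest utility value
    best = max(matching, key=lambda r: r.utility)
    # perform the action only if that utility exceeds the specified threshold
    if best.utility <= threshold:
        return None
    # alter the utility value to record the performance of the action
    # (an exponential move toward the observed reward -- an assumed rule)
    best.utility += lr * (reward - best.utility)
    return best.action
```

For example, a state containing the attribute `hears_sit` would match both a `sit` tuple and a `bark` tuple; the one with the higher utility is selected and, after performing, has its utility nudged toward the reward received.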
Abstract
A practical approach to real-time learning for synthetic characters is presented, grounded in the techniques of reinforcement learning and informed by insights from animal training. The approach simplifies the learning task for characters by (a) enabling them to take advantage of predictable regularities in their world, (b) allowing them to make maximal use of any supervisory signals, and (c) making them easy for humans to train. An autonomous animated dog is described that can be trained with “clicker training,” a technique used to train real dogs.
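In clicker training, a sharp marker signal (the “click”) identifies which recent action earned the reward, which eases credit assignment for the learner. A minimal sketch of that idea, assuming a hypothetical `ClickerTrainee` class and a simple most-recent-action crediting rule (a fuller model would spread decayed credit over earlier actions):

```python
from collections import deque

class ClickerTrainee:
    """Credit-assignment sketch: a click reinforces the (state, action)
    pair that was active just before the click was delivered."""
    def __init__(self, lr=0.25, memory=5):
        self.lr = lr
        self.utilities = {}                  # (state, action) -> utility estimate
        self.recent = deque(maxlen=memory)   # short history of (state, action) pairs

    def act(self, state, action):
        self.recent.append((state, action))

    def click(self):
        # credit only the most recent pair -- a simplifying assumption
        if not self.recent:
            return
        key = self.recent[-1]
        u = self.utilities.get(key, 0.0)
        self.utilities[key] = u + self.lr * (1.0 - u)
```

Repeated clicks after the same behavior drive that pair's utility toward 1, mirroring how a real dog's response to a cue becomes more reliable with consistent marking and reward.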
4 Claims
Claim 1 (set forth above under “First Claim”), with dependent claims 2, 3 and 4.
Specification