Developmental learning machine and method
Abstract
A machine and method capable of developing intelligent behavior from interaction with its environment directly using the machine's sensors and effectors. The method described is independent of the type of sensors and actuators, or the tasks to be executed, and therefore provides a general-purpose learner that learns while performing. It senses the world, recalls what is learned, judges what to do, and acts according to what it has learned. The method enables the machine to learn directly from sensory input streams while interacting with the environment, including human teachers. The presented approach enables the system to self-organize its internal representation, and uses a systematic way to automatically build a multi-level representation using the Markov random process model. Reward and punishment are combined with sensor-based teaching to develop intelligent behavior.
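The reward-and-punishment mechanism the abstract describes can be illustrated with a minimal tabular sketch. Everything below (the function name, the tabular form, the learning rate) is an illustrative assumption, not the patented mechanism: positive reward strengthens the association between a sensed state and the action taken, punishment (negative reward) weakens it.

```python
def reinforce(q_values, state, action, reward, lr=0.1):
    """Combine reward/punishment with learned state-action values.

    A minimal reward-driven update in the spirit of the abstract:
    the value for (state, action) moves a fraction `lr` of the way
    toward the received reward. Tabular form and learning rate are
    assumptions, not the patent's mechanism.
    """
    table = q_values.setdefault(state, {})
    old = table.get(action, 0.0)
    table[action] = old + lr * (reward - old)
    return q_values
```

A teacher rewarding a correct action twice would, under this sketch, raise its value from 0.0 to 0.1 and then to 0.19.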
113 Citations
30 Claims
1. A machine having developmental learning capability, said machine comprising:
one or more sensors for sensing an environment of the machine and generating one or more sensed signals in response thereto;
one or more effectors for acting on one or more objects in the environment;
a sensor based level builder having one or more level building elements, said sensor based level builder receiving as an input successive frames of said sensed signals one at a time, and generating action signals each having a relative probability, the sensor based level builder autonomously generating representations of tasks to be learned from said one or more sensed signals; and
a confidence accumulator for receiving said action signals and accumulating confidence of said action signals based on priority to determine most probable action signals, said confidence accumulator producing action control signals to control said one or more effectors in response to said determined most probable action signals, wherein said machine learns directly from continuous unsegmented sensory streams on-line while performing an operation and learns new tasks of unconstrained domains without a need for reprogramming, and wherein said learned new tasks include tasks that are not predetermined at the time of machine programming, said learned new tasks comprising at least two of autonomous navigation, speech recognition, and object manipulation.
(Dependent claims 2-14 not reproduced here.)
15. A method of automatically developing learning capability with a machine, said method comprising the steps of:
sensing an environment with one or more sensors;
inputting successive frames of signal information into one or more sensor based level builders;
deriving action signals with said one or more sensor based level builders while no action is imposed from the environment, each of said action signals having a relative probability;
autonomously generating representations of tasks to be learned from said sensed environment;
updating memory from continuous unsegmented sensory streams on-line and complying with an action when the action is imposed from the environment, wherein new tasks of unconstrained domains are learned, and new tasks include tasks that are not predetermined at the time of machine programming, said learned new tasks comprising at least two of autonomous navigation, speech recognition, and object manipulation;
inputting said action signals to a confidence accumulator;
determining a most probable action based on said probability of said action signals received by said confidence accumulator; and
producing action control signals to control one or more effectors in response to said determined most probable action.
16. The method as defined in claim 15 further comprising the steps of:
integrating a plurality of said state output signals from a plurality of sensor based level builders to produce one or more action signals that depend on multiple sensors; and
inputting said one or more action signals with corresponding confidence to said confidence accumulator.
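The integration step above can be sketched with a small combination function. The multiply-and-renormalize rule is an assumption, since the claim does not fix a combination formula; each input dictionary stands for the action probabilities produced by one sensor-based level builder.

```python
def integrate_action_signals(per_sensor_probs):
    """Combine action probabilities from several sensor-based level builders.

    Hypothetical integration rule (not specified in the claim): the
    probabilities each sensor assigns to an action are multiplied
    together, then the products are renormalized so the combined
    confidences sum to one.
    """
    combined = {}
    for probs in per_sensor_probs:
        for action, p in probs.items():
            combined[action] = combined.get(action, 1.0) * p
    total = sum(combined.values())
    return {a: v / total for a, v in combined.items()} if total else {}
```

For example, if vision strongly favors "left" and sonar is indifferent, the combined signal still favors "left" with the same relative strength.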
19. The method as defined in claim 15 further comprising the step of computing the most confident action using probability normalization and expectation.
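Claim 19's "probability normalization and expectation" can be illustrated as follows; the pairing of a confidence with a numeric effector command, and the use of the weighted mean as the expectation, are illustrative assumptions rather than the claim's exact formulation.

```python
def most_confident_action(candidates):
    """Probability normalization plus expectation over candidate actions.

    `candidates` is a list of (confidence, action_value) pairs, where
    action_value is a numeric effector command (e.g. a steering angle).
    Confidences are normalized into probabilities and the expected
    command is returned.
    """
    total = sum(c for c, _ in candidates)
    if total == 0:
        raise ValueError("no confident candidates")
    return sum((c / total) * v for c, v in candidates)
```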
20. The method as defined in claim 15 further comprising the step of automatically forming a hierarchy of y-clusters from continuous vector outputs.
21. The method as defined in claim 20 further comprising the step of forming a hierarchy of x-clusters from said y-clusters.
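The cluster hierarchy of claims 20-21 can be sketched with a simple online clustering pass; the nearest-centroid rule with a spawning radius is an assumed mechanism, not the patent's. Applying it first to the continuous output (y) vectors, and then to the input (x) vectors routed to each y-cluster, yields the claimed two-level hierarchy.

```python
def online_cluster(vectors, radius):
    """Incrementally form clusters from a continuous stream of vectors.

    A new cluster is spawned whenever a vector is farther than `radius`
    from every existing centroid; otherwise the nearest centroid absorbs
    it (running-mean update). Returns a list of (centroid, count) pairs.
    """
    clusters = []  # list of (centroid, member count)
    for v in vectors:
        best, best_d = None, None
        for i, (c, n) in enumerate(clusters):
            d = sum((a - b) ** 2 for a, b in zip(v, c)) ** 0.5
            if best_d is None or d < best_d:
                best, best_d = i, d
        if best is None or best_d > radius:
            clusters.append((list(v), 1))
        else:
            c, n = clusters[best]
            clusters[best] = ([(ci * n + vi) / (n + 1)
                               for ci, vi in zip(c, v)], n + 1)
    return clusters
```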
22. The method as defined in claim 15 further comprising the step of deleting elements that are not visited often as defined by a memory trace update.
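Claim 22's memory-trace update can be sketched as exponential decay with pruning; the decay, boost, and threshold parameters below are assumed values chosen for illustration.

```python
def update_memory_traces(traces, visited, decay=0.9, boost=1.0, threshold=0.05):
    """Memory-trace update with forgetting, in the spirit of claim 22.

    Every element's trace decays each frame; elements visited on this
    frame are boosted. Elements whose trace falls below `threshold`
    are deleted, so representations that are rarely revisited are
    forgotten.
    """
    for key in list(traces):
        traces[key] *= decay
        if key in visited:
            traces[key] += boost
        if traces[key] < threshold:
            del traces[key]
    return traces
```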
23. The method as defined in claim 15, wherein said machine is capable of learning and performing concurrently.
24. The method as defined in claim 15, wherein said machine forms new states recursively from previous states using any one or more of uniform and non-uniform resolution reduction and resolution retention.
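The uniform resolution reduction of claim 24 can be illustrated by block-averaging a state vector; the averaging scheme is an assumption (non-uniform reduction or resolution retention would simply vary the block size per region).

```python
def reduce_resolution(state, factor):
    """Form a coarser state by uniform resolution reduction.

    Averages each block of `factor` consecutive components of the
    state vector, producing a new state with fewer components.
    """
    return [sum(state[i:i + factor]) / len(state[i:i + factor])
            for i in range(0, len(state), factor)]
```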
25. The method as defined in claim 15, wherein said machine forms new states as context of certain temporal extent, thus enabling the machine to learn directly from continuous unsegmented sensory input streams.
26. The method as defined in claim 15, wherein said machine allows external action imposition and reward to be applied at any time, thus enabling learning and performance to occur in an arbitrary order and to occur concurrently.
27. The method as defined in claim 15, wherein said machine performs action-imposed learning in developmental learning.
28. The method as defined in claim 15, wherein said machine performs reinforcement learning in developmental learning.
29. The method as defined in claim 15, wherein said machine performs communicative learning in developmental learning.
30. The method as defined in claim 15, wherein said learned new tasks comprise at least all of autonomous navigation, speech recognition, and object manipulation.
Specification