Neural network element with reinforcement/attenuation learning

US 7,664,714 B2
Filed: 10/21/2005
Issued: 02/16/2010
Est. Priority Date: 10/21/2004
Status: Expired due to Fees

Abstract

A neural network element that outputs an output signal in response to a plurality of input signals comprises a history memory for accumulating and storing the input signals in temporal order as history values. It also includes an output module that outputs the output signal when an internal state exceeds a predetermined threshold value, the internal state being based on a sum of the products of the input signals and their corresponding coupling coefficients. The history values depend on changes in the internal state. The neural network element is configured to subtract a predetermined value from the internal state immediately after the output module fires, and to perform learning that reinforces or attenuates the coupling coefficients according to the history values after the output module fires.
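
The element described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the patented implementation: the class name `NeuralElement`, the parameter values, the accumulating internal state, and the specific learning rule (strengthen inputs that were active in the recent history, weaken the rest) are all hypothetical choices. The abstract also says the history values depend on changes in the internal state; for brevity the sketch simply stores the raw inputs.

```python
from collections import deque


class NeuralElement:
    """Illustrative threshold element with an input history (all names hypothetical)."""

    def __init__(self, n_inputs, threshold=1.0, reset=1.0, rate=0.1, history_len=5):
        self.weights = [0.5] * n_inputs            # coupling coefficients
        self.threshold = threshold                 # predetermined firing threshold
        self.reset = reset                         # value subtracted right after firing
        self.rate = rate                           # learning rate (assumed)
        self.state = 0.0                           # internal state
        self.history = deque(maxlen=history_len)   # recent inputs, oldest first

    def step(self, inputs):
        """Feed one set of input signals; return 1 if the element fires, else 0."""
        self.history.append(list(inputs))
        # Assumption: the weighted sum is accumulated into the internal state.
        self.state += sum(w * x for w, x in zip(self.weights, inputs))
        if self.state <= self.threshold:
            return 0
        self.state -= self.reset                   # subtract immediately after firing
        self._learn()
        return 1

    def _learn(self):
        """Reinforce couplings of recently active inputs, attenuate the others."""
        n = len(self.history)
        for i in range(len(self.weights)):
            activity = sum(h[i] for h in self.history) / n
            # Mean activity above 0.5 strengthens the coupling, below 0.5 weakens it.
            self.weights[i] += self.rate * (activity - 0.5)
            self.weights[i] = min(1.0, max(0.0, self.weights[i]))
```

Driving the element with the same input three times in a row, for example `[1, 0]`, lets the internal state build up until it crosses the threshold, at which point the first coupling coefficient is reinforced and the second is attenuated.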

3 Claims

1. An action learning control system capable of learning an input-output relationship according to own action of the system, comprising:
- a sensor configured to obtain information from an external environment and to output the obtained information;
  
  a sensory evaluation module configured to receive information from the sensor, to receive an action policy, to determine whether a state of a controlled object is stable or not based on the received information, and to output a reinforcement signal according to the determined result;
  
  a sensor information state separating module for performing reinforcement learning, configured to receive information from the sensor, to receive the reinforcement signal from the sensory evaluation module, to receive the action policy, to give heavier weight to sensor information having higher sensory evaluation, to classify sensor information into a low-dimensioned state, and to output the state;
  
  an action learning module, configured to receive the state from the sensor information state separating module and to output a corresponding action control command, for learning a relationship between the state and the action control command;
  
  an attention controller configured to receive information from the sensor, to receive the reinforcement signal from the sensory evaluation module, to receive the action control command from the action learning module, and to send the action policy to the sensory evaluation module and to the sensor information state separating module;
  
  an action sequence storing and refining module configured to receive information from the sensor, to receive the reinforcement signal from the sensory evaluation module, to receive the action control command from the action learning module, to determine a refined action control command based on the received sensor information and based on the received action control command and based on stored temporal information, and to output the refined action control command; and
  
  an output module configured to receive the refined action control command from the action sequence storing and refining module and to output the refined action control command.

2. The action learning control system according to claim 1, wherein the sensory evaluation module is further configured to output a command to inhibit an action control command output from the system when the sensory evaluation module judges a state of the controlled object is unstable.
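
The signal routing recited in claims 1 and 2 can be traced with a small sketch. Everything below is a hypothetical illustration: the function names, the dictionary of callables, the `tilt` reading, and the toy behaviour of each module are assumptions made only to show which module receives which signal, and how the inhibition of claim 2 might gate the output.

```python
def control_step(reading, modules, policy):
    """One pass through the claimed dataflow; `modules` maps names to callables."""
    stable, reinforcement = modules["sensory_evaluation"](reading, policy)
    state = modules["state_separation"](reading, reinforcement, policy)
    command = modules["action_learning"](state)
    new_policy = modules["attention_controller"](reading, reinforcement, command)
    refined = modules["sequence_refiner"](reading, reinforcement, command)
    # Claim 2: inhibit the command when the controlled object is judged unstable.
    output = refined if stable else None
    return output, new_policy


def build_toy_modules():
    """Toy stand-ins for the claimed modules (behaviour is invented for the demo)."""
    previous = []  # stored temporal information: earlier commands

    def evaluate(reading, policy):
        stable = abs(reading["tilt"]) < policy["tilt_limit"]
        return stable, (1.0 if stable else -1.0)

    def separate(reading, reinforcement, policy):
        # Collapse the reading into a low-dimensional, discrete state.
        return "left" if reading["tilt"] < 0 else "right"

    def learn_action(state):
        return {"left": 1.0, "right": -1.0}[state]

    def attend(reading, reinforcement, command):
        return {"tilt_limit": 0.5}  # a fixed policy, for simplicity

    def refine(reading, reinforcement, command):
        previous.append(command)
        recent = previous[-3:]
        return sum(recent) / len(recent)  # smooth over the stored sequence

    return {
        "sensory_evaluation": evaluate,
        "state_separation": separate,
        "action_learning": learn_action,
        "attention_controller": attend,
        "sequence_refiner": refine,
    }
```

Calling `control_step({"tilt": 0.2}, build_toy_modules(), {"tilt_limit": 0.5})` passes a refined command through, while a reading with a tilt of 0.9 is judged unstable and the output is inhibited.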

3. A computer-implemented method of determining an action control command using reinforcement learning, comprising:
- obtaining information from an external environment using one or more sensors;
  
  determining whether a state of a controlled object is stable or not based on information from the sensors;
  
  outputting a reinforcement signal according to the stability determination;
  
  generating an action policy;
  
  adjusting the state separation and the reinforcement signal generation based on the action policy;
  
  performing reinforcement learning based on the reinforcement signal, comprising:
  
  giving heavier weight to sensor information having higher sensory evaluation; and
  
  classifying sensor information into a low-dimensioned state;
  
  learning a relationship between the classified state and a corresponding action control command based on the reinforcement signal;
  
  outputting a first action control command;
  
  storing and modifying an action sequence;
  
  determining a second action control command based on the obtained sensor information and based on the first action control command and based on stored temporal information; and
  
  outputting the second action control command.
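
The two sub-steps of the reinforcement learning in claim 3 (weighting sensor information by its sensory evaluation, then classifying it into a low-dimensioned state) could be approximated as follows. The claim does not name an algorithm; this sketch swaps in a nearest-prototype quantizer with an evaluation-scaled update, and the function names, the `rate` parameter, and the assumption that the evaluation lies in [0, 1] are all hypothetical.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - v) ** 2 for p, v in zip(prototypes[k], x)),
    )


def update(prototypes, x, evaluation, rate=0.5):
    """Move the winning prototype toward x, scaled by the evaluation in [0, 1].

    Returns the index of the winning prototype, which serves as the
    low-dimensional (discrete) state for the sensor vector x.
    """
    k = nearest(prototypes, x)
    step = rate * evaluation  # higher evaluation means heavier weight
    prototypes[k] = [p + step * (v - p) for p, v in zip(prototypes[k], x)]
    return k
```

A sensor vector with a high evaluation pulls its prototype strongly toward itself, while one with an evaluation of zero is still classified but leaves the prototypes unchanged.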

Current Assignee: Honda Motor Co., Ltd. (Honda Motor Company)
Original Assignee: Honda Motor Co., Ltd. (Honda Motor Company), Riken
Inventors: Miyakawa, Nobuaki; Matsumoto, Gen; Noyori, legal representative; Tsujino, Hiroshi
Primary Examiner(s): Vincent; David R
Assistant Examiner(s): Wong; Lut
Application Number: US11/255,895
Publication Number: US 20060184465A1
Time in Patent Office: 1,579 Days
Field of Search: 706/15
US Class Current: 706/15
CPC Class Codes:
G06N 3/049 Temporal neural networks, e...
G06N 3/08 Learning methods
