Neural network model for reaching a goal state
First Claim
1. A neural network model having an input line for receiving state information for a plurality of states, and an output generator for controlling the movement of an object along a path of selected states among said plurality of states, said neural network model comprising:
a satisfaction unit, comprising:
a satisfaction index;
means for detecting a first state, wherein said first state is a current state;
first determining means for determining that said current state is a non-goal state;
first modifying means, responsive to said first determining means, for modifying said satisfaction index to indicate a reduced level of satisfaction;
second determining means for determining that said current state is a goal state;
second modifying means, responsive to said second determining means, for modifying said satisfaction index to indicate an increased level of satisfaction;
at least three action units corresponding to at least three directions of movement, each of said action units comprising:
means for increasing a randomness factor if said satisfaction index indicates a low level of satisfaction;
means for decreasing said randomness factor if said satisfaction index indicates a high level of satisfaction;
means for randomly selecting by said randomness factor a temporary weight from a temporary weight range;
means for adding a permanent weight to said temporary weight to achieve an effective weight; and
sending means for sending an indication to move said object in the direction of movement that corresponds to said action unit to said output generator if said effective weight exceeds a predetermined value.
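The action-unit logic recited in claim 1 can be sketched in code. This is a minimal illustrative sketch, not the patent's implementation: the class name, the specific thresholds, the step sizes, and the temporary-weight range are all assumptions made for the example.

```python
import random

class ActionUnit:
    """Illustrative sketch of one action unit. Each unit corresponds to
    one direction of movement and signals a move when its effective
    weight exceeds a predetermined value (the threshold)."""

    def __init__(self, direction, permanent_weight=0.0, threshold=1.0):
        self.direction = direction
        self.permanent_weight = permanent_weight  # learned weight
        self.threshold = threshold                # "predetermined value"
        self.randomness_factor = 0.5              # scales exploration

    def update_randomness(self, satisfaction_index, low=0.3, high=0.7):
        # Low satisfaction increases the randomness factor (more
        # exploration); high satisfaction decreases it (less).
        if satisfaction_index < low:
            self.randomness_factor = min(1.0, self.randomness_factor + 0.1)
        elif satisfaction_index > high:
            self.randomness_factor = max(0.0, self.randomness_factor - 0.1)

    def propose_move(self):
        # Randomly select a temporary weight from a range scaled by the
        # randomness factor, add the permanent weight to obtain the
        # effective weight, and signal a move if it exceeds the threshold.
        temporary_weight = random.uniform(0.0, self.randomness_factor)
        effective_weight = self.permanent_weight + temporary_weight
        if effective_weight > self.threshold:
            return self.direction
        return None
```

In this sketch, a unit whose learned permanent weight already exceeds the threshold always fires, while a unit with zero permanent weight can fire only through the random temporary weight, i.e., through exploration.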
Abstract
An object, such as a robot, starts at an initial state in a finite state space and moves under the control of the invention's unsupervised neural network model. The network instructs the object to move in one of several directions from the initial state. Upon reaching another state, the model again instructs the object to move in one of several directions. These instructions continue until either: a) the object completes a cycle by revisiting a state it has already occupied during the cycle, or b) the object completes a cycle by reaching the goal state. If the object revisits a state, the neural network model ends the cycle and immediately begins a new cycle from the present location. When the object reaches the goal state, the neural network model learns that this path is productive and receives delayed reinforcement in the form of a "reward". Upon reaching each state, the neural network model calculates a level of satisfaction with its progress toward the goal state. If the level of satisfaction is low, the model is more likely to override what it has learned thus far and deviate from a path known to lead to the goal state in order to experiment with new and possibly better paths. If the level of satisfaction is high, the model is much less likely to experiment with new paths. The object is guaranteed to eventually find the best path to the goal state from any starting location, provided the level of satisfaction does not exceed the threshold at which learning ceases.
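The cycle structure described in the abstract can be sketched as a small loop. This is a hypothetical helper for illustration only: the function name `run_cycle`, the `step_fn` callback, and the `max_steps` guard are assumptions, not part of the patent.

```python
def run_cycle(start, goal, step_fn, max_steps=1000):
    """Run one cycle: move until the object revisits a state (cycle ends
    with no reward) or reaches the goal (cycle ends with delayed reward
    for the whole path). step_fn(state) returns the next state chosen by
    the network's action units."""
    path = [start]
    visited = {start}
    state = start
    for _ in range(max_steps):
        state = step_fn(state)
        if state == goal:
            path.append(state)
            return path, True   # success: reward the path taken
        if state in visited:
            return path, False  # revisit: end cycle, no reward
        visited.add(state)
        path.append(state)
    return path, False
```

On a successful cycle the returned path is the candidate for reinforcement (increasing the permanent weights of the action units used along it); a failed cycle simply restarts from the present location.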
Specification