Stable adaptive control using critic designs

US 6,532,454 B1
Filed: 09/23/1999
Issued: 03/11/2003
Est. Priority Date: 09/24/1998
Status: Expired due to Fees


Abstract

Classical adaptive control proves total-system stability for control of linear plants, but only for plants meeting very restrictive assumptions. Approximate Dynamic Programming (ADP) has the potential, in principle, to ensure stability without such tight restrictions. It also offers nonlinear and neural extensions for optimal control, with empirically supported links to what is seen in the brain. However, the relevant ADP methods in use today—TD, HDP, DHP, GDHP—and the Galerkin-based versions of these all have serious limitations when used here as parallel distributed real-time learning systems. Either they do not possess quadratic unconditional stability or they lead to incorrect results in the stochastic case. (ADAC or Q-learning designs do not help.) The present invention describes new ADP designs which overcome these limitations. It also addresses the Generalized Moving Target problem, a common family of static optimization problems, and describes a way to stabilize large-scale economic equilibrium models, such as the old long-term energy model of DOE.
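For orientation, the basic HDP critic update named in the abstract can be sketched on a toy problem. This is an illustration only and not the patented design: the noise-free scalar plant R(t+1) = A·R(t), the utility U = R², the one-weight critic Ĵ(R) = w·R², the discount factor GAMMA, and the learning rate LR are all assumptions, chosen so that the fixed point can be checked by hand.

```python
import random

# Toy setup (all values are illustrative assumptions, not from the patent).
A = 0.5        # deterministic plant: R(t+1) = A * R(t)
GAMMA = 0.9    # discount factor
LR = 0.5       # learning rate


def utility(r):
    """Cost per step U(R) = R^2."""
    return r * r


def critic(w, r):
    """One-weight critic J_hat(R) = w * R^2."""
    return w * r * r


def hdp_step(w, r):
    """One HDP update: move J_hat(R(t)) toward U(R(t)) + GAMMA * J_hat(R(t+1)).

    The target is held fixed when differentiating, so the gradient of
    J_hat(R(t)) with respect to w is just R(t)^2.
    """
    r_next = A * r
    target = utility(r) + GAMMA * critic(w, r_next)
    e = critic(w, r) - target
    return w - LR * e * r * r


def train(steps=2000, seed=0):
    """Run HDP updates on states drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w = hdp_step(w, rng.uniform(-1.0, 1.0))
    return w


w_learned = train()
w_exact = 1.0 / (1.0 - GAMMA * A * A)  # solves w = 1 + GAMMA * A^2 * w
print(w_learned, w_exact)
```

Because this plant has no noise, the sketch does not exhibit the stochastic-case difficulties the abstract describes; it only shows the update those difficulties apply to.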

5 Claims

1. In a computer program product including a computer storage medium, the improvement comprising a computer program code mechanism embedded in the computer storage medium for causing a microprocessor to implement a heuristic dynamic programming controller controlling an external system, the computer program code mechanism comprising:
- a first computer code device configured to train a deterministic forecasting model f′(R(t)) to predict the expectation value of R(t+1);
- a second computer code device configured to calculate an error function, e, based on sampled value of Ĵ(t+1); and
- a third computer code device configured to calculate ∇_t e′ by backpropagating through;

2. In the computer program product as claimed in claim 1, the improvement further comprising:
- a fourth code device configured to calculate;
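The first element of claim 1, training a deterministic model f′(R(t)) to predict the expectation value of R(t+1), can be illustrated with a least-mean-squares fit. The sketch below is a minimal example under stated assumptions: the scalar plant R(t+1) = A·R(t) + w with zero-mean Gaussian noise, the linear model f′(R) = b·R, and the constants A, SIGMA, and LR are all hypothetical and do not come from the patent text.

```python
import random

A = 0.5       # assumed true plant gain: R(t+1) = A * R(t) + w
SIGMA = 0.1   # assumed noise standard deviation
LR = 0.01     # assumed learning rate


def sample_next(r, rng):
    """Stochastic plant, standing in for the external system."""
    return A * r + rng.gauss(0.0, SIGMA)


def train_forecaster(steps=20000, seed=0):
    """Fit f'(R) = b * R by LMS on the squared prediction error.

    Minimizing squared error drives b * R toward E[R(t+1) | R(t)],
    which is A * R here because the noise has zero mean.
    """
    rng = random.Random(seed)
    b = 0.0
    for _ in range(steps):
        r = rng.uniform(-1.0, 1.0)
        r_next = sample_next(r, rng)
        b += LR * (r_next - b * r) * r
    return b


b_learned = train_forecaster()
print(b_learned)
```

With a constant learning rate the estimate keeps a small residual jitter around A, so it should be read as approximately, not exactly, equal to the true gain.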

3. In a computer program product including a computer storage medium, the improvement comprising a computer program code mechanism embedded in the computer storage medium for causing a microprocessor to implement a heuristic dynamic programming controller controlling an external system, the computer program code mechanism comprising:
- a first computer code device configured to train a deterministic forecasting model f′(R(t)) to predict the expectation value of R(t+1);
- a second computer code device configured to obtain a value R(t+1) at each time t+1;
- a third computer code device configured to backpropagate an error, e, through Ĵ(R(t+1)) back to weights of a critic network;
- a fourth computer code device configured to obtain a first set of derivatives, g1, representing a gradient of e with respect to the weights of the critic network;
- a fifth computer code device configured to backpropagate e′ through Ĵ(f′(R(t))), back to the weights of the critic network to form a second set of derivatives g2; and
- a sixth computer code device configured to adapt weights of f′ so as to minimize;

4. In the computer program product as claimed in claim 3, wherein the second computer code device further comprises a seventh computer code device configured to obtain R(t+1) by simulating a random vector w inserted into a stochastic model f(R(t),w).

5. In the computer program product as claimed in claim 3, wherein the second computer code device further comprises a seventh computer code device configured to obtain R(t+1) by actually observing R(t+1) and estimating w.
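The quantities named in claims 3 to 5 can be made concrete with a small numerical sketch. Everything specific here is an assumption made for illustration: the additive-noise model f(R, w) = A·R + w, the forecaster f′(R) = B·R, a critic that is linear in its weights, Ĵ(R) = W[0] + W[1]·R², and an error in which Ĵ at the next state enters as −GAMMA·Ĵ(·). Only the derivative path through the next-state critic term is computed, as the claim wording suggests. The quantity that the sixth code device minimizes is cut off in the text above, so the sketch stops at g1 and g2 and does not guess how they are combined.

```python
import random

A = 0.5      # assumed gain in the stochastic model f(R, w) = A * R + w
B = 0.5      # assumed weight of an already trained forecaster f'(R) = B * R
GAMMA = 0.9  # assumed discount factor


def f(r, w):
    """Stochastic model f(R(t), w) with additive noise (assumed form)."""
    return A * r + w


def f_prime(r):
    """Deterministic forecast of the expected next state."""
    return B * r


def features(r):
    """Critic is J_hat(R) = W[0] * 1 + W[1] * R^2, linear in its weights."""
    return [1.0, r * r]


def simulate_next(r, rng, sigma=0.1):
    """Claim 4 route: draw w and push it through the model."""
    w = rng.gauss(0.0, sigma)
    return f(r, w), w


def estimate_noise(r, r_next_observed):
    """Claim 5 route: with additive noise, w = R(t+1) - A * R(t)."""
    return r_next_observed - A * r


def grad_through_next(r_next):
    """Derivative of -GAMMA * J_hat(r_next) with respect to the weights."""
    return [-GAMMA * phi for phi in features(r_next)]


r_t = 1.0
r_next = f(r_t, 0.2)                    # fixed w for a reproducible example
g1 = grad_through_next(r_next)          # path through J_hat(R(t+1))
g2 = grad_through_next(f_prime(r_t))    # path through J_hat(f'(R(t)))
w_hat = estimate_noise(r_t, r_next)     # recovers the w used above
r_sim, w_sim = simulate_next(r_t, random.Random(0))
print(g1, g2, w_hat, r_sim)
```

In this example g1 and g2 differ in their second component because the sampled next state (0.7) differs from the forecast (0.5); for a nonlinear critic, the average of Ĵ over sampled next states does not generally equal Ĵ evaluated at the forecast.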

Current Assignee: IPU Power Management, LLC
Original Assignee: Paul J. Werbos
Inventors: Werbos, Paul J.
Primary Examiner(s): Follansbee, John A.
Assistant Examiner(s): Hirl, Joseph P.

Application Number: US09/404,197
Time in Patent Office: 1,265 Days
Field of Search: 706/14, 706/21, 706/23, 706/19, 706/25
US Class Current: 706/14
CPC Class Codes:
- G05B 13/027 using neural networks only
- G05B 13/048 using a predictor
- G06N 20/00 Machine learning
