Learning method for a neural network

US 5,748,848 A
Filed: 08/19/1996
Issued: 05/05/1998
Est. Priority Date: 08/21/1995
Status: Expired due to Term

Abstract

In a learning method for training a recurrent neural network having a number of inputs and a number of outputs with at least one output being connected via a return line to an input, the return line is separated during training of the neural network, thereby freeing the input connected to the return line for use as an additional input during training, together with the other inputs. The additional input values, which must be estimated or predicted for supply to the thus-produced additional training inputs, are generated by treating each additional input value to be generated as a missing value in the time series of input quantities. Error distribution densities for the additional input values are calculated on the basis of the known values from the time series and their known or predetermined error distribution density, and samples are taken from this error distribution density according to the Monte Carlo method. These each lead to an estimated or predicted value whose average is introduced for the additional input value to be predicted. The method can be employed for the operation as well as for the training of the neural network, and is suitable for use in all known fields of utilization of neural networks.
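
The sampling-and-averaging idea in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: the one-step model `f`, the Gaussian noise with standard deviation `sigma`, and all function and variable names are assumptions made for illustration. The missing input is drawn repeatedly from an assumed density centred on the model output for its known neighbor, each draw is propagated through the model, and the resulting predictions are averaged.

```python
import random
from statistics import fmean


def f(prev: float) -> float:
    """Hypothetical nonlinear one-step model standing in for the network."""
    return prev ** 2


def predict_with_missing_input(y_known, sigma, n_samples, rng):
    """Estimate y_t when y_(t-1) is missing and y_(t-2) = y_known is measured."""
    predictions = []
    for _ in range(n_samples):
        # One Monte Carlo sample of the missing value, drawn from an assumed
        # Gaussian density centred on the model output for the known neighbor.
        y_missing = rng.gauss(f(y_known), sigma)
        # Each sample leads to one predicted value.
        predictions.append(f(y_missing))
    # The average of the predicted values is used as the estimate.
    return fmean(predictions)


rng = random.Random(0)
mc_estimate = predict_with_missing_input(1.0, 0.5, 20000, rng)

# Naive alternative: plug the mean of the missing value straight into f.
plug_in = f(f(1.0))

# For this f the exact expectation is mu**2 + sigma**2 = 1.25, whereas the
# plug-in value is 1.0, so the two estimates differ for a nonlinear model.
print(mc_estimate, plug_in)
```

With a nonlinear `f`, averaging over samples and substituting a single best guess give different answers, which is why the sketch draws several samples rather than one.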

13 Claims

1. A learning method for training a recurrent neural network having a plurality of inputs and a plurality of outputs and at least one return line connecting an output to an input, comprising the steps of:
- a) separating said at least one return line during training of the neural network and using the input connected to said return line as a training input together with the other inputs;
  
  b) in a computer, interpreting input quantities supplied to the inputs of said neural network for training as a time series of a set of values of a variable input quantity representing respective values of the input quantity at discrete points in time;
  
  c) in said computer identifying a statistical noise distribution of an uncorrelated noise of finite variance that has a chronological average of zero and is superimposed on the measured values;
  
  d) in said computer generating respective input values for any additional training inputs by, for each input value for each additional training input, treating the input value as a missing value in said time series, calculating a statistical missing value noise distribution according to said known noise distribution from at least one of said input quantity values neighboring the missing value in the time series and calculating said value of the missing value by replacing the missing value with at least two Monte Carlo samples of the missing value obtained according to the missing value noise distribution; and
  
  e) training said neural network using said time series and a behavior of a technical system represented by the neural network.
- - 2. A method according to claim 1, wherein step (d) comprises obtaining a plurality of Monte Carlo samples for the missing value and determining the value thereof by calculating an arithmetic average from all predicted values determined over said samples.
  - 3. A method according to claim 1, wherein a value for a first of two missing and immediately neighboring input quantity values of the time series is generated first and a value for a second of said two missing and immediately neighboring input quantity values is generated thereafter using the value generated first.
  - 4. A method according to claim 1 comprising repeating steps (a) through (b) multiple times.
  - 5. (Amended) A method according to claim 1, wherein step e) includes in a back-propagation learning step, determining a learning step width for the input quantities of the neural network normed to one by dividing the plurality of Monte Carlo samples by 0.1.
  - 6. A method according to claim 1, wherein step (b) comprises using a Gaussian distribution as said statistical noise distribution.
  - 7. A method according to claim 1 wherein step (a) comprises forming said time series with the form:

    $$Y_t = f(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N}) + \varepsilon_t$$

    wherein $\varepsilon_t$ is said statistical noise distribution, $Y$ are values of the time series, $Y_t$ is said missing value for which said value is to be generated by the neural network, the function $f$ is internally available to the neural network, and wherein the statistical error distribution density is determined in step (b) as

    $$P_{\varepsilon}\bigl(Y_t - f(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N})\bigr) = P(Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N})$$

    from which Monte Carlo samples $Y_{t-k}^{1}, \ldots, Y_{t-k}^{S}$ are taken and said value of the missing value to be generated by the neural network with said samples is calculated;

    ##EQU9## wherein $Y_{t-k}$ is the missing value in the time series, $k \le N$, $m$ represents all known values of the time series, and $S$ is the number of samples.
  - 8. A method according to claim 7, wherein said function f is modeled by the neural network.
  - 9. A method according to claim 7, wherein the function f is stored in a memory accessible to the neural network.
  - 10. A method according to claim 7 comprising the additional step of training said neural network using said time series and a behavior of a technical system represented by the neural network, including training the neural network with at least one generated value according to a learning function:

    ##EQU10## wherein $w$ represents neuron weighting, $L$ is a logarithmic probability function, $\eta$ is a learning factor, and wherein ##EQU11## with $NN_w$ values of the function from the neural network, and employing values of the time series for $Y_l^s$ and, when a value is not present, $P^{M}(Y^{l} \mid Y^{m})$ Monte Carlo samples are obtained from the probability distribution density.
  - 11. A method according to claim 7 comprising the additional step of training said neural network using said time series and a behavior of a technical system represented by the neural network, including training the neural network with at least one generated value according to a learning function:

    ##EQU12## wherein $w$ represents neuron weighting, $L$ is a logarithmic probability function, $\eta$ is a learning factor, and wherein ##EQU13## wherein $NN_w$ are values of the function from the neural network, $Y_l^s$ are values of the time series and wherein $P^{M}(Y^{l} \mid Y^{m})$ Monte Carlo samples are obtained from the probability distribution density, with $m$ representing all known measured values of the time series.
  - 12. A method according to claim 11 comprising training the neural network with a learning rule:

    ##EQU14## wherein $w$ represents neuron weighting, $L$ is a logarithmic probability function, $\eta$ is a learning factor, and wherein ##EQU15##
  - 13. A method according to claim 1, wherein the statistical noise distribution of the input quantity values is unknown and the input quantity values are superimposed with further noise having a statistical noise distribution which is known, wherein step (a) comprises forming said time series with a form:

    $$z_t = Y_t + \delta_t, \qquad Y_t = f(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N}) + \varepsilon_t$$

    wherein $\varepsilon_t$ is the unknown statistical noise distribution, $\delta_t$ is the known statistical noise distribution, $Y$ are the values of the time series, $Y_t$ is the value to be generated by the neural network, and wherein the statistical error distribution density is determined as

    $$P_{\varepsilon}\bigl(Y_t - f(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N})\bigr) = P(Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Y_{t-N})$$

    and the overall probability density over the time series is determined with

    ##EQU16## and wherein step (c) comprises generating said value for said missing value by the neural network derived from at least one edited value as and obtaining Monte Carlo samples for

    $$P(Y_{t-1}, \ldots, Y_{t-N} \mid Z).$$
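
The sequential treatment of two adjacent missing values (claim 3), combined with Gaussian noise (claim 6) and sample averaging (claim 2), can be sketched as follows. This is an illustrative reading, not the claimed method itself: the linear model `f_linear`, its coefficient `a`, the noise level `sigma`, and all names are hypothetical. The older missing value is sampled first, and the newer one is then sampled using the value just generated.

```python
import random
from statistics import fmean


def f_linear(prev: float, a: float = 0.9) -> float:
    """Hypothetical linear one-step model (stand-in for the network)."""
    return a * prev


def predict_after_two_gaps(y_last_known, sigma, n_samples, rng):
    """Estimate y_t when y_(t-2) and y_(t-1) are both missing."""
    predictions = []
    for _ in range(n_samples):
        # Generate the first (older) missing value from its known neighbor.
        first = rng.gauss(f_linear(y_last_known), sigma)
        # Generate the second missing value using the value generated first.
        second = rng.gauss(f_linear(first), sigma)
        predictions.append(f_linear(second))
    # Arithmetic average over all sample-based predictions.
    return fmean(predictions)


rng = random.Random(1)
two_gap_estimate = predict_after_two_gaps(2.0, 0.2, 4000, rng)

# For this linear model the exact expectation is a**3 * y = 0.729 * 2.0 = 1.458.
print(two_gap_estimate)
```

A linear model is used here only because its expectation is known in closed form, which makes the sampled average easy to check.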

Current Assignee: Siemens AG
Original Assignee: Siemens AG
Inventors: Tresp, Volker
Primary Examiner(s): Hafiz, Tariq R.
Application Number: US08/699,329
Time in Patent Office: 624 Days
Field of Search: 395/23, 395/11, 395/24, 395/22, 395/20-25, 395/27, 382/155-161
US Class Current: 706/25
CPC Class Codes: G05B 13/027 (using neural networks only); G06N 3/08 (Learning methods)
