Visualization and self-organization of multidimensional data through equalized orthogonal mapping
Abstract
The subject system provides reduced-dimension mapping of pattern data. Mapping is applied through a conventional single-hidden-layer feed-forward neural network with non-linear neurons. According to one aspect of the present invention, the system functions to equalize and orthogonalize the lower-dimensional output signals by reducing the covariance matrix of the output signals to the form of a diagonal matrix, or a constant times the identity matrix. The present invention allows for visualization of large bodies of complex multidimensional data in a relatively “topologically correct” low-dimension approximation, reduces the randomness associated with other methods of similar purpose, and at the same time keeps the mapping computationally efficient.
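For intuition, the equalize-and-orthogonalize training summarized in the abstract can be sketched numerically. The sketch below is an illustration, not the patent's exact procedure: the tanh non-linearities, the fixed variance target `c`, and the plain squared penalty standing in for the patent's error function (which uses tunable constants rkk and rk1k2) are all assumptions.

```python
# Illustrative sketch of equalized orthogonal mapping (EOM).
# Assumptions (not from the claims): tanh non-linearities, a fixed variance
# target c, and a plain squared penalty in place of the patent's error E.
import numpy as np

rng = np.random.default_rng(0)

def forward(X, W1, W2):
    H = np.tanh(X @ W1)        # hidden layer
    return np.tanh(H @ W2)     # K non-linear output nodes, K < input dimension

def eom_error(O, c=0.1):
    """Penalize deviation of the output covariance matrix from c times I."""
    V = np.cov(O.T, bias=True)             # K x K covariance of output signals
    diag = np.diag(V)
    off = V - np.diag(diag)
    return np.sum((diag - c) ** 2) + np.sum(off ** 2)

# toy data: 100 five-dimensional patterns mapped down to K = 2
X = rng.normal(size=(100, 5))
W1 = rng.normal(scale=0.5, size=(5, 8))
W2 = rng.normal(scale=0.5, size=(8, 2))

e0 = eom_error(forward(X, W1, W2))

# crude gradient descent on the output-layer weights via finite differences
eta, h = 0.05, 1e-5
for _ in range(200):
    base = eom_error(forward(X, W1, W2))
    g = np.zeros_like(W2)
    for idx in np.ndindex(*W2.shape):
        Wp = W2.copy(); Wp[idx] += h
        g[idx] = (eom_error(forward(X, W1, Wp)) - base) / h
    W2 -= eta * g

e1 = eom_error(forward(X, W1, W2))
print(e0, '->', e1)
```

In the patent's method the diagonal terms are equalized and the off-diagonal terms driven toward zero by explicit backpropagation; the finite-difference gradient here is used only for brevity.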
44 Citations
16 Claims
1. A system for organizing multi-dimensional pattern data into a reduced-dimension representation comprising:
a neural network comprised of a plurality of layers of nodes, the plurality of layers including:
an input layer comprised of a plurality of input nodes, a hidden layer, and an output layer comprised of a plurality of non-linear output nodes, wherein the number of non-linear output nodes is less than the number of input nodes;
receiving means for receiving multi-dimensional pattern data into the input layer of the neural network;
output means for generating an output signal for each of the output nodes of the output layer of the neural network corresponding to received multi-dimensional pattern data; and
training means for completing a training of the neural network, wherein the training means includes means for equalizing and orthogonalizing the output signals of the output nodes by reducing a covariance matrix of the output signals to the form of a diagonal matrix. (Dependent claims: 2-8.)
and the elements of the covariance matrix of the output signals of the output nodes are defined by

Vk1k2 = ⟨(Ok1p − ⟨Ok1⟩)(Ok2p − ⟨Ok2⟩)⟩

where p = 1, 2, . . . , P;
Ok1p is the output signal of the k1th node of the output layer for the pth input data pattern vector;
Ok2p is the output signal of the k2th node of the output layer for the pth input data pattern vector;
⟨Ok1⟩ is the average of Ok1p evaluated over the set of input data pattern vectors;
⟨Ok2⟩ is the average of Ok2p evaluated over the set of input data pattern vectors;
k1 = 1 to K;
k2 = 1 to K;
K is the number of dimensions in the reduced-dimension representation; and
⟨·⟩ denotes the mean evaluated over the set of input data pattern vectors for each indicated component.
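Stated in code, each covariance element is the mean, over the pattern set, of the product of the two nodes' deviations from their averages. A minimal NumPy check (array shapes and toy values assumed):

```python
import numpy as np

# O[p, k]: output signal of the k-th output node for the p-th input data
# pattern vector; P patterns, K reduced dimensions (toy values).
rng = np.random.default_rng(1)
P, K = 50, 3
O = rng.normal(size=(P, K))

mean = O.mean(axis=0)                 # <O_k>: average over the pattern set
V = (O - mean).T @ (O - mean) / P     # V[k1, k2] per the definition above

# matches NumPy's population covariance of the output signals
assert np.allclose(V, np.cov(O.T, bias=True))
print(V.shape)  # (3, 3)
```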
5. A system according to claim 4, wherein the weights wkj between the hidden layer and the output layer are iteratively updated according to the expression

Δwkj = −η(∂E/∂wkj) = η Σp δkp Ojp

where η is a constant of suitable value chosen to provide efficient convergence but to avoid oscillation;
Ojp is the output signal from the jth node in the layer preceding the output layer due to the pth input data pattern vector;
E is the error, comprising a contribution from the diagonal terms of the covariance matrix, where k1 = k2 = k; k = 1, . . . , K; and rkk is a positive constant which has the effect of increasing the speed of training, and a contribution from the off-diagonal terms, where k2 > k1; k1 = 1, . . . , K−1; k2 = k1+1, . . . , K; and rk1k2 is a positive constant which has the effect of increasing the speed of training; and
δkp = δkp,1 + δkp,2 + δkp,3, where δkp is a value proportional to the contribution to the error E by the outputs of the kth node of the output layer, for the pth input data pattern vector, and δkp,1, δkp,2, and δkp,3 are components of δkp.
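The iterative update of claim 5 is ordinary gradient descent, Δwkj = −η ∂E/∂wkj, with δkp collecting the per-pattern error contribution at the kth output node. The sketch below verifies the delta-rule form against a numerical gradient, using a simple squared error in place of the claim's covariance-based E; the targets T and the tanh output non-linearity are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
P, J, K = 20, 4, 2
Ojp = rng.normal(size=(P, J))    # outputs of the layer preceding the output layer
W = rng.normal(size=(J, K))      # weights w_kj, stored as W[j, k]
T = rng.normal(size=(P, K))      # targets for the illustrative error only

def E(W):
    O = np.tanh(Ojp @ W)
    return 0.5 * np.sum((O - T) ** 2)   # stand-in for the claim's error

# delta rule: delta_kp = -dE/dnet_kp, so  Δw_kj = eta * sum_p delta_kp * O_jp
O = np.tanh(Ojp @ W)
delta = -(O - T) * (1.0 - O ** 2)       # tanh'(net) = 1 - tanh(net)^2
eta = 0.01
dW = eta * Ojp.T @ delta

# compare with Δw = -eta * dE/dw from central differences
num = np.zeros_like(W)
h = 1e-5
for idx in np.ndindex(*W.shape):
    Wp, Wm = W.copy(), W.copy()
    Wp[idx] += h; Wm[idx] -= h
    num[idx] = -eta * (E(Wp) - E(Wm)) / (2 * h)
assert np.allclose(dW, num, atol=1e-6)
```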
6. A system according to claim 5, wherein

Δwkj = Δwkj,1 + Δwkj,2 + Δwkj,3

where Δwkj,1 is the contribution from the diagonal terms of the covariance matrix of the outputs, Δwkj,2 is the contribution from the off-diagonal terms in the kth row, Δwkj,3 is the contribution from the off-diagonal terms in the kth column, and Ojp is the output signal from the jth node in the layer preceding the output layer for the pth input data pattern vector.
7. A system according to claim 6, wherein δkp,1, δkp,2, and δkp,3 are given by expressions in which Okp is the output signal from the kth node in the output layer for the pth input data pattern vector, and ⟨Okp⟩ is the average of Okp evaluated over the set of input data pattern vectors.
8. A system according to claim 5, wherein backpropagation of error to the weights wji between the jth node in a layer of nodes and the ith node in its preceding layer follows the expression

Δwji = η Σp δjp Oip

where δjp is obtained by backpropagating the output-layer values δkp to the jth node.
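The backpropagation of claim 8 extends the same update one layer back: the hidden-layer term δjp is obtained from the output-layer δkp by the chain rule. A sketch with an assumed tanh network and a simple squared error standing in for the claim's covariance-based error, verified against a numerical gradient:

```python
import numpy as np

rng = np.random.default_rng(3)
P, I, J, K = 10, 5, 4, 2
X  = rng.normal(size=(P, I))      # O_ip: outputs of the layer preceding node j
W1 = rng.normal(size=(I, J))      # weights w_ji, stored as W1[i, j]
W2 = rng.normal(size=(J, K))      # weights w_kj
T  = rng.normal(size=(P, K))      # targets for the illustrative error only

def E(W1):
    H = np.tanh(X @ W1)
    O = np.tanh(H @ W2)
    return 0.5 * np.sum((O - T) ** 2)   # stand-in for the claim's error

H = np.tanh(X @ W1)
O = np.tanh(H @ W2)
delta_k = -(O - T) * (1 - O ** 2)           # output-layer deltas
delta_j = (1 - H ** 2) * (delta_k @ W2.T)   # backpropagated: f'(net_j) * sum_k delta_kp * w_kj
grad = -X.T @ delta_j                        # dE/dw_ji = -sum_p delta_jp * O_ip

# check against a central-difference gradient
num = np.zeros_like(W1)
h = 1e-5
for idx in np.ndindex(*W1.shape):
    Wp, Wm = W1.copy(), W1.copy()
    Wp[idx] += h; Wm[idx] -= h
    num[idx] = (E(Wp) - E(Wm)) / (2 * h)
assert np.allclose(grad, num, atol=1e-6)
```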
9. A method for effecting the organization of multi-dimensional pattern data into a reduced-dimension representation using a neural network having an input layer comprised of a plurality of input nodes, a hidden layer, and an output layer comprised of a plurality of non-linear output nodes, wherein the number of non-linear output nodes is less than the number of input nodes, said method comprising:
receiving multi-dimensional pattern data into the input layer of the neural network;
generating an output signal for each of the output nodes of the neural network corresponding to received multi-dimensional pattern data; and
training the neural network by equalizing and orthogonalizing the output signals of the output nodes by reducing a covariance matrix of the output signals to the form of a diagonal matrix. (Dependent claims: 10-16.)
and the elements of the covariance matrix of the output signals of the output nodes are defined by

Vk1k2 = ⟨(Ok1p − ⟨Ok1⟩)(Ok2p − ⟨Ok2⟩)⟩

where p = 1, 2, . . . , P;
Ok1p is the output signal of the k1th node of the output layer for the pth input data pattern vector;
Ok2p is the output signal of the k2th node of the output layer for the pth input data pattern vector;
⟨Ok1⟩ is the average of Ok1p evaluated over the set of input data pattern vectors;
⟨Ok2⟩ is the average of Ok2p evaluated over the set of input data pattern vectors;
k1 = 1 to K;
k2 = 1 to K;
K is the number of dimensions in the reduced-dimension representation; and
⟨·⟩ denotes the mean evaluated over the set of input data pattern vectors for each indicated component.
13. A method according to claim 12, wherein the weights wkj between the hidden layer and the output layer are iteratively updated according to the expression

Δwkj = −η(∂E/∂wkj) = η Σp δkp Ojp

where η is a constant of suitable value chosen to provide efficient convergence but to avoid oscillation;
Ojp is the output signal from the jth node in the layer preceding the output layer, due to the pth input data pattern vector;
E is the error, comprising a contribution from the diagonal terms of the covariance matrix, where k1 = k2 = k; k = 1, . . . , K; and rkk is a positive constant which has the effect of increasing the speed of training, and a contribution from the off-diagonal terms, where k2 > k1; k1 = 1, . . . , K−1; k2 = k1+1, . . . , K; and rk1k2 is a positive constant which has the effect of increasing the speed of training; and
δkp = δkp,1 + δkp,2 + δkp,3, where δkp is a value proportional to the contribution to the error E by the outputs of the kth node of the output layer, for the pth input data pattern vector, and δkp,1, δkp,2, and δkp,3 are components of δkp.
14. A method according to claim 13, wherein

Δwkj = Δwkj,1 + Δwkj,2 + Δwkj,3

where Δwkj,1 is the contribution from the diagonal terms, Δwkj,2 is the contribution from the off-diagonal terms in the kth row, and Δwkj,3 is the contribution from the off-diagonal terms in the kth column.
15. A method according to claim 14, wherein δkp,1, δkp,2, and δkp,3 are given by expressions in which Okp is the output signal from the kth node in the output layer for the pth input data pattern vector, and ⟨Okp⟩ is the average of Okp evaluated over the set of input data pattern vectors.
16. A method according to claim 13, wherein backpropagation of error to the weights wji between the jth node in a layer of nodes and the ith node in its preceding layer follows the expression

Δwji = η Σp δjp Oip

where δjp is obtained by backpropagating the output-layer values δkp to the jth node.
Specification