Neural network and system

Abstract
Neural network systems (100) with learning and recall are applied to clustered multiple-featured data (122, 124, 126) and analog data.
8 Claims
1. An information processor, comprising:
(a) an encoder, said encoder encoding input information at least partially into a position type code format; and
(b) a clustering neural network providing an output therefrom, said clustering network having an input receiving said output of said encoder and said output of said clustering network, said clustering neural network clustering said input thereto with thresholded analog neurons in conjunction with a learning rule and a recall rule to preserve analog aspects.
5. A method of information processing, comprising the steps of:
(a) encoding said information into thermometer code format;
(b) learning said encoded information with a thresholded analog neural network;
(c) recalling at least some of said information from said neural network;
(d) recoding said recalled information;
(e) comparing said encoded information with said recoded recalled information with noncomparable encoded leading to further recall; and
(f) comparing said recoded recalled information with stored information.
8. A method of processing multiple-feature information, comprising the steps of:
(a) encoding information by concatenated thermometer codes of features of said information;
(b) learning without saturation said encoded information in a thresholded analog neural network; and
(c) inputting said encoded information into said network and recalling processed versions of said encoded information.
1 Specification
The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. F33615-87-C-1454 awarded by the U.S. Air Force.
This application is a continuation-in-part of copending U.S. patent application Ser. No. 318,038, filed Mar. 2, 1989 (Penz and Gately). Copending U.S. patent applications Ser. Nos. 032,887, filed Mar. 31, 1987 (Penz), 010,619, filed Feb. 4, 1987 (Frazier), and 057,887, filed June 1, 1987 (Provence) disclose related subject matter. All of these cross-referenced applications are assigned to the assignee of this application and are hereby incorporated by reference.
1. Field of the Invention
The present invention relates to computation, and, more particularly, to neural network information and signal processing.
2. Description of the Related Art
Avionics sensor systems typically are confronted with the problem of identifying the emitters of various electromagnetic signals (i.e., radar pulses) being received. The known sensor systems typically include an antenna, a preprocessor, a pulse buffer, a digital signal processor, and a classifier. The antenna receives signals from various emitters with various frequencies, pulse widths, pulse repetition rates, locations, and so forth. The antenna output is preprocessed to extract a set of features for each received pulse. The set of features is processed by standard signal processing methods to cluster the pulses from each perceived emitter, and the classifier compares the features of each perceived emitter with prestored data to determine its identity. All of this analysis would take place in a homing missile which would then decide which of the perceived emitters to attack.
However, the known sensor systems have the problem of extracting the emitter identity information from the mass of incoming pulses; the computing power required for standard digital signal processing in real time cannot be effectively put into every homing missile.
Attempts to understand the functioning of the human brain have led to various "neural network" models in which large numbers of neurons are interconnected. These models roughly presume each neuron exists in one of two states (quiescent and firing) with the state determined by the states of connected neurons (if enough connected neurons are firing, then the original neuron should be in the firing state); and the thrust of the models is to perform computations such as pattern recognition with the neural networks. The models may be simple feedforward layered structures with an input layer of neurons, one or more hidden layers of neurons, and an output layer of neurons. Other models have feedback among the neurons and correspondingly more involved behavior.
J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, 79 Proc. Natl. Acad. Sci. U.S.A. 2554 (1982) describes a neural network model with N neurons each of which has the value 0 or 1 (corresponding to quiescent and to firing), so the state of the network is then an N-component vector V=[V_{1},V_{2}, . . . ,V_{N}] of 0's and 1's which depends upon time. The neuron interconnections are described by a matrix T_{i,j} defining the influence of the j^{th} neuron on the i^{th} neuron. The state of the network evolves in time as follows: each neuron i has a fixed threshold θ_{i} and readjusts its state V_{i} randomly in time by setting V_{i} equal to 0 or 1 depending on whether

$$\sum_{j} T_{i,j} V_{j} - \theta_{i}$$

is negative or positive. All neurons have the same average rate of readjustment, and the readjustments define a dynamic flow in state space.
With the assumption that T_{i,j} is symmetric, the potential function

$$E = -\frac{1}{2}\sum_{i}\sum_{j} T_{i,j} V_{i} V_{j} + \sum_{i} \theta_{i} V_{i}$$

can be used to show that the flow of the network is to local minima of the potential function. Further, with a given set of uncorrelated N-component vectors U^{1},U^{2}, . . . ,U^{s}, a T_{i,j} can be defined by

$$T_{i,j} = \sum_{k=1}^{s} (2U^{k}_{i} - 1)(2U^{k}_{j} - 1), \qquad T_{i,i} = 0,$$

and with the thresholds equal to 0, these U^{k} are the fixed points of the flow and thus stable states of the network. This is a type of "outer product storage" of the vectors U^{1},U^{2}, . . . ,U^{s}. Such a network can act as a content-addressable memory as follows: the memories to be stored in the network are used to construct the U^{k} and hence T_{i,j}, so the stored memories are fixed points of the flow. Then a given partial memory is input by using it to define the initial state of the network, and the state will usually flow to the closest fixed point/stable state U^{k}, which is then the memory recalled upon input of the partial memory.
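The content-addressable recall described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the storage rule (mapping 0/1 components to -1/+1 and zeroing the diagonal) is taken from the cited 1982 Hopfield paper, the thresholds are 0, a zero drive is resolved to the quiescent state, and the function names `store` and `recall` and the two 16-component memories are hypothetical.

```python
import random


def store(memories):
    """Outer-product storage: T[i][j] = sum_k (2*U_i - 1) * (2*U_j - 1).

    The 0/1 -> -1/+1 mapping and the zero diagonal are assumptions taken
    from the cited Hopfield paper.
    """
    n = len(memories[0])
    T = [[0] * n for _ in range(n)]
    for u in memories:
        s = [2 * x - 1 for x in u]
        for i in range(n):
            for j in range(n):
                if i != j:
                    T[i][j] += s[i] * s[j]
    return T


def recall(T, state, sweeps=5, seed=0):
    """Asynchronous recall: neurons readjust one at a time in random order."""
    rng = random.Random(seed)
    v = list(state)
    n = len(v)
    for _ in range(sweeps):
        for i in rng.sample(range(n), n):
            drive = sum(T[i][j] * v[j] for j in range(n))  # thresholds are 0
            v[i] = 1 if drive > 0 else 0  # a zero drive is treated as quiescent
    return v


# Two hypothetical memories whose -1/+1 versions are orthogonal.
u1 = [1] * 8 + [0] * 8
u2 = ([1] * 4 + [0] * 4) * 2
T = store([u1, u2])

partial = list(u1)
partial[7] = 0  # corrupt one component of the first memory
restored = recall(T, partial)
```

Starting from the corrupted copy, the state flows back to the stored memory `u1`, while the stored memories themselves are left unchanged by the recall.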
Further analysis and modified network models appear in, for example, J. Hopfield et al, Computing with Neural Circuits: A Model, 233 Science 625 (1986) and J. Hopfield, Neurons with Graded Response Have Collective Computational Properties like Those of Two-State Neurons, 81 Proc. Natl. Acad. Sci. U.S.A. 3088 (1984).
L. Cooper, A Possible Organization of Animal Memory and Learning, Proc. Nobel Symp. Coll. Prop. Phys. Sys. 252 (Academic, New York 1973) observes that the modelling of neural networks for animal memory and learning has the problem of mapping events in the animal's environment (i.e., sensory output) to signal distributions in the animal's neurons with the fundamental property of preserving closeness or separateness (in some sense not yet completely defined) of the events. That is, with a vector representation of the neural network states, two events as similar as a white cat and a gray cat should map into vectors which are close to parallel while two events as different as the sound of a bell and the sight of food should map into vectors that are close to orthogonal. Note that standard analysis, such as described in Gonzalez and Wintz, Digital Image Processing (Addison-Wesley 1977), does not use neural network computation and does not have this problem; rather, the standard analysis attempts to extract features and categorize by serial number crunching.
J. Anderson, Cognitive Capabilities of a Parallel System, NATO Advanced Research Workshop (Mar. 3, 1985) describes the Brain State in a Box (BSB) neural network model which includes outer-product storage, Widrow-Hoff learning, and a ramped-threshold recall algorithm. That is, the matrix of interconnection strengths, T_{i,j}, is modified to learn a new vector V_{j} by

$$\Delta T_{i,j} = \frac{\eta}{N}\left[V_{i} - \sum_{k} T_{i,k} V_{k}\right] V_{j}$$

where η is a learning constant and N is the number of neurons. The learning constant is roughly the inverse of the number of times the matrix must be trained on a given vector before it fully learns the vector. The smaller the learning constant, the finer the resolution of the average direction for a learned state but the more times the input vectors must be trained. The learning procedure saturates when ΔT_{i,j} is close to zero, which implies that the vector is close to being an eigenvector of the matrix with an eigenvalue near 1.
Recall of a learned (stored) vector given an input vector U_{j} is by the following iterative process that converges towards an eigenvector:

$$U_{j}(n+1) = \mathrm{Limit}\left[\gamma \sum_{k} T_{j,k} U_{k}(n) + \beta U_{j}(n)\right]$$

where the "Limit" function clamps the values in the range from -1 to 1. The constants γ and β measure the feedback and signal decay in the algorithm. This synchronous recall algorithm replaces the dynamic flow of the Hopfield model. The usual applications of the BSB neural network, such as data bases with words and letters encoded as their ASCII representations, require binary neurons as in the first Hopfield model.
The present invention provides analog behavior in a neural network to learn either binary or analog vectors and recall vectors that include information regarding the clustering of the learned vectors. Further, for multiple feature signal processing the features are concatenated to form vectors for learning but the learning is limited to avoid saturation. Preferred embodiments include use of such neural networks in avionics sensor systems wherein binary data learned by the network has some of its statistical properties revealed by the analog aspects of the recalled vectors. Preferred embodiments also use such neural networks for lookup table type information storage for emitter identification.
The drawings are schematic for clarity.
FIG. 1 illustrates in block form a first preferred embodiment sensor system;
FIG. 2 shows encoding for the first preferred embodiment;
FIG. 3 shows encoded data for the first preferred embodiment;
FIGS. 4a-b illustrate hardware usable in the first preferred embodiment;
FIGS. 5-6 show learned data and recall data for the first preferred embodiment;
FIGS. 7a-b show recall data;
FIG. 8 illustrates analog input data;
FIGS. 9a-b illustrate analog recall;
FIG. 10 illustrates in block form a second preferred embodiment sensor system;
FIG. 11 is a flow chart for the second preferred embodiment clustering;
FIGS. 12a-c illustrate clustering with the second preferred embodiment; and
FIGS. 13a-b illustrate encoding for a third preferred embodiment.
The known radar sensor systems typically include an antenna, a preprocessor, a pulse buffer, a digital signal processor, and a classifier. The antenna receives radar signals from various emitters with various frequencies, pulse widths, pulse repetition rates, locations, and so forth. The antenna output is preprocessed to extract a set of features (frequency of the pulse, pulse width, signal-to-noise ratio for the pulse, and azimuthal and elevational angles towards the emitter) and time of arrival for each received radar pulse and periodically store a table of such features in a buffer. The stored set of features is processed by standard signal processing methods to cluster the pulses from each perceived emitter, and the classifier compares the features of each perceived emitter with prestored data to determine its identity. All of this analysis would take place in a homing missile which would then decide which of the perceived emitters to attack.
FIG. 1 illustrates, in schematic block form, a first preferred embodiment radar sensor system 100 and includes antenna 102, preprocessor for feature extraction 104, encoder 106, neural network 108, and classifier 110. Antenna 102 and feature extractor 104 are the same as in the known systems. Antenna 102 receives radar pulses from various radar emitters 122, 124, 126, . . . , and the pulses are analyzed as received by feature extractor 104 according to the direction of the pulse (azimuth and elevation angles), the signal-to-noise ratio, the frequency, and the pulse width to produce a data stream such as in the following table:
Pulse    Azimuth  Elevation  Sig/Noise  Freq    Pulse Width  Time of
number   (deg)    (deg)      (dB)       (MHz)   (μsec)       arrival
----------------------------------------------------------------------
1        21       20         109        9066    2.186        1
2        34       35         98         9437    1.489        12
3        12       27         81         9214    0.399        20
4        6        34         85         10054   0.421        53
5        26       0          86         9210    0.397        58
6        23       17         108        9030    2.191        75
7        16       16         97         9342    1.399        97
8        21       22         108        9015    2.195        112
9        25       30         83         9023    0.416        117
10       19       21         109        0932    2.195        149
11       8        29         83         9805    7.156        164
12       20       17         109        9018    2.21         186
13       20       19         96         9335    1.402        213
14       23       21         108        9041    2.207        223
15       32       30         98         9435    1.375        251
16       24       21         108        9051    2.21         260
17       22       20         109        9011    2.194        297
18       19       17         97         9345    1.384        330
19       25       20         109        8997    2.185        334
20       23       19         109        9049    2.215        371
----------------------------------------------------------------------
The data stream is temporarily stored in a pulse buffer. Combat conditions may be expected to yield an output from extractor 104 of 300 table entries during a 10 millisecond interval; at the end of each 10 millisecond interval the pulse buffer is unloaded. The data is encoded by encoder 106 to be compatible with binary neurons in neural network 108, although neural network 108 has analog neurons. In particular, each neuron of neural network 108 may take on real number values between -1 and 1 but the encoder 106 only produces outputs of -1 and 1. Neural network 108 has 400 neurons, and the neurons are grouped with 50 neurons for azimuth, 50 neurons for elevation, 50 neurons for signal-to-noise, 150 neurons for frequency, and 100 neurons for pulse width. Thus a state of neural network 108 is a 400-component vector of real numbers between -1 and 1 with the first 50 components relating to the azimuth, the second 50 relating to the elevation, the third 50 relating to the signal-to-noise ratio, the next 150 relating to frequency, and the last 100 relating to the pulse width.
Encoder 106 uses a closeness code; its encoding method is illustrated in FIG. 2 for the case of encoding the azimuth data in degrees (ranging from -35 to +35 in 2 degree intervals) to network 108 compatible input vectors of -1's and +1's. FIG. 2 shows that a number is encoded by the position of a block of five +1's in the otherwise all -1 components of the state vector. The other data is similarly encoded, except that the numerical interval corresponding to the next neuron depends upon the spread of the data. For example, the frequency data (expressed in MHz) has a spread of about 1100, so the 150 neurons could (with a block of 22 +1's for the position indicator) represent 128 numbers, which translates to an interval of about 9 MHz. The frequency data is encoded with the position of a block of twenty-two +1's rather than the block of five +1's of the other data because the frequency of an emitter may be varied to deter detection, and overlap of the blocks of +1's for different vectors corresponding to the same emitter is required for network 108 to cluster the data to identify the emitters. This is explained in more detail in connection with the operation of network 108. Two vectors are close in Hamming distance if and only if the two sets of pulse feature numbers are close.
For example, a 400-component vector [-1 (18), +1 (5), -1 (53), +1 (5), -1 (55), +1 (5), -1 (80), +1 (22), -1 (110), +1 (5), -1 (42)] would represent a pulse with azimuthal angle of 0 degrees, elevational angle of 0 degrees, a signal-to-noise ratio of 95 dB, a center frequency of 9500 MHz, and a pulse width of 2 μsec. This vector is more conveniently represented graphically as shown in FIG. 3.
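The closeness code can be sketched as follows. This is a minimal illustration, not the encoder of FIG. 2: the rounding rule, the clamping at the ends of a neuron group, and the function names are assumptions. The azimuth layout (-35 degrees origin, 2 degree interval, 50 neurons, block of five) comes from the text; the frequency origin of 8860 MHz is purely hypothetical, since the text gives only the spread and the approximate 9 MHz interval.

```python
def encode_feature(value, lo, step, n_neurons, block):
    """Encode one number as a block of +1's at a value-dependent position."""
    start = int((value - lo) / step + 0.5)         # nearest position (assumed rounding)
    start = max(0, min(n_neurons - block, start))  # keep the block inside the group
    return [1 if start <= k < start + block else -1 for k in range(n_neurons)]


def encode_pulse(values, layout):
    """Concatenate the per-feature codes into one input vector."""
    vector = []
    for value, (lo, step, n_neurons, block) in zip(values, layout):
        vector.extend(encode_feature(value, lo, step, n_neurons, block))
    return vector


def hamming(a, b):
    """Number of components in which two vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


AZIMUTH = (-35.0, 2.0, 50, 5)       # (origin, interval, neurons, block) from the text
FREQUENCY = (8860.0, 9.0, 150, 22)  # origin is a hypothetical placeholder

az0 = encode_feature(0.0, *AZIMUTH)       # 18 -1's, five +1's, 27 -1's
az2 = encode_feature(2.0, *AZIMUTH)       # block shifted by one neuron
az_far = encode_feature(-35.0, *AZIMUTH)  # block at the start of the group

# Two frequencies from the table that plausibly belong to one emitter.
f_a = encode_feature(9066.0, *FREQUENCY)
f_b = encode_feature(9011.0, *FREQUENCY)
```

With these settings `az0` reproduces the first 50 components of the example vector above, neighbouring azimuths differ in only two components, and the two frequency codes still share 16 of their 22 +1's even though the values are 55 MHz apart. With a block of five the same two frequencies would not overlap at all, which is the reason given in the text for the wider frequency block.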
FIG. 4a schematically illustrates a six-neuron neural network fabricated from standard electronic components. The network, generally denoted by the reference numeral 200, includes rows and columns of wires interconnected by an array of transconductances T_{i,j} 202; the horizontal wires connect to op amps 206, threshold linear array 204, and switch bank B. Threshold linear array 204 has a transfer function as shown in FIG. 4b and essentially just limits the neuron states to the range -1 to 1. Neuron states correspond to voltages on the vertical wires and the connecting right-hand portions of the horizontal wires. Switch bank A (with switch bank B open) is used to input the initial network state, and the buffers 206 may incorporate the synchronous timing. Network 108 could be a 400 neuron version of network 200.
The operation of network 108 proceeds as follows:
A sequence of emitter pulse data (such as in the Table above) of perhaps 300 pulses is encoded as a sequence of 300 vectors, each with 400 binary components (-1 or +1), and each vector is learned by network 108 using the Widrow-Hoff learning rule with a learning rate of 0.035. That is, the matrix of interconnection strengths, T_{i,j}, is modified to learn a new vector V_{j} by

$$\Delta T_{i,j} = \frac{\eta}{N}\left[V_{i} - \sum_{k} T_{i,k} V_{k}\right] V_{j}$$

where η=0.035 is the learning rate and N=400 is the number of neurons. Each vector is learned four times. Note that if a vector W_{j} is repeatedly learned, then the interconnections tend to saturate (ΔT_{i,j} →0) and thus W_{j} approximates an eigenvector of T_{i,j} with eigenvalue λ approximately equal to 1. That is, for ΔT_{i,j} ≈0 the term in the square brackets also approximates 0 for each i, which implies that for each i

$$\sum_{k} T_{i,k} W_{k} = \lambda W_{i}$$

with λ=1. Of course, only an approximation to an eigenvector is expected because only a finite number of learnings occur, the thresholding limits the neuron values, and so forth.
Due to the dynamics of the Widrow-Hoff learning rule, similar input vectors merge to form a single eigenvector. These eigenvectors become averages of the pulse vector information that contribute to them. Because of the normalization process of the learning rule, the stable states of the network have nonbinary components, essentially becoming histograms of the contributing vectors.
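A small sketch may make the saturation behaviour concrete. It assumes the update has the form ΔT_{i,j} = (η/N)[V_{i} − Σ_k T_{i,k}V_{k}]V_{j}; the names `learn` and `residual`, the 8-neuron size, and the learning rate of 0.5 are illustrative choices, not values from the text (which uses N=400 and η=0.035).

```python
def learn(T, v, eta):
    """One Widrow-Hoff step: T[i][j] += (eta/N) * (v[i] - sum_k T[i][k]*v[k]) * v[j]."""
    n = len(v)
    # Compute every recalled component first so the update uses the old matrix.
    recalled = [sum(T[i][k] * v[k] for k in range(n)) for i in range(n)]
    for i in range(n):
        error = v[i] - recalled[i]  # the bracketed term of the learning rule
        for j in range(n):
            T[i][j] += (eta / n) * error * v[j]


def residual(T, v):
    """Largest component of v - T v; zero when v is an eigenvector with eigenvalue 1."""
    n = len(v)
    return max(abs(v[i] - sum(T[i][k] * v[k] for k in range(n))) for i in range(n))


N = 8                               # illustrative size
v = [1, 1, 1, -1, -1, 1, -1, -1]    # one binary (-1/+1) input vector
T = [[0.0] * N for _ in range(N)]

history = []
for _ in range(4):                  # learn the same vector four times
    learn(T, v, eta=0.5)            # eta chosen for a short demo
    history.append(residual(T, v))
```

For a vector of -1's and +1's the residual shrinks by a factor of (1 − η) per learning, so `history` halves at each step here. By the same arithmetic, a single vector learned four times at η=0.035 would still leave a residual of about 0.87; under this reading, saturation comes from many similar vectors being learned rather than from any one of them.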
After the 300 vectors are learned, they are put into clusters (which presumably correspond to the various radar emitters which have emitted the original pulses received by antenna 102) as follows: the first vector is input as the initial condition (U_{j}(0)=V_{j}) for network 108 to recall a vector by the following iterative process that converges towards an eigenvector:

$$U_{j}(n+1) = \mathrm{Limit}\left[\gamma \sum_{k} T_{j,k} U_{k}(n) + \beta U_{j}(n)\right]$$

where the "Limit" function constrains the values in the range from -1 to 1 and corresponds to the transfer function of FIG. 4b. The constants γ and β measure the positive/negative feedback in the algorithm and are taken to be 0.8 and 0.4, respectively. Note that for two eigenvectors W_{j}, X_{j} of T_{i,j} with eigenvalues λ, μ, respectively, with λ≈1 (e.g., W_{j} has been well learned) and μ≈0 (e.g., X_{j} has been poorly learned), the recall algorithm initialized at U_{j}(0)=W_{j}+X_{j} will yield (before the Limit function takes effect)

$$U_{j}(n) = (\gamma\lambda + \beta)^{n} W_{j} + (\gamma\mu + \beta)^{n} X_{j}$$

and thus the coefficient of W_{j} grows with increasing n and the coefficient of X_{j} shrinks to 0 with increasing n. So roughly the well learned eigenvector will be recalled and the poorly learned eigenvector will be lost.
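The growth and decay of the two coefficients can be checked numerically. The sketch below assumes the recall step has the form U_{j}(n+1) = Limit[γ Σ_k T_{j,k}U_{k}(n) + βU_{j}(n)]; the 4-neuron matrix, the vectors `W` and `X`, and the helper names are hypothetical and chosen so that W has eigenvalue exactly 1 and X has eigenvalue 0.

```python
def limit(x):
    """Clamp a neuron value to the range -1 to 1."""
    return max(-1.0, min(1.0, x))


def recall_step(T, u, gamma, beta):
    """One synchronous step: U_j <- Limit[gamma * sum_k T[j][k]*U_k + beta * U_j]."""
    n = len(u)
    return [limit(gamma * sum(T[j][k] * u[k] for k in range(n)) + beta * u[j])
            for j in range(n)]


def recall(T, u, gamma, beta, iterations):
    for _ in range(iterations):
        u = recall_step(T, u, gamma, beta)
    return u


def coefficient(u, basis):
    """Projection coefficient of u on one of the orthogonal basis vectors."""
    return sum(a * b for a, b in zip(u, basis)) / sum(b * b for b in basis)


N = 4
W = [1.0, 1.0, 1.0, 1.0]     # "well learned": eigenvalue 1 of T below
X = [1.0, -1.0, 1.0, -1.0]   # "poorly learned": eigenvalue 0 of T below
T = [[W[i] * W[j] / N for j in range(N)] for i in range(N)]

u0 = [0.1 * (w + x) for w, x in zip(W, X)]  # small start so Limit is inactive
u3 = recall(T, u0, gamma=0.8, beta=0.4, iterations=3)
u30 = recall(T, u0, gamma=0.8, beta=0.4, iterations=30)

w_coeff = coefficient(u3, W)  # 0.1 * (0.8*1 + 0.4)**3
x_coeff = coefficient(u3, X)  # 0.1 * (0.8*0 + 0.4)**3
```

After three iterations the W coefficient has grown by 1.2 per step and the X coefficient has shrunk by 0.4 per step; after thirty iterations the Limit function has clamped the state onto W itself.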
This first recalled vector is taken to be the first cluster vector, A, which presumably corresponds to one of the emitters 122, 124, 126, . . .
The second vector from the 300 vectors is then input as the initial condition to recall a second vector from network 108, and this second recalled vector is compared to A by taking the dot product with A. If the dot product divided by the product of the vector lengths is greater than 0.9, then the second vector is considered to also correspond to the cluster with A, and the process continues to the third vector from the 300 vectors. Contrarily, if the dot product of the second recalled vector with A, divided by the product of the second recalled vector length and the length of A, is less than 0.9, then the second recalled vector is taken to be the second cluster vector, B. Note that the number of clusters is expected to be insensitive to the dot product cutoff for a cutoff between 0.9 and 0.99, but if 0.9 is replaced with a smaller number, then likely fewer clusters will be found. The exemplar vector for each cluster will be the initial recalled vector after several iterations of the recall procedure.
In a similar manner, each of the vectors from the 300 vectors is input as the initial condition to recall a vector from network 108, and the recalled vector is compared to the previously determined cluster vectors A, B, . . . to decide whether a new cluster vector has been found or whether the recalled vector corresponds to one of the previously found cluster vectors.
This is illustrated in FIGS. 5 and 6 which show the input pulses (out of time order) and the corresponding recalled results. The recalled vectors are the eigenvectors of the similar, learned vectors (note that all of the recalled vectors look the same and are approximations to a common eigenvector). The nonbinary aspect of the recalled vector is a measure of the variation (noise) of the input data. Thus the cluster vectors A, B, . . . contain information in their shapes with regard to the variation of the original data. This information is a statistical summary of the original data that has been generated without any presumption as to the probability distribution of the original data as is typically done with standard digital signal processing in the known methods. Indeed, binary vectors were learned and the resulting cluster vectors recalled were an analogized version of binary vectors with input variation information determining the analog aspects. This shows why the frequency data was encoded with a block of twenty-two +1's to ensure overlap of all of the blocks for an emitter despite the variation in pulse frequencies by that emitter.
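The comparison loop described above amounts to a simple sequential clustering on the normalized dot product. A minimal sketch, assuming the recalled vectors are already available as lists of numbers; the function names, the handling of a zero-length vector, and the four short example vectors are hypothetical.

```python
import math


def normalized_dot(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors match nothing (assumption)


def cluster(recalled_vectors, cutoff=0.9):
    """Assign each recalled vector to the first cluster vector it matches.

    A vector that matches no existing cluster vector becomes a new one.
    """
    cluster_vectors = []
    labels = []
    for r in recalled_vectors:
        for index, c in enumerate(cluster_vectors):
            if normalized_dot(r, c) > cutoff:
                labels.append(index)
                break
        else:
            cluster_vectors.append(r)
            labels.append(len(cluster_vectors) - 1)
    return cluster_vectors, labels


# Hypothetical recalled vectors (4 components instead of 400).
recalled = [
    [1.0, 1.0, -1.0, -1.0],   # becomes cluster vector A
    [0.9, 1.0, -1.0, -0.8],   # nearly parallel to A
    [-1.0, -1.0, 1.0, 1.0],   # becomes cluster vector B
    [1.0, -1.0, 1.0, -1.0],   # orthogonal to A and B: cluster vector C
]
cluster_vectors, labels = cluster(recalled)
```

Here `labels` assigns the first two vectors to A and opens new clusters for the third and fourth, since their normalized dot products with the existing cluster vectors fall below the 0.9 cutoff.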
Once the vectors are clustered, the pulse repetition rate can be computed from the time of arrival data, which is held in the pulse buffer but was not encoded in the vectors. The pulse repetition rate information can then be encoded and concatenated with the cluster vectors or handled separately, although this is optional. The cluster vectors A, B, . . . are then compared to stored information on known radar emitters by classifier 110 to identify the emitters corresponding to the clusters. Of course, the extrinsic information (azimuthal and elevation angles and signal-to-noise ratios) relates to the location of an emitter and the intrinsic information (frequency, pulse width, and pulse repetition rate) relates to the identity of an emitter, so only the intrinsic information portion of a cluster vector is used for comparison with known emitter types. The identities of the emitters are then used to determine subsequent actions. Note that the use of multiple features concatenated together in the encoded vectors together with a low precision for each feature (learning only a few times for each encoded vector) allows an effective extraction of features including some statistical information.
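As an illustration of the repetition-rate step, the sketch below estimates a pulse repetition interval from the arrival times of one cluster. The cluster membership is an assumption read off the table above (pulses 1, 6, 8, 10, 12, 14, 16, 17, 19, and 20, which share a signal-to-noise ratio of 108 to 109 dB and a pulse width near 2.2 μsec), the use of the median gap is a hypothetical choice to tolerate a missed pulse, and the time units are whatever units the table uses.

```python
from statistics import median


def pulse_repetition_interval(arrival_times):
    """Median gap between consecutive arrivals within one cluster."""
    times = sorted(arrival_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return median(gaps)


# Arrival times of the pulses assumed to form one cluster (see lead-in).
cluster_times = [1, 75, 112, 149, 186, 223, 260, 297, 334, 371]
pri = pulse_repetition_interval(cluster_times)
```

All gaps but the first equal 37, and the first gap of 74 is twice that, which would be consistent with one pulse of the train not appearing in the table; the median therefore gives an interval of 37, and the repetition rate is its reciprocal.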
Classifier 110 includes a neural network that has already learned the intrinsic information on known emitter types, and thus already has approximate eigenvectors with shapes similar to those of the intrinsic information portion of the cluster vectors. The intrinsic information for an emitter may contain analog values. For example, an emitter may be known to have a high variation in its frequency. Thus the vector trained for this emitter would have a large, Gaussian distribution for the frequency (FIG. 7a). As another example of the storage of analog data, an emitter may have two pulse widths. Instead of storing them as two separate entries in the classifier network, a hybrid vector, containing the analog histogram of the data, can be stored (FIG. 7b). Classifier 110 identifies the emitter type corresponding to each of the cluster vectors by successively inputting each cluster vector and recalling an emitter-type vector by the same recall algorithm that converges towards an eigenvector:

$$U_{j}(n+1) = \mathrm{Limit}\left[\gamma \sum_{k} T_{j,k} U_{k}(n) + \beta U_{j}(n)\right]$$

where again the "Limit" function constrains the values to the range from -1 to 1 and corresponds to the transfer function of FIG. 4b. The constants γ and β are taken to be 0.73 and 0.3, respectively, and the iteration is performed ten times. Note that the length of the iteration vectors is not much different from that of the input because 0.73+0.3 is close to 1.0, so the neurons are not pushed into saturation and the analog shape aspects are recoverable.
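The remark about saturation can be made concrete with a short calculation. It assumes a well-learned eigenvector (eigenvalue λ≈1) and ignores the Limit function, so each recall iteration multiplies the vector by γλ+β; the ten-iteration figure for the clustering constants is shown only for comparison, since the text does not state how many iterations the clustering recall uses.

```latex
\begin{align*}
\text{classifier recall } (\gamma = 0.73,\ \beta = 0.3):\quad
  & \gamma\lambda + \beta \approx 1.03, &
  & 1.03^{10} \approx 1.34 \\
\text{clustering recall } (\gamma = 0.8,\ \beta = 0.4):\quad
  & \gamma\lambda + \beta \approx 1.2, &
  & 1.2^{10} \approx 6.19
\end{align*}
```

With the classifier constants a component of magnitude 0.7 or less stays below the clamp at 1 after ten iterations, so the relative shape of the vector survives; with the clustering constants any component above about 0.16 would be clamped within ten iterations.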
The preferred embodiment has the capacity to include analog data from the feature extractor 104 of FIG. 1; such analog data may be a frequency spectrum for the pulses being detected. More explicitly, feature extractor 104 may have the capability to provide an entire frequency spectrum estimate for a received pulse rather than just the center frequency. FIG. 8 illustrates a typical spectrum. This spectral information could be reduced to statistics such as mean, variance, number of nodes, skewness, kurtosis, and relative sidelobe size, and encoded in the manner of encoding the azimuth, elevation, and pulse width. But the entire spectrum can also be normalized to fall between -1 and +1 and directly encoded: if feature extractor 104 yields a spectrum by a sampling of 60 frequencies (roughly as illustrated in FIG. 8), then 60 neurons are used for this feature and 60 components of each input vector will have values anywhere between -1 and +1 and will just equal the normalized spectrum values. In this manner all of the information in the spectrum can be used.
Learning of directly encoded features proceeds just as for closeness encoded features. The recall requires γ+β to be close to 1 to preserve the analog shape and avoid saturation to -1 and +1 of the recalled vector components. The recall of directly encoded features is complicated if the feature also varies, because the learning effectively averages input vectors and so the analog portion of the recalled vector is a mixture of both the variation and the input shapes. However, if an emitter uses the same spectrum and just varies the center frequency, then the spectrum feature is nonvarying. Similarly, if an emitter switches between a small number of spectra, then these spectra may be learned and recalled as distinct, and the classifier of the cluster vectors would recognize them as indicating the same emitter.
For example, FIG. 9a shows three analog input vectors (using 400 neurons) having sine wave shapes with frequencies 3, 6, and 9. These vectors were learned, and then each input vector was used as the initial condition "input" for a recall, with the recalled vector shown as "output". The recall had γ=0.73, β=0.3, and used ten iterations. The recall preserved the shapes as shown.
FIG. 9b shows the same neural network as FIG. 9a but with a mixed "input" consisting of the trigonometric sum of one of the three learned sine waves plus a sine wave of frequency 4; the recall "output" is just the learned sine wave without any frequency 4 contamination. Thus analog vectors can be recalled even in the face of mixed analog input.
The second preferred embodiment system and method apply to the same sort of problems as the first preferred embodiment and will be described in terms of the same data. The second preferred embodiment follows the first preferred embodiment with regard to the preprocessing of antenna output, the storing of the extracted features in a buffer, the periodic unloading of the buffer and closeness encoding of the data for neural network compatibility (400-component binary vectors with component values of -1 or +1), and the learning of each input vector by the neural network using the Widrow-Hoff learning rule. The second preferred embodiment uses a learning rate of η=4 with each input vector being learned two times.
The second preferred embodiment is illustrated in block diagram form in FIG. 10 and clusters the input vectors as illustrated in FIG. 11. The second preferred embodiment includes antenna 202, feature extractor 204, encoder 206, neural network 208, and classifier 210, which are analogous to those of the first preferred embodiment, for detecting emitters 222, 224, and 226. The second preferred embodiment also has tracker 212 for listing cluster vectors as they are generated and for prescreening input vectors from encoder 206 prior to learning by neural network 208. This prescreening picks out input vectors that correspond to any cluster vector in tracker 212 and thereby avoids extraneous learning by neural network 208. This generates a list of cluster vectors (together with feature spreads) in tracker 212 and a clustering of the input vectors into groups with all of the vectors in a group corresponding to a single cluster vector. Once the input vectors have been clustered, the first preferred embodiment can again be followed to determine pulse repetition rate, compare cluster vectors (optionally including pulse repetition rate information) to known emitters, and so forth. Also, the directly encoded analog portions (if any) of input vectors could be included, but this will slow down the prescreening by tracker 212.
In more detail, this second preferred embodiment clustering proceeds as follows.
(a) The first input vector, V^{1}, is used as the initial condition (U_{j} (0)=V_{j}^{1}) for a recall from the neural network (which has learned all the input vectors) using ten iterations in ##EQU10## As explained supra the "Limit" function constrains the values to the range from -1 to +1, and γ and β are constants which set the positive/negative feedback in the algorithm and are taken to be 0.8 and 0.4, respectively. This first recalled vector U^{1} will typically look like the vector illustrated in FIG. 12a. The features with variation in values among related input vectors (such as frequency variation by an emitter) lead to the analog (nonbinary) aspects of the recalled vector. In particular, each analog portion of U^{1} has a zero-to-zero part of width w and a spread s, with the width measured at neuron value zero (half maximum) and the spread measured as the larger of the two ramp widths, s_{1} and s_{2}, between -1 and +1 as illustrated in FIG. 12a.
(b) Then U^{1} is used to define a first cluster vector, A, by copying each all-binary feature portion of U^{1} and by using the zero-to-zero part of each analog feature portion of U^{1} in binary form. This is illustrated in FIG. 12b and mathematically amounts to mapping U^{1} to A by replacing a component U_{j}^{1} of U^{1} with a +1 if U_{j}^{1} ≥ 0 and replacing the component U_{j}^{1} with a -1 if U_{j}^{1} < 0. The cluster vector A together with the spread(s) s of U^{1} are stored in a local buffer.
(c) Next, the second input vector, say V^{2}, is compared to the cluster vector A feature by feature, and if each difference (Hamming distance) is less than 110% of the corresponding s for the feature, then V^{2} is taken to correspond to the cluster vector A. Note that if the analog portion of U^{1} is roughly an averaging of blocks in varying positions, then the extreme of the analog portion indicates the maximum variation from the mean position, and that this, by an expected rough symmetry of the analog portion about zero neuron values, should be about s/2. But if a block is offset by s/2, then the Hamming distance is about s since both ends of the block are offset. The 110% is to include a 10% safety margin.
For a feature encoded with a block of five +1's (such as the azimuth data) and a first recalled vector U^{1} also with a block of +1's for this feature, the corresponding s is zero because there is no spread, so the cluster vector A will have the same block of +1's and an input vector V^{j} will correspond to A only if the blocks of +1's exactly match for this feature. Contrarily, for a feature encoded with a block of twenty-two +1's (such as frequency), the first recalled vector may have a zero-to-zero width w of twenty-three and a spread s of thirteen (s_{1} =12, s_{2} =13), so the second input vector V^{2} will correspond to the cluster vector A only if the overlap of the block of twenty-two +1's in V^{2} on the block of twenty-three +1's in A is at least sixteen (so the offsets at the two ends are six and seven, which add to thirteen, which is less than 110% of s). FIG. 12c shows the overlap for V^{2} corresponding to A: all of the five +1's blocks exactly match and the twenty-two/twenty-three blocks have an overlap of nineteen.
(d) (i) If V^{2} does correspond to A, then step (c) is repeated with the third input vector V^{3} in place of V^{2}. If V^{3} also corresponds to A, then go to V^{4} ; contrarily, if V^{3} does not correspond to A, then go to (ii) with V^{3} in place of V^{2}. (ii) If V^{2} does not correspond to A, then step (a) is repeated with V^{2} used in place of V^{1} as the initial condition for a recall to yield U^{2}. Then U^{2} defines a second cluster vector B as in step (b) together with spread(s) for each feature. The next input vector V^{3} is first compared to A and B and if it corresponds to neither, then go to (ii) with V^{3} in place of V^{2} and A and B in place of A.
(e) Continue to the remaining input vectors.
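Steps (a) through (e) can be sketched as follows. This is a sketch under stated assumptions, not the circuitry of tracker 212: the recall and the spread measurement are passed in as caller-supplied functions, all function names are hypothetical, and the comparison is written as "distance ≤ 110% of s" so that a feature with s=0 accepts only an exact match, as the text requires for the five-wide blocks.

```python
def binarize(recalled):
    """Step (b): map each component to +1 if it is >= 0, else to -1."""
    return [1 if x >= 0 else -1 for x in recalled]


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def corresponds(vector, cluster, spreads, features, margin=1.1):
    """Step (c): feature-by-feature Hamming test against margin * s.

    `features` lists one (start, stop) index range per feature.
    Assumption: the test is d <= margin * s, so s = 0 means exact match.
    """
    for (start, stop), s in zip(features, spreads):
        if hamming(vector[start:stop], cluster[start:stop]) > margin * s:
            return False
    return True


def cluster_inputs(inputs, recall, measure_spreads, features):
    """Steps (a)-(e): prescreen each input against the stored cluster
    vectors and run a recall only when none of them matches.

    Returns (clusters, labels); clusters holds (cluster_vector, spreads)
    pairs and labels[i] is the cluster index assigned to inputs[i].
    """
    clusters, labels = [], []
    for v in inputs:
        for index, (a, spreads) in enumerate(clusters):
            if corresponds(v, a, spreads, features):
                labels.append(index)
                break
        else:  # no stored cluster vector matched: recall a new one
            u = recall(v)
            clusters.append((binarize(u), measure_spreads(u, features)))
            labels.append(len(clusters) - 1)
    return clusters, labels


def block(n, start, width):
    """Hypothetical helper: n components, +1 on [start, start + width)."""
    return [1 if start <= i < start + width else -1 for i in range(n)]


# Worked numbers from the text: a 23-wide block in A, a 22-wide block in V,
# and spread s = 13, so the tolerance is 1.1 * 13 = 14.3.
A = block(40, 10, 23)
V_overlap_16 = block(40, 17, 22)   # offsets 7 and 6: Hamming distance 13
V_overlap_15 = block(40, 18, 22)   # offsets 8 and 7: Hamming distance 15
```

An overlap of sixteen gives a Hamming distance of thirteen and passes the 14.3 tolerance, while an overlap of fifteen gives a distance of fifteen and fails, consistent with the "at least sixteen" condition stated above.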
Thus the clustering method differs from the first preferred embodiment by essentially cleaning up the analog portions of a recalled vector to have a binary cluster vector while using the size of the analog portion to set the tolerance for comparisons of the input vectors to the cluster vector. The second preferred embodiment also prescreens the input vectors by comparison to the cluster vectors found so far rather than using each input vector as the initial condition for a recall; this saves on computation in that a Hamming distance comparison of binary vectors is used in place of a ten-iteration recall.
Of course, the clustering method may be modified in many ways. For example, a cluster vector could be defined from a recalled vector using a (-1)-to-(-1) width (essentially w+s) block of +1's with a comparison tolerance of s, a (+1)-to-(+1) width (essentially w-s) block with a comparison tolerance of s, or intermediate size blocks; the comparison tolerance may be 110% s, s, or 125% s; a single tolerance for all of the analog portions together (when the recalled vector has more than one analog portion) could be used; and so forth.
Alternatively, the cluster vector may be taken to be the recalled vector (as in the first preferred embodiment), but the comparison tolerance may be determined by the analog spread. In this case the comparison is between a binary input vector and a (possibly) analog cluster vector, so the comparison should be a sort of integration and the tolerance may be various functions of the analog spread such as the total area. But this will slow down the comparison.
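One way the width w and spread s of a single analog feature portion might be measured is sketched below. The text defines both quantities by reference to FIG. 12a, so the discrete counting rules used here are assumptions: w is counted as the number of components at or above zero, and each ramp width is counted as the number of components strictly between -1 and +1 on that side of the peak. The function name `width_and_spread` is hypothetical.

```python
def width_and_spread(portion, eps=1e-9):
    """Return (w, s) for one feature portion of a recalled vector.

    Assumed discretization: w counts components >= 0; s is the larger of
    the two ramp counts s1 (before the peak) and s2 (after the peak).
    """
    w = sum(1 for x in portion if x >= 0)
    peak = max(portion)
    first = portion.index(peak)
    last = len(portion) - 1 - portion[::-1].index(peak)

    def on_ramp(x):
        return -1 + eps < x < 1 - eps

    s1 = sum(1 for x in portion[:first] if on_ramp(x))
    s2 = sum(1 for x in portion[last + 1:] if on_ramp(x))
    return w, max(s1, s2)


# Hypothetical analog portion: a plateau at +1 with unequal ramps.
analog = [-1, -1, -0.5, 0.0, 0.5, 1, 1, 1, 0.6, 0.2, -0.2, -0.6, -1]
# An all-binary portion has no ramps, so its spread is zero.
binary = [-1, -1, 1, 1, 1, 1, 1, -1]
```

For the analog portion above the rising ramp has three components and the falling ramp four, so the spread is taken as four, and the all-binary portion yields a spread of zero.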
A third preferred embodiment uses thermometer encoding rather than closeness encoding for converting the preprocessed data to neural network compatible vectors. FIG. 13a illustrates thermometer encoding of the same data that was closeness encoded in FIG. 3. Thermometer encoding uses the size of a block of +1's to represent a number rather than the position of a set-size block as in the closeness encoding of the first two preferred embodiments. Of course, each feature has a corresponding set of neurons and vector components, so the blocks of +1's start at the set boundaries.
Thermometer encoding has the advantage that block size need not be guessed, because variable or noisy data just spreads out the block rather than moving a fixed-size block to where there may be no overlap with other fixed-size blocks for the same feature. With thermometer encoding the first two preferred embodiments' methods of learning the input vectors, recalling vectors, defining cluster vectors, and clustering the input vectors can be performed in a manner analogous to those described for closeness encoding. Indeed, the Widrow-Hoff learning and BSB recall formulas can be used without change.
For example, a recalled vector with thermometer encoding will show analog feature portions corresponding to varying or noisy data in the same manner as closeness-encoded recalled vectors having sufficient block size. FIG. 13b illustrates a typical thermometer-coded recalled vector where the frequency feature (components 151-300) has variability for the corresponding emitter, but the other features do not.
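The difference between the two encodings for a single feature can be sketched as follows. The linear scaling of the feature value onto neuron positions is an assumption, since the text here describes only the block shapes, and the function names and the sample value are hypothetical.

```python
def closeness_encode(value, n_neurons, block, lo, hi):
    """Closeness code: a fixed-size block of +1's whose position tracks
    the value (assumed linear scaling of [lo, hi] onto block positions)."""
    frac = (value - lo) / (hi - lo)
    start = round(frac * (n_neurons - block))
    return [1 if start <= i < start + block else -1 for i in range(n_neurons)]


def thermometer_encode(value, n_neurons, lo, hi):
    """Thermometer code: a block of +1's that starts at the feature
    boundary and whose size tracks the value (assumed linear scaling)."""
    frac = (value - lo) / (hi - lo)
    size = round(frac * n_neurons)
    return [1 if i < size else -1 for i in range(n_neurons)]


# Same hypothetical feature value encoded both ways on 20 neurons.
closeness = closeness_encode(0.4, 20, 5, 0.0, 1.0)
thermometer = thermometer_encode(0.4, 20, 0.0, 1.0)
```

In the closeness code the five +1's sit at components 6 through 10, whereas in the thermometer code the first eight components are +1 and the block always begins at the feature boundary.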
Following the first preferred embodiment, each thermometer-coded input vector is learned by the neural network, then each input vector is used as initial condition for a recall that yields either a new cluster vector (similar to that illustrated in FIG. 13b) or a vector that corresponds to a previously recalled cluster vector. A vector is taken to correspond to a second vector if the dot product divided by the product of the vector lengths is greater than a preselected threshold, 0.9. This generates the clustering for thermometer-coded vectors just as in the first preferred embodiment.
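The correspondence test described above is a normalized dot product compared against the 0.9 threshold, which may be sketched as follows. The function names and the 40-component example vectors are hypothetical.

```python
import math


def normalized_dot(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vectors_correspond(a, b, threshold=0.9):
    """True if the normalized dot product exceeds the threshold."""
    return normalized_dot(a, b) > threshold


# Hypothetical 40-component binary vectors differing in 1 and 3 positions.
base = [1] * 20 + [-1] * 20
one_flip = [-1] + base[1:]
three_flips = [-1, -1, -1] + base[3:]
```

For ±1 vectors of length 40, each differing component lowers the normalized dot product by 0.05, so one differing component gives 0.95 (corresponds) and three give 0.85 (does not correspond).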
Following the second preferred embodiment, each thermometer-coded input vector is learned by the neural network, then the first input vector is used as initial condition to recall a first recalled vector (such as illustrated in FIG. 13b), and the first recalled vector defines a first cluster vector by replacing each analog portion with a block of +1's of width equal to the zero-to-zero width of the analog portion (w in FIG. 13b). The analog portions also each define a spread (s in FIG. 13b) which is used for determination of vector correspondence as in the second preferred embodiment. But note that a vector corresponds to a cluster vector if the Hamming distance between each feature portion of the two vectors is less than half of the spread multiplied by whatever safety margin is being used, because the thermometer-coded blocks all begin at the same component and the offset is only on one end.
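The halved tolerance for thermometer-coded features can be illustrated with a short sketch. The function names, the 40-component feature, and the block sizes are hypothetical, the 10% safety margin is carried over from the closeness-coded case, and the test is written as "distance ≤ tolerance" on the assumption that a zero spread should still accept an exact match.

```python
def thermometer_block(n, size):
    """Hypothetical helper: n components with the first `size` set to +1."""
    return [1 if i < size else -1 for i in range(n)]


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def thermometer_corresponds(vector, cluster, spread, margin=1.1):
    """Single-feature test: Hamming distance within margin * spread / 2.

    Only one end of a thermometer block can be offset, hence the halving.
    """
    return hamming(vector, cluster) <= margin * spread / 2


# Cluster block of size 26 with spread 13: tolerance 1.1 * 13 / 2 = 7.15.
cluster = thermometer_block(40, 26)
near = thermometer_block(40, 20)   # differs in 6 components
far = thermometer_block(40, 18)    # differs in 8 components
```

Because both blocks start at the same component, the Hamming distance equals the difference in block sizes, so the block of size 20 falls within the 7.15 tolerance and the block of size 18 does not.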
Various modifications of the preferred embodiment encoders and methods may be made while retaining the aspects of analog treatment of binary neural network data for extraction of statistical information and of analog performance of neural networks, and the use of many low precision concatenated features to form the network vectors.
For example, other types of information and signals may be processed; a neural network may be used for the initial clustering and standard digital signal processing for the classification; a neural network may be used for the classification with standard signal processing for the clustering; after a first clustering additional information may be determined (e.g., pulse repetition rate) and concatenated onto the vectors and other information deleted from the vectors (e.g., extrinsic); the number of neurons in the network can be varied; the encoding by closeness code may be with different length blocks or even with the neurons corresponding to the block permuted (so the representation of FIG. 3 would have the blocks split apart and shuffled); the thermometer code encoding also may have the vector components shuffled, but the number of analog components will remain the same and can be used to measure spread for either encoding; the encoding may have levels of precision such as a block for a more significant portion of a number and another block for a less significant portion of the number; the recall parameters may be varied; the learning constant and the number of learning presentations may be varied; the number of recall steps may be varied; . . .