Method and system for wavefront reconstruction

Associated Cases: 0
Associated Defendants: 0
Accused Products: 0
Forward Citations: 39
Petitions: 0
Assignments: 1
First Claim
1. A method to be employed in conjunction with a radiation system, the radiation system comprising:
 (a) at least one source of radiation;
(b) at least one imaging device for imaging at least a portion of the radiation emitted by the or each source onto at least one image plane;
the or each imaging device functioning to produce at the image plane an intensity distribution profile; and
(c) a detector array for recording the intensity distribution profile at the image plane;
the method comprising the steps of:
(1) defining a feature vector derived from the intensity distribution profile; and
(2) employing an adaptive computational architecture for mapping the feature vector to at least one identifying characteristic of a selected imaging device.
Abstract
Method and system for wavefront reconstruction from an image plane intensity distribution profile. An imaging device may be an agent for producing the image plane intensity distribution profile, for example, a point spread function. In one embodiment, the method and system include defining a feature vector from the point spread function, and employing an adaptive computational architecture for associating the feature vector with at least one identifying characteristic of the imaging device, e.g., such as an amount of astigmatism.
43 Citations
Contextual data mapping, searching and retrieval  
Patent #
US 20080228761A1
Filed 03/10/2008

Current Assignee
1759304 ONTARIO INC.

Sponsoring Entity
1759304 ONTARIO INC.

Neural network for character recognition and verification  
Patent #
US 5,742,702 A
Filed 08/09/1996

Current Assignee
Sony Electronics Inc., Sony Corporation

Sponsoring Entity
Sony Electronics Inc., Sony Corporation

Neural network for character recognition of rotated characters  
Patent #
US 5,319,722 A
Filed 10/01/1992

Current Assignee
Sony Electronics Inc.

Sponsoring Entity
Sony Electronics Inc.

Graphical system for automated segmentation and recognition for image recognition systems  
Patent #
US 5,487,117 A
Filed 10/11/1994

Current Assignee
ATT Inc.

Sponsoring Entity
ATT Inc.

Symbol Classification with shape features applied to neural network  
Patent #
US 6,731,788 B1
Filed 11/17/1999

Current Assignee
Koninklijke Philips N.V.

Sponsoring Entity
Koninklijke Philips N.V.

Method for a neural network for representing imaging functions  
Patent #
US 6,243,489 B1
Filed 05/14/1998

Current Assignee
Siemens AG

Sponsoring Entity
Siemens AG

Method and apparatus for machine learning  
Patent #
US 6,249,781 B1
Filed 05/05/1999

Current Assignee
Verizon Laboratories Incorporated

Sponsoring Entity
Verizon Laboratories Incorporated

Hybrid neural network classifier, systems and methods  
Patent #
US 5,943,661 A
Filed 07/11/1991

Current Assignee
Intel Corporation

Sponsoring Entity
Texas Instruments Inc.

Apparatus for machine learning  
Patent #
US 5,946,675 A
Filed 11/20/1992

Current Assignee
Verizon Laboratories Incorporated

Sponsoring Entity
GTE Laboratories Incorporated

Method and apparatus for adjusting readout conditions and/or image processing conditions for radiation images, radiation image readout apparatus, and radiation image analyzing method and apparatus  
Patent #
US 5,999,638 A
Filed 01/29/1996

Current Assignee
Fujifilm Corporation

Sponsoring Entity
Fuji Photo Film Co Limited

Compact ocular measuring system  
Patent #
US 6,007,204 A
Filed 06/03/1998

Current Assignee
Welch Allyn Incorporated

Sponsoring Entity
Welch Allyn Incorporated

Adaptive network for automated first break picking of seismic refraction events and method of operating the same  
Patent #
US 5,742,740 A
Filed 01/15/1993

Current Assignee
Atlantic Richfield Company Incorporated

Sponsoring Entity
Atlantic Richfield Company Incorporated

Method for constructing a neural device for classification of objects  
Patent #
US 5,802,507 A
Filed 07/07/1997

Current Assignee
US Philips Corporation

Sponsoring Entity
US Philips Corporation

Method and apparatus for adjusting readout conditions and/or image processing conditions for radiation images, radiation image readout apparatus, and radiation image analyzing method and apparatus
Patent #
US 5,828,775 A
Filed 05/28/1997

Current Assignee
Fujifilm Corporation

Sponsoring Entity
Fuji Photo Film Co Limited

Learning method and neural network structure  
Patent #
US 5,630,020 A
Filed 12/21/1994

Current Assignee
US Philips Corporation

Sponsoring Entity
US Philips Corporation

Neural device and method of constructing the device  
Patent #
US 5,649,067 A
Filed 06/05/1995

Current Assignee
US Philips Corporation

Sponsoring Entity
US Philips Corporation

Method and apparatus for adjusting readout conditions and/or image  
Patent #
US 5,515,450 A
Filed 12/09/1993

Current Assignee
Fujifilm Corporation

Sponsoring Entity
Fuji Photo Film Co Limited

Radiation image processing method utilizing neural networks  
Patent #
US 5,553,159 A
Filed 04/10/1992

Current Assignee
Fujifilm Corporation

Sponsoring Entity
Fuji Photo Film Co Limited

Method for processing data using a neural network having a number of layers equal to an abstraction degree of the pattern to be processed  
Patent #
US 5,553,196 A
Filed 06/05/1995

Current Assignee
YOZAN Inc., Sharp Corporation

Sponsoring Entity
YOZAN Inc., Sharp Corporation

Artificial neural network method and architecture  
Patent #
US 5,408,588 A
Filed 05/18/1993

Current Assignee
Mehmet E. Ulug

Sponsoring Entity
Mehmet E. Ulug

Pattern recognition neural network  
Patent #
US 5,440,651 A
Filed 04/20/1993

Current Assignee
Microelectronics and Computer Technology Corporation

Sponsoring Entity
Microelectronics and Computer Technology Corporation

Artificial neural network method and architecture adaptive signal filtering  
Patent #
US 5,467,428 A
Filed 08/15/1994

Current Assignee
Mehmet E. Ulug

Sponsoring Entity
Mehmet E. Ulug

Intelligence information processing method  
Patent #
US 5,479,569 A
Filed 04/30/1993

Current Assignee
Mitsubishi Electric Corporation

Sponsoring Entity
Mitsubishi Electric Corporation

Method and system for automatically classifying intracardiac electrograms  
Patent #
US 5,280,792 A
Filed 09/18/1992

Current Assignee
University of Sydney

Sponsoring Entity
University of Sydney

Manufacturing adjustment during article fabrication  
Patent #
US 5,283,746 A
Filed 02/25/1993

Current Assignee
NEC Corporation

Sponsoring Entity
ATT Inc.

Neural network pattern recognition learning method  
Patent #
US 5,317,675 A
Filed 06/27/1991

Current Assignee
Toshiba Corporation

Sponsoring Entity
Toshiba Corporation

Accelerated training apparatus for back propagation networks  
Patent #
US 5,228,113 A
Filed 06/17/1991

Current Assignee
The United States of America As Represented By The Secretary of Agriculture

Sponsoring Entity
The United States of America As Represented By The Secretary of Agriculture

Intelligence information processing system  
Patent #
US 5,257,343 A
Filed 08/14/1991

Current Assignee
Mitsubishi Electric Corporation

Sponsoring Entity
Mitsubishi Electric Corporation

Parameter normalized features for classification procedures, systems and methods  
Patent #
US 5,263,097 A
Filed 07/24/1991

Current Assignee
Texas Instruments Inc.

Sponsoring Entity
Texas Instruments Inc.

Self-extending neural network
Patent #
US 5,033,006 A
Filed 03/07/1990

Current Assignee
Sharp Electronics Corporation

Sponsoring Entity
Sharp Electronics Corporation

Neural network with back propagation controlled through an output confidence measure  
Patent #
US 5,052,043 A
Filed 05/07/1990

Current Assignee
Eastman Kodak Company

Sponsoring Entity
Eastman Kodak Company

Hierarchical constrained automatic learning network for character recognition  
Patent #
US 5,058,179 A
Filed 01/31/1990

Current Assignee
DANA TRANSMISSIONS INC.

Sponsoring Entity
American Telephone Telegraph

Analog hardware for learning neural networks  
Patent #
US 5,056,037 A
Filed 12/28/1989

Current Assignee
The United States of America As Represented By The Secretary of Agriculture

Sponsoring Entity
The United States of America As Represented By The Secretary of Agriculture

Contextual data mapping, searching and retrieval  
Patent #
US 8,266,145 B2
Filed 03/10/2008

Current Assignee
1759304 ONTARIO INC.

Sponsoring Entity
1759304 ONTARIO INC.

Alarm system controller and a method for controlling an alarm system  
Patent #
US 8,369,967 B2
Filed 03/07/2011

Current Assignee
HOFFBERG FAMILY TRUST 1

Sponsoring Entity
STEVEN M. HOFFBERG 20041 GRAT

Internet appliance system and method  
Patent #
US 8,583,263 B2
Filed 03/08/2011

Current Assignee
HOFFBERG FAMILY TRUST 1

Sponsoring Entity
HOFFBERG FAMILY TRUST 1

Adaptive pattern recognition based controller apparatus and method and human-interface therefore
Patent #
US 8,892,495 B2
Filed 01/08/2013

Current Assignee
HOFFBERG FAMILY TRUST 1

Sponsoring Entity
HOFFBERG FAMILY TRUST 1

Internet appliance system and method  
Patent #
US 9,535,563 B2
Filed 11/12/2013

Current Assignee
HOFFBERG FAMILY TRUST 1

Sponsoring Entity
HOFFBERG FAMILY TRUST 1

Adaptive pattern recognition based control system and method  
Patent #
US 10,361,802 B1
Filed 02/02/2000

Current Assignee
HOFFBERG FAMILY TRUST 1

Sponsoring Entity
HOFFBERG FAMILY TRUST 1

High speed/low light wavefront sensor system  
Patent #
US H615 H
Filed 01/22/1988

Current Assignee
The United States of America As Represented By The Secretary of Agriculture

Sponsoring Entity
The United States of America As Represented By The Secretary of Agriculture

Sensing device for ascertaining imaging errors  
Patent #
US 4,666,298 A
Filed 04/10/1985

Current Assignee
Messerschmitt-Boelkow-Blohm GmbH

Sponsoring Entity
Messerschmitt-Boelkow-Blohm GmbH

Wave front sensor  
Patent #
US 4,490,039 A
Filed 03/25/1983

Current Assignee
AOA Xinetics

Sponsoring Entity
United Technologies Corporation

Modal sensor  
Patent #
US 4,344,707 A
Filed 05/14/1980

Current Assignee
Rockwell International Corporation

Sponsoring Entity
Rockwell International Corporation

23 Claims
 1. A method to be employed in conjunction with a radiation system, the radiation system comprising:
(a) at least one source of radiation; (b) at least one imaging device for imaging at least a portion of the radiation emitted by the or each source onto at least one image plane; the or each imaging device functioning to produce at the image plane an intensity distribution profile; and (c) a detector array for recording the intensity distribution profile at the image plane; the method comprising the steps of: (1) defining a feature vector derived from the intensity distribution profile; and (2) employing an adaptive computational architecture for mapping the feature vector to at least one identifying characteristic of a selected imaging device.  View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
 6. A method according to claim 1, comprising learning steps of deriving the feature vector by:
(a) inserting into the radiation system a set of n different imaging devices, each imaging device having known aberrations, for yielding n unique point spread functions; and (b) developing a multidimensional moment vector for each of the n unique point spread functions.
 7. A method according to claim 1, comprising deriving the feature vector from an energy spectrum.
 8. A method according to claim 1, wherein the step of employing the adaptive computational architecture comprises:
(a) providing a preliminary learning mode; and (b) providing a subsequent real-time processing mode.
 9. A method according to claim 8, comprising providing a nonlinear neural network computational architecture.
 10. A method according to claim 9, wherein the learning mode comprises the steps of:
iteratively adjusting a set of weighting factors defined by the neural network, by associating a succession of known feature vectors with a succession of known aberrations.
 11. A method according to claim 10, wherein the real-time processing mode comprises the step of:
characterizing a heretofore arbitrary feature vector, by processing the arbitrary feature vector through the neural network, the characterizing, at least in part, based on the learning mode.
 12. A method according to claim 8, wherein the adaptive computational architecture comprises providing a statistical learning algorithm.
 13. A method according to claim 12, comprising the learning mode steps of:
(a) locating a known feature vector in a multidimensional feature vector space, the vector space corresponding to terms of orthogonal basis vectors; and (b) developing a region in the vector space, the region defined by the known feature vector and statistical noise components.
 14. A method according to claim 13, comprising the real-time processing mode step of:
characterizing a heretofore arbitrary feature vector by determining its location with respect to the region.
 15. A radiation system comprising:
(a) at least one source of radiation; (b) at least one imaging device for imaging at least a portion of the radiation emitted by the or each source onto at least one image plane; the or each imaging device functioning to produce at the image plane an intensity distribution profile; (c) a detector array for recording the intensity distribution profile at the image plane; (d) means for defining a feature vector derived from the intensity distribution profile; and (e) means for mapping the feature vector to at least one identifying characteristic of a selected imaging device.  View Dependent Claims (16, 17, 18, 19, 20, 21, 22, 23)
Specification
1. Field Of The Invention
This invention relates to radiation systems comprising a source of radiation and an imaging device.
2. Introduction To The Invention
Radiation systems comprising a source of radiation and an imaging device can cooperate so that the imaging device functions to produce a radiation field or radiation wavefront function at an image plane, for example, a focal plane. In particular, the source of radiation may be a microwave source, or an optical monochromatic point source, and the imaging device for the latter may be a mirror, a lens or a grating. The imaged radiation wavefront function produced by the imaging device at a selected image plane can provide a measure of the response of the system, i.e., the wavefront function includes the amplitude and the phase distribution of the radiation field as it is transformed by the imaging device.
For the radiation system just described, and under appropriate and ideal circumstances, the wavefront emerging from the imaging device at an entrance pupil is spherical. For real or practical radiation systems, in sharp contrast, the wavefront emerging from the imaging device at the entrance pupil is not spherical, but may contain "aberrations" which generally degrade the image quality at the image plane. These aberrations can be described quantitatively in terms of radiation phase variations over the entrance pupil, and include, e.g., the well-known monochromatic aberrations such as spherical aberration, coma, or astigmatism.
We are working on the critical problem of determining whether or not the imaging device (or other system component or media) may have introduced an unknown aberration into the radiation field. Further, we want to be able to specify what the aberration is, and how much the aberration is, including mixtures of aberrations, like coma plus astigmatism. The importance of this effort is the following. Once the "status" of the radiation system is determined, and by status we mean identifying whether or not there are aberrations and which ones, we can then provide correction techniques to correct or minimize the known aberrations. These correction techniques, in turn, include, e.g., applying corrective or compensative forces to the imaging device, or some other component of the radiation system, by way of force actuators. Alternatively, the correction techniques can include a postoperative capability to process and compensate image degradation by way of, e.g., appropriate software programs.
We indicate above that the aberrations can be described quantitatively in terms of radiation phase variations over the entrance pupil. It is possible to compute the desired radiation phase variations (as well as the amplitude) at the image plane in accordance with the well-known Fourier Transform. Thus, in a "forward" application of a two-dimensional Fourier Transform of a wavefront having a known amplitude and phase variations over the entrance pupil, one derives the amplitude and phase variations of the radiation wavefront at the image plane. Moreover, in an "inverse" application of the two-dimensional Fourier Transform, the wavefront aberrations over the entrance pupil can be completely reconstructed from a known amplitude and phase of the radiation field in the image plane.
In practice, this wavefront "reconstruction" can be effected in the following twofold way. First, an image plane sensor, for example a detector array comprising a matrix of photodiodes, each of which photodiodes provides a signal of magnitude related to the intensity of the radiation field incident thereon, can be employed to develop an intensity point spread function (PSF). Now, the PSF may be defined as the magnitude squared of the Fourier Transform of the wavefront that represents the aberrated wavefront at the entrance pupil of the imaging device. In general, however, the resulting PSF contains only magnitude or amplitude information, and this amplitude information is not sufficient to reconstruct the wavefront aberrations over the entrance pupil. Second, therefore, interferometric techniques, including Twyman-Green or Fizeau interferometers, may be employed to capture the image plane phase information. In consequence, by way of the two-dimensional inverse Fourier Transform, the wavefront aberrations over the entrance pupil can be completely reconstructed from the (now) known amplitude and phase of the radiation in the image plane.
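The forward relationship just described, in which the PSF is the squared magnitude of the two-dimensional Fourier Transform of the entrance-pupil wavefront, can be sketched numerically. The grid size, pupil shape, and example phase below are illustrative assumptions, not values fixed by the specification:

```python
import numpy as np

# Assumed parameters: a 64x64 grid, chosen to match the detector array
# described later in the specification.
N = 64
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]
r = np.hypot(x, y)

pupil_amplitude = (r <= 1.0).astype(float)       # circular entrance pupil
phase = 0.5 * (2 * r**2 - 1) * (r <= 1.0)        # example defocus-like aberration

wavefront = pupil_amplitude * np.exp(1j * phase)

# Forward Fourier Transform of the pupil wavefront; the PSF is the
# squared magnitude, which discards the image-plane phase information.
field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(wavefront)))
psf = np.abs(field) ** 2
psf /= psf.sum()                                 # normalize total energy to 1
```

Note that `psf` retains no phase: recovering the pupil aberrations from it alone is exactly the ill-posed problem the specification goes on to discuss.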
Recall that we are working on the problem of determining whether or not the imaging device may have introduced aberrations into the radiation field. This is another way of stating that, initially, we cannot avail ourselves of the forward application of the Fourier Transform to determine the phase variations (if any) in the image plane, since we do not yet know what the wavefront function at the entrance pupil is. Moreover, in terms of the inverse use of the Fourier Transform, we have found it impractical to employ an interferometer to capture the image plane phase information, since it may be quite difficult in a real world environment to replicate laboratory conditions that are required to insure that the interferometer has stringent vibration isolation, and satisfies coherence limitations. In short, we have the problem of determining aberrations reconstructed on the basis of data provided by the image plane sensor, alone. This problem may be restated in the following way: determine the entrance pupil plane phase aberrations from a knowledge of an image-plane intensity distribution, e.g., an intensity point spread function.
It is observed that calculating the phase aberrations from the image-plane intensity distribution is ill-posed, because of an inherent loss of (phase) information by the image plane sensor. Accordingly, the problem to be solved, "phase retrieval", requires additional information or system constraints on the entrance pupil wavefront and its aberrations.
One type of constraint within the context of phase retrieval is to assume that the geometry and wavefront amplitude distribution over the entrance pupil are known. Additionally, one can also assume that the phase aberrations are representable parametrically; e.g., in terms of a Zernike polynomial. These constraints reduce the problem to one whose solution can be expressed in terms of a finite number of unknown parameters.
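A parametric phase model of the kind described might be sketched as follows. The particular low-order Zernike terms and the coefficient names (`defocus`, `coma_x`, and so on) are hypothetical choices for illustration; the specification does not fix a particular basis ordering:

```python
import numpy as np

def zernike_phase(rho, theta, coeffs):
    """Pupil-phase model as a short, hand-coded Zernike expansion.

    rho, theta: polar coordinates over the unit pupil (numpy arrays).
    coeffs: dict mapping assumed term names to coefficients (radians);
    only a few well-known low-order terms are included here, rather
    than a full Zernike library.
    """
    terms = {
        "defocus":   2 * rho**2 - 1,
        "astig_0":   rho**2 * np.cos(2 * theta),
        "coma_x":    (3 * rho**3 - 2 * rho) * np.cos(theta),
        "spherical": 6 * rho**4 - 6 * rho**2 + 1,
    }
    phase = np.zeros_like(rho)
    for name, c in coeffs.items():
        phase += c * terms[name]
    return phase
```

With such a model, the unknown wavefront is reduced to the handful of coefficients in `coeffs`, which is the finite parameter set the passage refers to.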
Under the assumption of an adequate parametric model, the phase-retrieval problem can be solved by finding the parameter values which, when put through the forward Fourier Transform, produce the measured image data. Conceptually, the correct set of parameter values can be obtained via an exhaustive search. In practice, however, such an approach may not be feasible, and efficient strategies need to be developed for searching a parameter space.
We have now discovered a different approach to developing an efficient search strategy. In sharp contrast to known techniques for addressing the phase retrieval problem on its own terms, namely, searching a parameter space for the phase and/or estimating a phase and then iteratively improving the estimate, we provide a method and system for wavefront reconstruction from the image plane intensity distribution. Accordingly, in a first aspect, the present invention provides a novel method to be employed in conjunction with a radiation system, the radiation system comprising:
(1) at least one source of radiation;
(2) at least one imaging device for imaging at least a portion of the radiation emitted by the or each source onto at least one image plane;
the or each imaging device functioning to produce at the image plane an intensity distribution profile; and
(3) a detector array located at the image plane for realizing the intensity distribution profile;
the method comprising the steps of:
(a) defining a feature vector derived from the intensity distribution profile; and
(b) employing an adaptive computational architecture for associating the feature vector with at least one identifying characteristic of a selected imaging device.
One advantage of the novel method is that it can be employed to provide real time corrections of the radiation system, based upon the or each identifying characteristic of the selected imaging device. For example, the method can determine that the imaging device, say a mirror, has an identified characteristic, such as a given amount of trefoil. Accordingly, the force actuators disclosed above may be used to provide real time corrections to the mirror.
Another important advantage of the method is that the adaptive computational architecture has an inherent learning capability. That is, the association between an arbitrary feature vector derived from the intensity distribution profile, and the identified characteristics of the imaging device, may be obtained by training from previously analyzed examples. As discussed below, the training may advantageously include, e.g., an employment of neural networks and/or statistical algorithms and methods, for associating a feature vector with a known characteristic of the imaging device. The training capability, in turn, implies that the novel method can be exploited to accommodate and correct critical changes in the radiation system, including new characteristics of an imaging device.
The first method step of defining a feature vector derived from the intensity distribution profile subsumes the following general considerations. In one embodiment, the intensity distribution profile recorded by imaging a point source on a charge coupled device, or other recording media, is the point spread function. A feature vector comprising spatial moments, as described in detail below, may be derived from the point spread function. Alternatively, the intensity distribution profile recorded from an extended source may be used to derive the feature vector comprising power spectral components.
The second method step of employing an adaptive computational architecture for associating the feature vector with at least one identifying characteristic of the imaging device subsumes the following general considerations. The adaptive computational architecture preferably is nonlinear, although it may be linear. As indicated briefly above, the architecture provides a learning capability, and this may be realized by statistical learning algorithms, and/or a preferred employment of neural network technology, as discussed in detail below.
In a second aspect, the present invention provides a novel radiation system comprising:
(1) at least one source of radiation;
(2) at least one imaging device for imaging at least a portion of the radiation emitted by the or each source onto at least one image plane;
the or each imaging device functioning to produce at the image plane an intensity distribution profile;
(3) a detector array located at the image plane for realizing the intensity distribution profile;
(4) means for defining a feature vector derived from the intensity distribution profile; and
(5) means for associating the feature vector with at least one identifying characteristic of a selected imaging device.
We now briefly address some preferred features of the novel radiation system.
The source of radiation preferably comprises a single point source emitting monochromatic radiation. It may, however, also be an extended source. The source of radiation may subsume the entire electromagnetic spectrum, although preferably, the source of radiation is defined by the optical band of frequencies. The imaging device may comprise, for example, a lens, a mirror or a spatial grating, as appropriate. The detector array preferably comprises a matrix comprising charge coupled devices, although, for example, an array of photodiodes may be alternatively employed. The two means elements (4) and (5) may be realized by software comprising a series of Fortran subroutines and functions. An illustrative software program is set forth below in an Example.
The invention is illustrated in the accompanying drawings in which:
FIG. 1 shows a radiation system of the present invention;
FIG. 2 shows a neural network for employment with the method and radiation system of the present invention; and
FIG. 3 shows a multidimensional feature vector space used in a statistical learning algorithm of the present invention.
Attention is now directed to FIG. 1, which shows an optical system 10 of the present invention. The system 10 includes a single point source 12 emitting monochromatic optical radiation. In particular, the source 12 is a continuous-wave helium-neon gas laser at 632.8 nm. The system 10 also includes a lens 14 located at the system 10 entrance pupil. The lens 14 functions to image the optical radiation at a far field focal plane 16. The focal plane 16 is located at a plane orthogonal to an optical axis 18 defined by the source 12 and lens 14. A detector array 20 comprising a 64×64 matrix of conventional charge-coupled devices is located at the focal plane 16. The F# of the system 10, as conventionally defined, is approximately 18.
During system 10 operation, the radiation that is imaged by the lens 14 is detected by the detector 20. The output of the detector 20 provides an intensity distribution profile, here given as a point spread function defined by the equation (1)
ρ = ρ(x, y)    (1)
The point spread function (1) is inputted to a Microvax II computer 22 along a line 24. The computer 22 may be programmed to modify the point spread function in a manner disclosed immediately below, and in order to define a feature vector and simulate an adaptive computational architecture.
Preferably, the point spread function may be modified in several ways, in preparation for the first step of the method of the invention, namely, defining a feature vector derived from the point spread function. Accordingly, the function ρ may be modified by a weighting function, and may be windowed, in order to account for the fact that the detector array 20 is of limited extent.
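One possible form of the weighting and windowing step is sketched below. The Hann window is an assumption made for illustration; the specification does not fix a particular weighting function:

```python
import numpy as np

def window_psf(psf):
    """Apply a separable Hann window to taper the sampled PSF toward
    the edges of the finite detector array, then renormalize.

    This accounts for the limited extent of the detector array: values
    near the array boundary are smoothly attenuated rather than
    truncated abruptly.
    """
    n_rows, n_cols = psf.shape
    w = np.outer(np.hanning(n_rows), np.hanning(n_cols))
    windowed = psf * w
    return windowed / windowed.sum()
```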
As summarized above, the method of the present invention includes a first step of defining the feature vector. The feature vector, in turn, may be derived from the (modified) point spread function. In particular, the feature vector is preferably defined as one derived by way of another intermediary, namely, a multidimensional moment vector. The moment vector, in turn, subsumes the point spread function as shown mathematically in equation (2) below.
The usefulness of the moment vector as a feature vector is at least twofold. First, when an aberration type exhibited by the lens 14 is substantially isolated, for example, only hexafoil, the point spread function exhibits certain characteristic symmetries, independent of the amount of the aberration. These symmetries include, for example, invariance to translation, rotation and scale, etc. This symmetry-invariance fact, in turn, may be exploited by moment vectors, to the end of discriminating among different point spread functions, and thereby identifying a particular aberration. In more complex cases, for example when the lens 14 introduces mixtures of several aberration types, the basic symmetry patterns in the point spread function may be vitiated. Nevertheless, even for the complex cases, moment vectors are useful since they reduce the dimensionality of the recognition problem, i.e., the efficient discrimination among different point spread functions to the end of identifying the salient aberrations. Moreover, the use of moment vectors is preferred, since they positively address important issues including numerical stability of the discrimination estimates, changes in overall system image intensity, amenability to parallel processing, and favorable sensitivity to noise and higher order aberrations.
The moment vectors preferably are defined by the following equation (2):
M_{pq} = ∫∫ x^{p} y^{q} ρ_{n}(x, y) dx dy    (2)
In accordance with equation (2) and preferred aspects of the method of the invention, a set of n lenses 14, each having known aberrations, may be individually inserted into the optical system 10. This action produces n known point spread functions. The n known point spread functions ρ_{n} (x,y) may each be individually developed by way of the moment vector equation (2), to produce n feature vectors. Thus, for any one known point spread function, for example, ρ_{1} (x,y), a column feature vector FV_{1} can be calculated from equation (2), of the form of a column vector whose elements are the moments M_{pq}:

FV_{1} = [M_{00}, M_{01}, M_{10}, M_{11}, …]^{T}    (3)
In a similar manner, a second known point spread function ρ_{2} (x,y) can be developed by way of equation (2), to form a second column feature vector FV_{2}. This procedure may be repeated to form an exhaustive array of column feature vectors corresponding to n point spread functions. Further instruction on these points is provided by Ming Kuei Hu, "Visual Pattern Recognition By Moment Invariants", IRE Transactions On Information Theory, pp. 179-187, February 1962; and Michael Reed Teague, "Image Analysis Via The General Theory Of Moments", J. Opt. Soc. Am., Vol. 70, No. 8, August 1980, pp. 920-930.
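A discrete analogue of the moment integral of equation (2), applied to a sampled point spread function, might look like the following sketch. The moment ordering, the maximum order, and the normalization by M_{00} are illustrative assumptions:

```python
import numpy as np

def moment_vector(psf, max_order=3):
    """Discrete analogue of equation (2): M_pq = sum of x^p y^q rho(x, y),
    collected for all p + q <= max_order into one feature vector.

    Coordinates are centered on the array; dividing by M_00 makes the
    features insensitive to overall image intensity.
    """
    n_rows, n_cols = psf.shape
    y = np.arange(n_rows) - (n_rows - 1) / 2.0
    x = np.arange(n_cols) - (n_cols - 1) / 2.0
    X, Y = np.meshgrid(x, y)
    m00 = psf.sum()
    features = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            features.append((X**p * Y**q * psf).sum() / m00)
    return np.array(features)
```

Applying `moment_vector` to each of the n known point spread functions yields the n column feature vectors FV_{1} through FV_{n} described above.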
The elements of each column feature vector (equation 3) correspond to a (p+q) phase space. In this embodiment, the feature vector being used is the list of moments derived from the point spread function. The adaptive computational architecture of the present invention calculates from the list of moments the corresponding sets of aberrations, e.g., hexafoil, or power+coma.
We now summarize the above materials on preferred aspects of the first step of the present method, in preparation for disclosing the details of the second step of the method:
(1) a set of n different lenses 14 having known aberrations may be serially inserted into the optical system 10, to yield n unique point spread functions (Equation 1);
(2) each of the n point spread functions corresponds to a known set of aberrations; and
(3) each known point spread function may be expressed as a unique column feature vector, equation (3).
Having summarized the first step of the present method in its preferred aspects, we segue to the second step, which requires employing an adaptive computational architecture for associating the feature vector with at least one identifying characteristic of the imaging device, here, lens 14. A preferred such architecture comprises a three-layered nonlinear neural network, of the type shown in FIG. 2. FIG. 2 shows a neural network 26 which comprises a plurality of column input nodes 28, for inputting the column feature vector; a plurality of nonlinear column hidden nodes 30, each of which accepts adjustable, weighted input signals w_{ij} from each of the input nodes 28; a plurality of column output nodes 32, each of which accepts adjustable, weighted input signals w_{kl} from each of the hidden nodes 30; and, finally, a plurality of column target nodes 34 which are individually mated to each of the output nodes 32.
The functioning of the neural network 26 can be explained in overview by understanding its operation during a preliminary "learning" mode, and a subsequent real-time processing mode.
In the preliminary mode, the neural network 26 functions as an "offline" training or learning vehicle, to build up a recognizable "vocabulary" between, on the one hand, known aberration feature vectors as inputs to the column input nodes 28, and on the other hand, known target node 34 aberration parameters. The "training" or "learning" per se is accomplished by iteratively adjusting the weighting factors w_{ij} and w_{kl}, to the end that properly adjusted weighting factors ensure that approximate aberration types and magnitudes in the output nodes 32 correspond (within an error criterion) to the known target node 34 aberration parameters. In particular, the neural network 26 works "backward", from w_{kl} to w_{ij}, hence the expression back propagation, to "learn" the required new weights. Moreover, the "learning" is cumulative, in the sense that weights are learned for each of a succession of input feature vectors, and each successive input feature vector adds cumulatively to the codified repository of the weighting factors. (In this manner, by the way, the neural network 26 can theoretically "learn" the entire vocabulary of Zernike aberrations.)
In the second mode, the neural network 26 provides a real-time processing capability. The neural network 26 accepts an arbitrary input feature vector as an input to the column nodes 28, and computes an output node 32 column vector that characterizes a heretofore unknown intensity distribution profile, e.g., a particular aberration magnitude. Here, in the real-time processing mode, the cumulatively codified weighting factors, built up during the learning mode, provide a basis against which the arbitrary input feature vector may be interrogated.
With this overview of the twofold functioning of the neural network 26 in mind, we now turn to particular details of its operation. Accordingly, in the initial learning mode, a first feature vector corresponding to a known aberration, of the form (FV_{1}) expressed by the column vector equation (3) above, is provided as an input to the column input nodes 28, so that each element of the column vector equation (3) is placed on a one-to-one basis with the column input nodes 28. Next, each element placed in the column input nodes 28 is operated upon by the adjustable weighting function w_{ij}, as preparation for input to the appropriate hidden node column 30. Initially, the weights are preferably arbitrary scalars, selected randomly in a small interval around zero, e.g., in an interval from -0.01 to 0.01.
The node value B_{i} on the i^{th} second layer 30 node is computed by:
B_{i} = f(Σ_{j} w_{ij} z_{j})  (4)
where z_{j} is the contents of the first layer 28 node number j (equal to the j^{th} component of the input feature vector, except that z_{0} equals 1), and where f is an "activation function" preferably of the form:
f(x) = 1/(1 + exp(-x))  (5)
The computed values B_{i} represent the output of the hidden node column 30. These values are operated on by the adjustable weights w_{kl} connecting layer 30 with layer 32, in the same manner as above. The second weighting operation produces the output parameters stored in the column output nodes 32.
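The forward pass just described, from input nodes 28 through hidden nodes 30 to output nodes 32, can be sketched as follows. This is a hedged sketch: equation (5) is taken to be the standard sigmoid with a negative exponent, a bias node is assumed for the hidden layer (by analogy with z_0 = 1 in the first layer), and the output layer is assumed to use the same activation, none of which the text fixes explicitly; all function names are illustrative.

```python
import numpy as np

def sigmoid(x):
    # activation function of equation (5), assuming the standard
    # sigmoid form f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def forward(fv, w_ij, w_kl):
    """One forward pass through the three-layer network of FIG. 2.

    w_ij has shape (hidden, len(fv)+1); w_kl has shape (out, hidden+1).
    """
    z = np.concatenate(([1.0], fv))   # z_0 = 1, then the feature vector
    b = sigmoid(w_ij @ z)             # hidden node values, equation (4)
    b = np.concatenate(([1.0], b))    # assumed bias node for layer 30
    return sigmoid(w_kl @ b)          # output node values (layer 32)
```

The returned vector corresponds to the output parameters stored in the column output nodes 32.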
Since we are in the learning mode, the output parameters are compared to the known aberration parameters stored in the target nodes 34. If the output parameters are equal to the aberration parameters, within an error criterion, then it is understood that the arbitrarily selected weights are in fact correct. On the other hand, if the output parameters are not equal to the target aberration parameters, within the error criterion, then it is understood that the arbitrarily selected weights are not optimal, and must be adjusted by the backward propagation process, for further evaluation of output parameter/aberration parameter correspondence. This process is continued, that is, new weights are learned, until the indicated correspondence is within the error criterion.
The learning mode is initialized with the processing of a first feature vector FV_{1}. With this step completed, the process may be repeated anew for a second feature vector FV_{2}, and then again for FV_{3}, FV_{4}, . . . , FV_{n}. Note that each new learning step not only learns, in and of itself, but also carries over the learning built up by all the previous steps. Note further that the learning process can be enhanced by extending the length of the feature vector, by way of equation (3) above, and routinely extending the neural network 26 columns 28-34. By enhancement, we mean that an ever larger number of the Zernike aberrations can be learned by the neural network 26.
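The iterative weight adjustment by back propagation can be sketched as below. The patent does not specify the optimization details, so this is a sketch under assumptions: plain per-pattern gradient descent on squared error, sigmoid activations throughout, and a sum-of-squared-errors stopping rule standing in for the "error criterion"; all names and parameters are illustrative.

```python
import numpy as np

def train(features, targets, n_hidden=8, lr=0.5, epochs=2000, tol=1e-3, seed=0):
    """Iteratively adjust w_ij and w_kl by back propagation until the
    output nodes match the target nodes within an error criterion."""
    rng = np.random.default_rng(seed)
    d, o = features.shape[1], targets.shape[1]
    # small random initial weights, e.g. in the interval -0.01 to 0.01
    w_ij = rng.uniform(-0.01, 0.01, size=(n_hidden, d + 1))
    w_kl = rng.uniform(-0.01, 0.01, size=(o, n_hidden + 1))
    f = lambda x: 1.0 / (1.0 + np.exp(-x))        # equation (5), standard sigmoid assumed
    for _ in range(epochs):
        err = 0.0
        for fv, t in zip(features, targets):
            z = np.concatenate(([1.0], fv))        # z_0 = 1
            b = f(w_ij @ z)                        # hidden values, equation (4)
            bb = np.concatenate(([1.0], b))
            out = f(w_kl @ bb)                     # output node values
            delta_o = (out - t) * out * (1 - out)  # output-layer error signal
            delta_h = (w_kl[:, 1:].T @ delta_o) * b * (1 - b)
            w_kl -= lr * np.outer(delta_o, bb)     # adjust w_kl first ...
            w_ij -= lr * np.outer(delta_h, z)      # ... then propagate back to w_ij
            err += float(((out - t) ** 2).sum())
        if err < tol:                              # error criterion met
            break
    return w_ij, w_kl
```

Each call carries the previously learned weights forward, mirroring the cumulative character of the learning described above.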
With the learning mode completed, the method of the present invention operates in a real-time processing mode. As indicated above, this includes deriving a feature vector, preferably by way of a moment analysis of an arbitrary point spread function. The PSF numerical parameters can be ascertained by way of the detector array 20, but it is presently unknown what, if any, aberrations may have been introduced into the optical system 10 by way of the lens 14, and thereby become embedded in the PSF.
According to preferred aspects of the method, the aberrations may be ascertained by expanding the arbitrary PSF by way of the moment equation (2) above. This results in a feature vector of the form of equation (3) above. The prepared feature vector next becomes an input to the neural network 26. The neural network 26 processes the instant feature vector in a manner entirely analogous to the processing of a known feature vector in the training mode. One difference, however, between the two processing procedures, is that in the processing of the instant feature vector, the output parameter located in the column vector 32 nodes characterizes the sought for information on what aberration may be embedded in the point spread function. Accordingly, no present use is made of the plurality of column target nodes 34.
It is disclosed above that the adaptive computational architecture, of which the FIG. 2 neural network 26 is an important type, may also be realized by a statistical learning algorithm. We now turn our attention to this second type of architecture with the following overview.
The statistical learning algorithm (SLA) shares with the neural net approach the preferred employment of feature vectors derived from the moment equation (2). It differs from the neural network approach in that it purports to classify an arbitrary feature vector by whether or not it falls into known discrete regions in a multidimensional space. On the one hand, if the arbitrary feature vector is determined to be located in a particular region, then the sought for information on how the arbitrary feature vector is to be characterized becomes, in fact, known. On the other hand, if the arbitrary feature vector is determined to fall outside a region, statistical criteria provided below may be used to estimate the region with which the arbitrary feature vector is best associated. The statistical learning algorithm "learns" by giving ever better definition to the boundaries or envelope of each of the discrete regions. With this overview of the statistical learning algorithm in mind, we now elaborate firstly on a preliminary learning mode, and secondly on a subsequent real-time processing mode.
To this end, attention is directed to FIG. 3, which shows a multidimensional M_{pq} feature vector space. In particular, each axis in the multidimensional space is dedicated to a moment parameter defined by way of Equations (2) and (3) above. FIG. 3 also shows a number of discrete and independent regions R_{i} embedded in the multidimensional M_{pq} feature vector space. Each region is dedicated to a selected Zernike aberration, for example, power, or coma, or astigmatism, or combinations of Zernike aberrations, like trefoil and astigmatism.
The learning mode, as indicated above, includes giving ever better definition to the boundaries or envelope of each of the discrete regions R_{i}. This, in turn, may be effected in the following way. First, a known imaging device, say a lens 14 of the optical system 10 of FIG. 1, is inserted into the optical system 10 to produce a known point spread function, e.g., a PSF corresponding to power. Then, the known point spread function is developed by way of Equations (2) and (3) above, to produce a known feature vector FV_{p} located in the multidimensional space M_{pq}. The tip of this feature vector FV_{p} locates a power region. The boundaries or envelope of the power region are given ever greater definition by repeating the immediately foregoing analysis, but this time, by adding noise components to the moment analysis (shown as asterisks in FIG. 3). Statistical algorithms may be advantageously employed to deduce appropriate noise components. The addition of the included noise components around the known feature vector FV_{p}, effects the further articulation or definition of the power region. It is to be noted that the region is defined as to aberration, as well as to magnitude, e.g., aberration=power; magnitude=0.50 wave.
The learning mode may be continued by essentially repeating the above procedure, but this time, inserting a second lens into the optical system 10, to produce a second known point spread function, e.g., a PSF corresponding to astigmatism. Briefly, the repeat procedure involves developing a second feature vector FV_{A} by way of Equations (2) and (3) above, to locate an astigmatism region. The boundaries or envelope of the astigmatism region are given ever greater definition by the statistical noise procedures outlined above. This results, finally, in an astigmatism region defined as to content, as well as magnitude.
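The articulation of a region from noise-perturbed copies of a known feature vector can be sketched as below. The patent leaves the choice of noise model to unspecified "statistical algorithms", so independent Gaussian perturbations are assumed here purely for illustration; the function name and parameters are hypothetical.

```python
import numpy as np

def learn_region(known_fv, noise_sigma=0.01, n_samples=500, seed=0):
    """Give definition to a region R_i by scattering noise-perturbed
    copies of a known feature vector (the asterisks of FIG. 3) and
    collecting the statistics that describe the resulting cluster."""
    rng = np.random.default_rng(seed)
    samples = known_fv + rng.normal(0.0, noise_sigma,
                                    size=(n_samples, known_fv.size))
    mu = samples.mean(axis=0)               # mean vector of the region
    sigma = np.cov(samples, rowvar=False)   # covariance matrix of the region
    return mu, sigma
```

Repeating this for each known aberration (power, astigmatism, and so on) yields one (mean, covariance) pair per region in the M_{pq} space.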
The learning mode may be further continued in this manner, mutatis mutandis, to (theoretically) learn any desired set of Zernike aberrations which are set off by independent and discrete regions, including combinations of aberrations and their magnitudes.
The statistical learning algorithm, in its subsequent real-time processing mode, works in the following way. An unknown imaging device, e.g., a lens of FIG. 1, is inserted into the optical system 10 to produce an arbitrary point spread function. As in the case of the neural network 26 above, the point spread function numerical parameters can be ascertained by way of the detector array 20, but it is presently unknown what, if any, aberrations may have been introduced into the optical system 10 by way of the instant lens, and thereby become embedded in the PSF.
According to preferred aspects of the method, the aberrations may be ascertained by expanding the arbitrary PSF by way of the moment equations (2) and (3) above. This results in a feature vector located in the multidimensional feature vector space M_{pq}. If the feature vector is located in a particular region defined by the above learning mode, then the sought for information on how the instant feature vector is to be characterized becomes an ascertained fact. On the other hand, if the instant feature vector falls outside of the learned regions, the following preferred statistical criteria may be used to determine the region with which the instant feature vector is best associated.
For example, consider the instant feature vector FV_{I} in FIG. 3. We are required to define a probability model for describing the learned regions, and for associating the instant feature vector FV_{I} with respect to the learned regions.
A preferred probability model (1) uses Bayesian inference to assess the probability that the instant feature vector belongs to a particular region; (2) assumes that the individual regions are describable by multivariate Gaussian functions; and (3) assumes that the "prior probabilities" of the region classes are equal. These three points are now developed.
Assume that the regions are describable by welldefined probability density functions p_{j} (f), which give the conditional probability that the feature vector of an object is f, given that the object belongs to class k_{j}.
Then, given that an object has feature vector f, an approach to the classification problem is to determine the probability that the object belongs to each class k_{j}, by using Bayes's law. The resulting probabilities can then be compared to determine the most likely classification of the object.
The application of Bayes's law requires that prior probabilities pr_{j} exist, giving the distribution of objects among the classes, unconditioned by knowledge of the feature vectors of the objects.
In more detail, the conditional probability p(j|f) that the object is in class k_{j}, given that it has feature vector f, is given by:
p(j|f) = p_{j}(f) pr_{j} / (p_{1}(f) pr_{1} + p_{2}(f) pr_{2} + . . . + p_{k}(f) pr_{k})
using Bayes's law, where pr_{j} is the prior probability that the object falls within class k_{j}.
The application of this procedure to classifying an unknown object O is to determine its feature vector f, determine the associated conditional probability density values p_{j}(f) (by substitution into the analytical form for p_{j}), use the formula above to compute the p(j|f)'s, and compare these answers with each other to determine the most likely classification of O.
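The Bayes's-law normalization of this procedure can be sketched directly. This is an illustrative fragment only; the density callables stand in for whatever analytical forms p_{j} take, and the names are hypothetical.

```python
import numpy as np

def posterior(f, densities, priors):
    """Compute p(j|f) for each class k_j by Bayes's law: each product
    p_j(f) * pr_j is normalized by the sum of such products over all
    classes, as in the formula above.

    densities: list of callables, each returning p_j(f) for a given f.
    priors:    list of prior probabilities pr_j.
    """
    numer = np.array([p_j(f) * pr_j
                      for p_j, pr_j in zip(densities, priors)])
    return numer / numer.sum()   # the p(j|f)'s, summing to 1
```

Comparing the entries of the returned vector gives the most likely classification of the object.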
To carry through this procedure, the conditional probability densities p_{j} must be known, or at least approximated. The conditional probabilities p_{j} specify the probability distribution associated with the class k_{j} ; these determine the shape of the cluster or region associated with k_{j} in the feature vector space.
A very common procedure is to approximate each p_{j} as a multidimensional Gaussian distribution; that is, to assume the form:
p_{j}(f) = (2π)^{-d/2} |Σ_{j}|^{-1/2} exp[-(1/2)(f-μ_{j})^{t} Σ_{j}^{-1}(f-μ_{j})],
where μ_{j} and Σ_{j} are the (vector) mean and covariance matrix of the distribution and d is the dimension of the feature vector space. Here |Σ_{j}| is the determinant of Σ_{j}, and a superscripted t denotes the transpose of a (column) vector. (By definition μ_{j} is the expectation value of f, and Σ_{j} is the expectation value of (f-μ_{j})(f-μ_{j})^{t}, for f drawn from the distribution.) Thus, the approximation of p_{j} as a multivariate Gaussian permits it to be specified by the set of d components of μ_{j} plus the set of d(d+1)/2 independent components of Σ_{j}.
This assumption models each region as having an ellipsoid shape in feature vector space. The eigenvectors of the covariance matrix correspond to the principal axes of the ellipsoid. The shape and orientation of the ellipsoids are in this way determined from the covariance matrix, while the mean vector of the region determines the position of the ellipsoid.
Use of Bayes's rule requires, in addition to the conditional probability densities, a specification of the prior probabilities pr_{j}. The simplest assumption is that all the prior probabilities are equal, with pr_{j} = 1/k. Performance can, of course, be improved if more accurate estimates of the prior probabilities are known. In the current application, the equality of prior probabilities is assumed.
The conditional probability p(j|f) is precisely the probability that the class associated with a specified feature vector f is k_{j}. Once the means, covariance matrices, and prior probabilities are specified, the reasoning given in the preceding section shows that this probability is easily computable. This conditional probability p(j|f) can be thought of as a discriminant function α_{j}(f) associated with the class k_{j}, in that the α_{j}(f) that is maximal determines the classification of f.
In practice it is usually convenient to apply some monotonic function to p(j|f) to define the discriminant function. In particular, by taking the natural logarithm of p(j|f) the exponential in the expression for p_{j}(f) can be avoided. Also, since the denominator of the expression for p(j|f) is independent of j, the denominator can be ignored in defining the discriminant function. Thus, a suitable discriminant function is defined as:
α_{j}(f) = ln(p_{j}(f)) + ln(pr_{j}),
and, if all the prior probabilities are assumed to be equal, even the ln (pr_{j}) term can be eliminated, giving α_{j} (f)=ln (p_{j} (f)), or:
α_{j}(f) = -ln((2π)^{d/2} |Σ_{j}|^{1/2}) - (1/2)(f-μ_{j})^{t} Σ_{j}^{-1}(f-μ_{j}).
This form has the advantage of eliminating some unnecessary calculations, and of avoiding the danger of floating point underflows in the classification calculation.
This Example includes a listing of a multidimensional momentbased classification software program developed in accordance with the method of the present invention. The software consisted of a series of FORTRAN subroutines and functions. The software made use of a set of 17 data files, which are described here as well.
The routines were assumed to be called from a general main program. All communication with the supplied subroutines was through the argument list of the single subroutine CLASSIFICATION, presented here first, which calls other subprograms as needed.
CLASSIFICATION takes six arguments, as follows:
IMAGE_ARRAY_INT - a SIZE by SIZE 4-byte integer array containing the point spread function to be analyzed.
DX - distance in inches between detector elements (the spacing is here assumed to be the same in both directions). The nominal value for DX is 2.6672×10^{-4} (inches per sample).
WAVELENGTH - wavelength, in inches. The nominal value in nanometers is 632.8 nm, which converts to 2.4920×10^{-5} inches.
FNUM - F number of the system. The nominal value is 18.0.
SIZE - linear dimension of IMAGE_ARRAY_INT. The nominal value is 64.
LUN - an available FORTRAN logical unit number.
The logical unit number was provided in order that the program could read the supplied data files. These data files contained the statistics that describe the multidimensional regions (clusters). The advantage of providing this data in data files, rather than hard-coded in the software, was that the number and type of aberration classes could be easily modified by changing the data files, without the necessity of modifying the code. CLASSIFICATION currently contains a hard-coded reference to the first of these data files, CASES.DAT, which contains a list of the names of the other data files used.
The data files were as follows:
CASES.DAT - Number of classes, and names of the data files (containing the class statistics) to be read by the software.
CASE_0.DAT - Statistics for a diffraction-limited point spread function.
CASE_1.DAT - Statistics for 1 wave trefoil.
CASE_2.DAT - Statistics for 0.25 wave astigmatism.
CASE_3.DAT - Statistics for 0.50 wave astigmatism.
CASE_4.DAT - Statistics for 0.75 wave astigmatism.
CASE_5.DAT - Statistics for 0.25 wave coma.
CASE_6.DAT - Statistics for 0.50 wave coma.
CASE_7.DAT - Statistics for 0.75 wave coma.
CASE_8.DAT - Statistics for 0.50 wave trefoil with 0.50 wave astigmatism.
CASE_9.DAT - Statistics for 0.50 wave trefoil with 0.75 wave astigmatism.
CASE_10.DAT - Statistics for 0.50 wave coma with 0.50 wave astigmatism.
CASE_11.DAT - Statistics for 0.50 wave coma with 0.75 wave astigmatism.
CASE_12.DAT - Statistics for 0.50 wave trefoil with 0.50 wave coma.
CASE_13.DAT - Statistics for 0.50 wave trefoil with 0.75 wave coma.
CASE_14.DAT - Statistics for 0.75 wave trefoil with 0.75 wave coma.
CASE_15.DAT - Statistics for 0.75 wave astigmatism with 0.75 wave trefoil.
Each of the data files CASE_{} 0.DAT through CASE_{} 15.DAT was prepared from a set of simulated sample point spread functions, with added random defocus and noise. Each data file contained a one line description of the aberration type, the length of the feature vector being used (12), a list of the mean values of each feature, the determinant of the covariance matrix, and the elements of the (12 by 12) inverse covariance matrix. Because the first invariant moment would be identically 1 by definition (because of the normalization), the first invariant moment was replaced with the entropic size, normalized by dividing by the wavelength times the F number.
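An in-memory representation of one such CASE file record can be sketched as follows. This is a heavily hedged sketch: the on-disk layout is only partially specified above (a one-line description, the feature vector length, the mean values, the covariance determinant, and the inverse covariance elements), so whitespace-separated free-format numbers, plausible for FORTRAN list-directed I/O, are assumed; the names are hypothetical.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ClassStatistics:
    """In-memory form of one CASE_n.DAT record, as described above."""
    description: str       # one-line description of the aberration type
    mean: np.ndarray       # mean value of each feature (length 12 here)
    determinant: float     # determinant of the covariance matrix
    inv_cov: np.ndarray    # inverse covariance matrix (12 by 12 here)

def read_case_file(path):
    with open(path) as fh:
        description = fh.readline().strip()
        # assumption: remaining values are whitespace-separated numbers
        numbers = np.array(fh.read().split(), dtype=float)
    n = int(numbers[0])                  # feature vector length
    mean = numbers[1:1 + n]
    determinant = float(numbers[1 + n])
    inv_cov = numbers[2 + n:2 + n + n * n].reshape(n, n)
    return ClassStatistics(description, mean, determinant, inv_cov)
```

Keeping the class statistics in such files, rather than in code, preserves the stated advantage that classes can be added or changed without modifying the software.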
The subroutine and functions listed here were all included in a single file, CLASSIFIER.FOR, on a tape. The data files were not listed here, but were included on the same tape, with names and content as described above. ##SPC1##
When trained against eight mixture cases with random defocus (standard deviation of 1/20 wave) and small amounts of other aberrations, the method of the present invention correctly classified 294 of 296 cases.