METHOD AND DEVICE FOR PROCESSING DATA
1. A data processing device configured to process measurement spectral data by using a multivariate analysis, the data processing device comprising:
a determination unit configured to, based on the measurement spectral data, determine sampling intervals to be used in the multivariate analysis;
a data group obtaining unit configured to obtain a spectral data group of the sampling intervals determined by the determination unit; and
a multivariate analysis unit configured to carry out the multivariate analysis by using the spectral data group obtained by the data group obtaining unit.
High-speed processing of measurement spectral data by using a multivariate analysis is achieved. This is accomplished by determining sampling intervals or sampling data to be used in the multivariate analysis, obtaining a spectral data group of the determined sampling intervals, and carrying out the multivariate analysis using the obtained spectral data group.
11. A data processing method for processing measurement spectral data by using a multivariate analysis, the data processing method comprising:
determining, based on the measurement spectral data, sampling intervals to be used in the multivariate analysis; obtaining a spectral data group of the determined sampling intervals; and carrying out the multivariate analysis by using the obtained spectral data group.
20. A computer readable storage medium storing a program that causes a computer to execute:
determining, based on measurement spectral data, sampling intervals to be used in a multivariate analysis; obtaining a spectral data group of the determined sampling intervals; and carrying out the multivariate analysis by using the obtained spectral data group.
Aspects of the present invention generally relate to methods and devices for processing measurement spectral data obtained by measuring biological tissue, and in particular relate to a method and a device for processing image data for a multivariate analysis.
2. Description of the Related Art
Conventionally, biological tissue has been observed with a microscope, and constituent substances or contained substances associated with the observed biological tissue have been visualized. For such visualization, mass spectrometry or Raman spectroscopy is employed. As a measurement spectrum, a mass spectrum, an ultraviolet, visible, or infrared optical spectrum, and so on are used. With such a measuring method, information on a spatial distribution of peak values in the measurement spectrum associated with the measured substance can be obtained, and thus a spatial distribution of the substance contained in the biological tissue associated with the measurement spectrum can be obtained.
With mass spectrometry, the time of flight of an electrically charged ion depends on mass m of the ion and an electric charge z. On the basis of the above, the ion can be identified, and a mass spectrum at each point on the sample can be obtained.
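As a point of reference, the dependence on m and z can be written out for an idealized linear time-of-flight analyzer. This is an assumption made for illustration, since the instrument geometry is not specified here. The symbols L (drift length), U (accelerating voltage), v (ion velocity), and e (elementary charge) are introduced only for this sketch.

```latex
z e U = \tfrac{1}{2} m v^{2}
\quad\Longrightarrow\quad
t = \frac{L}{v} = L \sqrt{\frac{m}{2\, z\, e\, U}}
```

Under these assumptions the time of flight t scales with the square root of m/z, which is why a measured t can be mapped back to a mass-to-charge ratio.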
With Raman spectroscopy, a light source irradiates a substance with monochromatic laser light, and generated Raman scattered light is detected with a spectrometer or an interferometer so as to obtain a Raman spectrum. A difference between the frequency of the Raman scattered light and the frequency of the incident light (i.e., Raman shift) takes a value unique to the structure of the substance, and thus a Raman spectrum unique to the measured substance can be obtained.
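The Raman shift described above is usually reported in wavenumbers (cm−1). The following is a minimal sketch of that conversion, assuming that the incident and scattered wavelengths are given in nanometers and that a Stokes shift is counted as positive. The function name and parameters are hypothetical.

```python
def raman_shift_cm1(incident_nm, scattered_nm):
    """Return the Raman shift in cm^-1 from two wavelengths given in nm.

    Assumed sign convention: positive for Stokes scattering, in which the
    scattered light has a longer wavelength than the incident light.
    """
    # 1 cm = 1e7 nm, so a wavelength in nm corresponds to 1e7 / nm in cm^-1.
    incident_wavenumber = 1e7 / incident_nm
    scattered_wavenumber = 1e7 / scattered_nm
    return incident_wavenumber - scattered_wavenumber


if __name__ == "__main__":
    # Illustrative values only, not taken from any measurement.
    print(raman_shift_cm1(500.0, 625.0))
```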
To date, a multivariate analysis, in which intensity information of a broad wavelength band is handled as a variate, has been employed to analyze measurement spectral data. According to a principal component analysis (PCA) or an independent component analysis (ICA), which are types of the multivariate analysis, even with a complicated spectrum in which vibration spectra or band structures of components contained in a biological sample are superimposed on one another, the chemical state of the biological sample can be classified and measured. As an example, according to Japanese Patent Laid-Open No. 2011-174906, a PCA is carried out on an optical spectrum of each pixel to obtain a distribution of principal component scores, and thus morphologic information or composition of a biological sample is examined.
When a PCA is carried out, a sample variance-covariance matrix is obtained, and an eigenvalue and an eigenvector of the sample variance-covariance matrix are then obtained. A sample variance-covariance matrix, however, contains data having a size of a spectral number by a spectral number. Thus, when a spectral number used in an analysis is large or when a large number of pieces of image data are to be handled, the data amount increases, disadvantageously leading to an increased processing time.
Aspects of the present invention are generally directed to providing a method and a device that enable high-speed data processing by resampling a spectrum while retaining necessary information.
According to an aspect of the present invention, a data processing device is configured to process measurement spectral data by using a multivariate analysis. The data processing device includes a determination unit configured to, based on the measurement spectral data, determine sampling intervals or sampling data to be used in the multivariate analysis, a data group obtaining unit configured to obtain a spectral data group of the sampling intervals determined by the determination unit or a selected spectral data group, and a multivariate analysis unit configured to carry out the multivariate analysis by using the spectral data group obtained by the data group obtaining unit.
According to the present invention, a multivariate analysis, as represented by a PCA, can be carried out quickly by reducing a spectral number while retaining necessary information. Reducing the spectral number to be used in measurement in turn allows the measurement time to be reduced as well.
Further features of the present disclosure will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Hereinafter, an exemplary embodiment will be described in detail with reference to the flowcharts and the drawings. It should be noted that the specific example described below is merely an example and is not seen to be limiting. In the exemplary embodiment, a sample having a composition distribution within a space is measured, but additional exemplary embodiments are applicable to a result obtained through any method as long as such a method obtains measurement spectrum information associated with biological tissue or a substance distributed within biological tissue of a lesion, in correspondence with positional information on each point within the space and the position of each point.
In the exemplary embodiment, measurement data is first obtained.
Measurement data to be obtained is, for example, data on a measurement spectrum, and the measurement spectrum may be obtained through spectroscopy using an ultraviolet, visible, or infrared optical spectrum or through Raman spectroscopy using a Raman optical spectrum, or may be mass spectral data. A measurement spectrum obtained through spectroscopy or Raman spectroscopy has a measurement signal such as the one illustrated in
In a case in which mass spectral data such as the one illustrated in
Subsequently, in step S101 of
Data that can be used in the exemplary embodiment includes not only two-dimensional image data but also three-dimensional spatial data. In a case in which information on the component A in a Z direction relative to the XY plane can be obtained, the measurement data can be used as information on the component A relative to an XYZ space, or in other words, as four-dimensional information.
Although, for the sake of simplicity, a data processing method that uses two-dimensional information along the XY plane will be described in detail hereinafter, a processing method in which information on the Z direction is added can be implemented in a similar manner.
In step S102 of
Specifically, the stated resampling includes a determination operation in which sampling intervals to be used in a multivariate analysis are determined on the basis of the measurement spectral data, and a data group obtaining operation in which a spectral data group of the determined sampling intervals is obtained. The stated resampling further includes a determination operation in which a spectrum to be used in the multivariate analysis is determined on the basis of the measurement spectral data, and a data group obtaining operation in which a spectral data group selected by the determination unit is obtained.
In step S201 of
The sampling intervals may be determined, for example, through a method that uses (1) a rate of change (second derivative) of a spectral distribution or a method that uses (2) intensity information in a frequency space. Furthermore, a spectrum to be resampled may be selected through a method that uses (3) the magnitude of the spectral intensity distribution or (4) the magnitude of the Mahalanobis distance. Hereinafter, each case will be described.
(1) Method that Uses a Rate of Change (Second Derivative) of a Spectral Distribution
(2) Method that Uses Intensity Information in a Frequency Space
In a case in which intensity information in a frequency space is used, for example, after a spectrum is subjected to Fourier transform, a power spectrum is calculated, and a frequency to be used preferentially may be determined on the basis of the order of spectral intensities. By determining, in advance, a spectral number or an intensity threshold to be used, the sampling intervals can be calculated automatically.
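One possible reading of the frequency-space method can be sketched as follows. The sketch computes a power spectrum, keeps the frequencies whose power exceeds a threshold set in advance, and derives a sampling interval from the highest retained frequency by the Nyquist criterion. The Nyquist step, the relative threshold, and all function names are assumptions made for illustration and are not stated in the description.

```python
import cmath
import math


def power_spectrum(signal):
    """Power at each non-negative frequency bin of a mean-subtracted signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    powers = []
    for k in range(n // 2 + 1):
        # Direct discrete Fourier transform; adequate for short spectra.
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        powers.append(abs(coeff) ** 2)
    return powers


def sampling_interval_from_power(signal, rel_threshold=0.01):
    """Sampling interval, in units of the original point spacing.

    Assumption: a frequency bin is retained when its power is at least
    rel_threshold times the largest power, and the interval is chosen so
    that the highest retained frequency is sampled twice per period.
    """
    powers = power_spectrum(signal)
    cutoff = rel_threshold * max(powers)
    k_max = max(k for k, p in enumerate(powers) if k > 0 and p >= cutoff)
    # Bin k has a period of n / k points; Nyquist needs two samples per period.
    return max(1, len(signal) // (2 * k_max))


if __name__ == "__main__":
    # Synthetic spectrum: 4 full cycles over 64 points.
    spectrum = [math.sin(2 * math.pi * 4 * i / 64) for i in range(64)]
    print(sampling_interval_from_power(spectrum))
```

A fixed spectral number could be used in place of the threshold by keeping the strongest bins in rank order instead.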
(3) Method that Uses the Magnitude of the Spectral Intensity Distribution
In addition, when selecting a spectrum to be resampled, a spectrum may be selected with a focus on the magnitude of the spectral intensity distribution, that is, the variance of the intensity. For example, in the PCA, which is one of the methods for the multivariate analysis, an axis along which the variance of the projected data is maximized is selected as an axis along which the data is to be contracted. Thus, by preferentially selecting a spectral component with a greater variance, a dominant spectral component in the result of the PCA can be selected.
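A minimal sketch of this selection is given below, assuming that the spread of the spectral intensity is measured by the population variance of each spectral channel across pixels. The data layout (one spectrum per pixel) and the function name are hypothetical.

```python
from statistics import pvariance


def select_channels_by_variance(spectra, n_keep):
    """Return the indices of the n_keep channels with the largest variance.

    spectra: list of spectra, one per pixel, all of the same length.
    The returned indices are sorted in ascending channel order.
    """
    # Transpose so that each entry holds one channel across all pixels.
    channels = list(zip(*spectra))
    variances = [pvariance(channel) for channel in channels]
    ranked = sorted(range(len(variances)),
                    key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    # Illustrative data: channel 0 is constant, channel 1 varies the most.
    pixels = [[1.0, 10.0, 5.0],
              [1.0, 20.0, 6.0],
              [1.0, 30.0, 7.0]]
    print(select_channels_by_variance(pixels, 2))
```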
(4) Method that Uses the Mahalanobis Distance
As another method for selecting a spectrum, the magnitude of the Mahalanobis distance may be used. The Mahalanobis distance is defined by a ratio between the between-groups variance and the within-group variance of spectral intensities corresponding to a plurality of measurement targets. If the Mahalanobis distance is large when the spectral intensities corresponding to the measurement targets are projected onto a multi-dimensional space, the components can be efficiently obtained and separated, for example, when the PCA is carried out. As a result, a dominant spectral component in the result of the PCA can be selected.
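The selection described above can be sketched per spectral channel as follows, using the ratio of between-groups variance to within-group variance as defined in the preceding paragraph. The grouping of spectra by measurement target, the use of population variances, and the function names are assumptions made for illustration.

```python
from statistics import mean, pvariance


def variance_ratio(groups):
    """Between-groups variance divided by mean within-group variance.

    groups: one list of intensities per measurement target, all taken
    from the same spectral channel.
    """
    between = pvariance([mean(g) for g in groups])
    within = mean(pvariance(g) for g in groups)
    # A channel with no within-group scatter is treated as fully separating.
    return between / within if within > 0 else float("inf")


def select_channels_by_separation(groups_of_spectra, n_keep):
    """Return the indices of the n_keep channels with the largest ratio.

    groups_of_spectra: one list of spectra per measurement target.
    """
    n_channels = len(groups_of_spectra[0][0])
    scores = []
    for j in range(n_channels):
        per_group = [[spectrum[j] for spectrum in group]
                     for group in groups_of_spectra]
        scores.append(variance_ratio(per_group))
    ranked = sorted(range(n_channels), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    # Illustrative data: only channel 0 differs between the two targets.
    target_a = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.5]]
    target_b = [[3.0, 5.1], [3.2, 5.4], [2.8, 4.6]]
    print(select_channels_by_separation([target_a, target_b], 1))
```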
In step S202, a spectral data group of the sampling intervals determined in step S201 is obtained.
The data group may be obtained, for example, by averaging successive data points within a data array to generate a new data point, or in other words, by recalculating a spectral distribution.
Alternatively, measurement based on the determined sampling intervals may be newly carried out to obtain a data group. Specifically, previously obtained data may be subjected to spectrum resampling, and data can be newly obtained by using the obtained sampling intervals. Through this, the time it takes for the measurement can be reduced. As another alternative, the spectrum resampling can be carried out by simultaneously using data on a plurality of pixels around a pixel of interest.
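The averaging of successive data points mentioned above can be sketched as follows, assuming a fixed number of points per new data point. The function name and the handling of a shorter final group are choices made for this illustration.

```python
def bin_average(spectrum, bin_size):
    """Average successive groups of bin_size points into one point each.

    A final group with fewer than bin_size points is averaged over the
    points it actually contains.
    """
    resampled = []
    for start in range(0, len(spectrum), bin_size):
        chunk = spectrum[start:start + bin_size]
        resampled.append(sum(chunk) / len(chunk))
    return resampled


if __name__ == "__main__":
    print(bin_average([1, 3, 5, 7, 9], 2))  # prints [2.0, 6.0, 9.0]
```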
In step S103 of
For example, when the PCA is carried out, an eigenvalue and an eigenvector of a sample variance-covariance matrix having a size of a spectral number by a spectral number need to be calculated. Reducing the spectral number in step S102, however, makes it possible to greatly reduce the operation amount.
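To make the size dependence concrete, the sketch below builds the sample variance-covariance matrix and finds its leading eigenvalue and eigenvector by power iteration. Power iteration is used here only to keep the example free of dependencies; it is not stated to be the method of the embodiment, and the function names are hypothetical. The matrix has p by p entries for a spectral number p, so reducing p to two thirds of its value reduces the number of entries to four ninths.

```python
import math


def covariance_matrix(spectra):
    """Sample variance-covariance matrix (p x p) of n spectra of length p."""
    n = len(spectra)
    p = len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in spectra]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_principal_component(cov, n_iter=200):
    """Leading eigenvalue and unit eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # cov is the zero matrix; there is no dominant direction
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit) vector gives the eigenvalue estimate.
    cov_v = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
    eigenvalue = sum(v[a] * cov_v[a] for a in range(p))
    return eigenvalue, v


if __name__ == "__main__":
    # Illustrative data: three pixels, two perfectly correlated channels.
    cov = covariance_matrix([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
    print(first_principal_component(cov))
```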
The present exemplary embodiment can be realized by a device that implements the specific example described above.
A data processing device 6 carries out the above-described processing on an obtained signal, and an image display device 7 displays, on its screen, the result of the signal processing. In other words, the sample information obtaining system includes the measuring unit, the data processing device 6, and the image display device 7.
Hereinafter, a first exemplary embodiment will be described. In the first exemplary embodiment, mouse pancreatic tissue is observed by using a microscope that utilizes stimulated Raman scattering. The power of a TiS laser used as a light source is 111 mW, and the intensity of a Yb fiber laser is 127 mW prior to being incident on an objective lens. The mouse pancreatic tissue serving as a sample is subjected to formalin fixation processing and sliced to a thickness of 100 micrometers. The tissue section is measured in a state in which the tissue section is embedded in glass along with a PBS buffer. The measurement area measures 160 micrometers on each side, and ten pieces of measurement data are integrated. The image data measures 500 pixels on each side, and the measurement time is 30 seconds.
On obtained spectral image data, XY coordinate information indicating a position of each measurement pixel and spectral information at each coordinate are recorded. For example, the spectral image data contains, as spectral data, information on a peak component associated with a component in the tissue contained in the sample.
In each of the cases, the measurement is carried out at spectral data sampling intervals of 1 kayser (1 cm−1).
Subsequently, the spectral image data is subjected to spectrum resampling so as to be in proportion to the rate of change (second derivative) of the spectral distribution.
This resampling reduces the spectral number to approximately ⅔ and the time it takes for the PCA to approximately ½.
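One way to place sampling points in proportion to the second derivative is sketched below. It treats the magnitude of a finite-difference second derivative as a sampling density and picks points at equal steps of the cumulative density. The finite-difference estimate, the small floor added to flat regions, and the function names are assumptions made for illustration; the exact procedure of the embodiment is not given here.

```python
import math


def second_derivative_magnitude(spectrum):
    """Absolute central second difference; requires at least 3 points."""
    inner = [abs(spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1])
             for i in range(1, len(spectrum) - 1)]
    # Reuse the nearest interior value at the two end points.
    return [inner[0]] + inner + [inner[-1]]


def resample_indices(spectrum, n_keep, floor=1e-3):
    """Indices chosen with density proportional to |second derivative|.

    Fewer than n_keep indices may be returned when several targets fall
    on the same data point.
    """
    density = second_derivative_magnitude(spectrum)
    peak = max(density) or 1.0
    # The floor keeps a small, non-zero sampling density in flat regions.
    weights = [d / peak + floor for d in density]
    cumulative = []
    total = 0.0
    for w in weights:
        total += w
        cumulative.append(total)
    indices = []
    pos = 0
    for k in range(n_keep):
        target = total * (k + 0.5) / n_keep
        while cumulative[pos] < target:
            pos += 1
        if not indices or indices[-1] != pos:
            indices.append(pos)
    return indices


if __name__ == "__main__":
    # Synthetic spectrum: a single Gaussian peak on a flat baseline.
    spectrum = [math.exp(-((i - 20) / 3.0) ** 2) for i in range(41)]
    print(resample_indices(spectrum, 10))
```

With this choice the selected points cluster around the peak, where the curvature is large, and the flat baseline is sampled sparsely.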
As can be seen when sections (a), (b), and (c) of
On the contrary, in a case in which the sampling intervals are set to 1.5 kayser, principal component images illustrated in sections (a″), (b″), and (c″) of
In other words, it is indicated that the method of this exemplary embodiment enables the multivariate analysis to be carried out quickly while maintaining necessary information.
In addition, in this spectrum resampling, it is also possible to measure only a selected spectral component (referred to as premeasurement) and to obtain a first stage image. In this case, the entire spectra are measured (referred to as main measurement) after an approximate image is obtained through the premeasurement, and a final image is thus obtained. In addition, the entire wave numbers may be measured in a limited area of the entire image area in the premeasurement, and a necessary wave number may be selected on the basis of the spectral information of the stated area. In this case, while an area different from the area on which the premeasurement is carried out is measured in the main measurement, the measurement/analysis is carried out only for the wave number determined in the premeasurement, or in other words, the thinned wave number. Through this, the time it takes for the measurement and the analysis can be reduced.
The above-described exemplary embodiment(s) can be used as a tool that more effectively supports a multivariate analysis of spectral image data.
While exemplary embodiments have been described thus far, these exemplary embodiments are not seen to be limiting, and various modifications and changes can be made within the scope of the present disclosure.
Exemplary embodiments can be implemented, for example, in the form of a system, an apparatus, a method, a program, or a storage medium. The exemplary embodiments have been applied to the sample information obtaining system that includes the data processing device 6 and the image display device 7. The exemplary embodiments, however, may be applied to a system that is constituted by combinations of other devices or to an apparatus constituted by a single device.
In a system that is constituted by combinations of a plurality of devices to which the exemplary embodiments are applied, some or all of the devices may be interconnected through a network, such as the Internet. For example, obtained data may be transmitted to a server connected to a network, and the processing according to the exemplary embodiments may be carried out on the server. The obtained result may then be received from the server to display an image.
Furthermore, a software program according to the exemplary embodiments may be directly or remotely supplied to a system or an apparatus, and a computer of the system or the apparatus may load and execute the supplied program codes to realize the functions of the exemplary embodiments. In this case, the program to be supplied is a computer program corresponding to the flowcharts indicated in the exemplary embodiments. Thus, a program code itself installed in the computer to realize the functional processing of the exemplary embodiments also realizes the present disclosure.
In other words, the exemplary embodiments encompass a computer program that realizes the functional processing of the present disclosure. In that case, the program may take any form, such as object code, a program executed by an interpreter, or script data to be supplied to an operating system (OS), as long as it has the function of a program.
A computer readable storage medium for supplying a computer program may, for example, be a hard disk, an optical disc, a magneto-optical disk (MO), a CD-ROM, a CD-R, a CD-RW, or a magnetic tape. Furthermore, the storage medium may be a non-volatile memory card, a ROM, or a DVD (DVD-ROM, DVD-R).
Additionally, the program may be supplied by accessing a webpage through the Internet by using a browser on a client computer and downloading the computer program of the present disclosure from the webpage to a storage medium, such as a hard disk. In this case, the program to be downloaded may be in the form of a compressed file containing an automatic installation function. Furthermore, the present invention encompasses a WWW server that allows a plurality of users to download the program file that causes a computer to realize the functional processing of the present disclosure.
While the present disclosure has been described with reference to exemplary embodiments, it is to be understood that these exemplary embodiments are not seen to be limiting. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2013-115683, filed May 31, 2013, which is hereby incorporated by reference herein in its entirety.