Automated system and method for spectroscopic analysis
First Claim
1. An automated method for modeling spectral data, the spectral data generated by one of diffuse reflectance, clear transmission, or diffuse transmission, comprising the steps of
accessing a set of spectral data, the set of spectral data including corresponding spectral data for each of a plurality of samples, the spectral data for each of the plurality of samples having associated therewith at least one constituent value, the at least one constituent value being a reference value for a target substance in the sample which is measured by an independent measurement technique;
dividing the set of spectral data, with its associated constituent values, into a calibration sub-set and a validation sub-set;
applying a plurality of data transforms to the calibration subset and the validation subset to generate, for each sample, a set of transformed and untransformed calibration data;
applying one or more of a partial least squares, a principal component regression, a neural net, or a multiple linear regression analysis on the transformed and untransformed calibration data sub-sets to obtain corresponding modeling equations for predicting the amount of the target substance in a sample;
identifying a best modeling equation as a function of the correlation between the spectral data in the validation sub-set and the corresponding constituent values in the validation sub-set.
Abstract
An automated method for modeling spectral data is provided, wherein the spectral data is generated by one of diffuse reflectance, clear transmission, or diffuse transmission. The method includes accessing a set of spectral data, the set of spectral data including corresponding spectral data for each of a plurality of samples, the spectral data for each of the plurality of samples having associated therewith at least one constituent value, the at least one constituent value being a reference value for a target substance in the sample which is measured by an independent measurement technique. A plurality of data transforms are applied to the set of spectral data to generate, for each sample, a set of transformed and untransformed spectral data. The set of transformed and untransformed spectral data, with its associated constituent values, is divided into a calibration sub-set and a validation sub-set, and one or more of a partial least squares, a principal component regression, a neural net, or a multiple linear regression analysis is applied to the transformed and untransformed calibration data sub-sets to obtain corresponding modeling equations for predicting the amount of the target substance in a sample. The modeling equation which provides the best correlation between the spectral data in the validation sub-set and the corresponding constituent values in the validation sub-set is identified, preferably as a function of the SEE and SEP.
67 Claims
-
1. An automated method for modeling spectral data, the spectral data generated by one of diffuse reflectance, clear transmission, or diffuse transmission, comprising the steps of
accessing a set of spectral data, the set of spectral data including corresponding spectral data for each of a plurality of samples, the spectral data for each of the plurality of samples having associated therewith at least one constituent value, the at least one constituent value being a reference value for a target substance in the sample which is measured by an independent measurement technique;
dividing the set of spectral data, with its associated constituent values, into a calibration sub-set and a validation sub-set;
applying a plurality of data transforms to the calibration subset and the validation subset to generate, for each sample, a set of transformed and untransformed calibration data;
applying one or more of a partial least squares, a principal component regression, a neural net, or a multiple linear regression analysis on the transformed and untransformed calibration data sub-sets to obtain corresponding modeling equations for predicting the amount of the target substance in a sample;
identifying a best modeling equation as a function of the correlation between the spectral data in the validation sub-set and the corresponding constituent values in the validation sub-set.
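The steps of claim 1 can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the last `n_validation` samples form the validation sub-set, the only transform besides the untransformed data is a simple adjacent-point difference, and a one-wavelength least-squares line stands in for the regression analyses named in the claim. All function and variable names are hypothetical.

```python
import math

def first_difference(spectrum):
    """Crude first-derivative transform: differences of adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

# Assumption: two candidate data sets per sample, untransformed and transformed.
TRANSFORMS = {
    "untransformed": lambda s: list(s),
    "first_difference": first_difference,
}

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept (None if x is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0.0:
        return None
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx

def rms_error(model, x, y):
    """Root-mean-square difference between predicted and reference values."""
    slope, intercept = model
    return math.sqrt(
        sum((slope * xi + intercept - yi) ** 2 for xi, yi in zip(x, y)) / len(y)
    )

def best_model(spectra, values, n_validation):
    """Fit one model per (transform, wavelength) and rank on validation error."""
    cal_s, val_s = spectra[:-n_validation], spectra[-n_validation:]
    cal_y, val_y = values[:-n_validation], values[-n_validation:]
    candidates = []
    for name, transform in TRANSFORMS.items():
        cal_t = [transform(s) for s in cal_s]
        val_t = [transform(s) for s in val_s]
        for j in range(len(cal_t[0])):
            model = fit_line([s[j] for s in cal_t], cal_y)
            if model is None:
                continue  # constant column carries no information
            error = rms_error(model, [s[j] for s in val_t], val_y)
            candidates.append((error, name, j, model))
    return min(candidates, key=lambda c: c[0])
```

The returned tuple holds the validation error, the transform name, the wavelength index, and the fitted coefficients of the winning candidate.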
(a) a normalization of the spectral data; and
(b) a smoothing transform, a Savitsky-Golay first derivative, or a Savitsky-Golay second derivative of the spectral data.
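As an illustration of the Savitzky-Golay transforms named above (spelled "Savitsky-Golay" in the claims), the sketch below applies the well-known 5-point quadratic convolution weights. The window size, the dropping of the two edge points on each side, and the function name `savgol5` are assumptions made for this example; the claims do not specify them.

```python
# 5-point quadratic Savitzky-Golay weights: (weights, normalizer, derivative order)
SG5 = {
    "smooth": ((-3, 12, 17, 12, -3), 35.0, 0),
    "first_derivative": ((-2, -1, 0, 1, 2), 10.0, 1),
    "second_derivative": ((2, -1, -2, -1, 2), 7.0, 2),
}

def savgol5(spectrum, kind, spacing=1.0):
    """Apply a 5-point Savitzky-Golay filter; the result is 4 points shorter."""
    weights, norm, order = SG5[kind]
    # Derivatives are scaled by the wavelength spacing raised to the order.
    scale = norm * spacing ** order
    return [
        sum(w * v for w, v in zip(weights, spectrum[i - 2:i + 3])) / scale
        for i in range(2, len(spectrum) - 2)
    ]
```

Because the weights come from a local quadratic fit, a purely quadratic spectrum is reproduced exactly by the smoothing filter and differentiated exactly by the two derivative filters.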
-
9. The method of claim 1, wherein the data transforms include performing:
(a) a first derivative of the spectral data; and
(b) a normalization, a multiplicative scatter correction, or a smoothing transform on the spectral data.
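One pairing covered by claim 9, a multiplicative scatter correction followed by a first derivative, might look like the sketch below. It uses the common textbook form of multiplicative scatter correction (regress each spectrum on the mean spectrum, then remove the fitted offset and slope) and an adjacent-point difference as the derivative; both choices, and the function names, are assumptions rather than details taken from the claims.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n, npts = len(spectra), len(spectra[0])
    mean = [sum(s[j] for s in spectra) / n for j in range(npts)]
    mbar = sum(mean) / npts
    smm = sum((m - mbar) ** 2 for m in mean)
    corrected = []
    for s in spectra:
        sbar = sum(s) / npts
        # Least-squares fit of s = offset + slope * mean
        slope = sum((m - mbar) * (v - sbar) for m, v in zip(mean, s)) / smm
        offset = sbar - slope * mbar
        corrected.append([(v - offset) / slope for v in s])
    return corrected

def first_difference(spectrum):
    """Adjacent-point difference as a simple first derivative."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def msc_then_derivative(spectra):
    """Apply the two transforms in sequence."""
    return [first_difference(s) for s in msc(spectra)]
```

Spectra that differ only by an additive offset and a multiplicative gain collapse onto the same curve after the correction, so their derivatives coincide as well.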
-
10. The method of claim 1, wherein the data transforms include two or more of performing a baseline correction, performing a normalization of the spectral data, performing a first derivative on the spectral data, performing a second derivative on the spectral data, performing a multiplicative scatter correction on the spectral data, performing a smoothing transform on the spectral data, performing a Kubelka-Munk function on the spectral data, performing a ratio on the spectral data, performing a Savitsky-Golay first derivative, performing a Savitsky-Golay second derivative, performing a mean-centering, and performing a conversion from reflectance/transmittance to absorbance.
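Two of the listed transforms have simple closed forms. The sketch below uses the standard definitions, absorbance A = log10(1/R) and the Kubelka-Munk function (1 - R)^2 / (2R), and assumes reflectance or transmittance values expressed as fractions strictly between 0 and 1. The function names are hypothetical.

```python
import math

def to_absorbance(values):
    """Convert fractional reflectance/transmittance to absorbance, log10(1/R)."""
    return [math.log10(1.0 / v) for v in values]

def kubelka_munk(reflectance):
    """Kubelka-Munk function (1 - R)^2 / (2R) for fractional reflectance R."""
    return [(1.0 - r) ** 2 / (2.0 * r) for r in reflectance]
```

For example, a reflectance of 0.1 corresponds to an absorbance of 1, and a reflectance of 0.5 gives a Kubelka-Munk value of 0.25.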
-
11. The method of claim 10, wherein the data transforms are applied singly and two-at-a-time.
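Applying transforms one at a time and two at a time amounts to enumerating single transforms and pairs. The sketch below lists the twelve transforms named in claim 10 and assumes unordered pairs with every pairing allowed; the dependent claims that follow restrict which pairs are actually combined, and the order of application within a pair may matter in practice. The identifiers are hypothetical.

```python
from itertools import combinations

# The twelve transforms named in claim 10 (identifiers are illustrative).
TRANSFORM_NAMES = [
    "baseline_correction", "normalization", "first_derivative",
    "second_derivative", "multiplicative_scatter_correction", "smoothing",
    "kubelka_munk", "ratio", "sg_first_derivative", "sg_second_derivative",
    "mean_centering", "absorbance_conversion",
]

def candidate_sets(names):
    """Return every single transform and every unordered pair of transforms."""
    singles = [(name,) for name in names]
    pairs = list(combinations(names, 2))
    return singles + pairs
```

Under these assumptions there are 12 single transforms and 66 pairs, so 78 transformed data sets per regression technique before any pairs are excluded.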
-
12. The method of claim 2, wherein the baseline correction is combined with each of the normalization, the Kubelka-Munk function, the smoothing transform or conversion from reflectance/transmittance to absorbance;
the normalization is combined with each of the baseline correction, conversion from reflectance/transmittance to absorbance, first derivative, second derivative, Kubelka-Munk function, smoothing transform, Savitsky-Golay first derivative, or Savitsky-Golay second derivative;
the first derivative transform is combined with each of the baseline correction, normalization, smoothing transform, multiplicative scatter correction, Kubelka-Munk function, or conversion from reflectance/transmittance to absorbance;
the second derivative transform is combined with each of the baseline correction, normalization, smoothing transform, the multiplicative scatter correction, or the Kubelka-Munk function;
the multiplicative scatter correction is combined with each of the Kubelka-Munk function, smoothing transform and conversion from reflectance/transmittance to absorbance;
the Kubelka-Munk function is combined with each of multiplicative scatter correction and smoothing transform;
the smoothing transform is combined with each of the baseline correction, conversion from reflectance/transmittance to absorbance, normalization, first derivative, second derivative, multiplicative scatter correction, and Kubelka-Munk function;
the Savitsky-Golay first derivative is combined with each of the baseline correction, normalization, multiplicative scatter correction, the Kubelka-Munk function, the smoothing transform, and conversion from reflectance/transmittance to absorbance;
the Savitsky-Golay second derivative is combined with each of the baseline correction, normalization transform, the multiplicative scatter correction, the Kubelka-Munk function, smoothing transform, and conversion from reflectance/transmittance to absorbance; and
the conversion from reflectance/transmittance to absorbance is combined with each of the baseline correction, normalization, the multiplicative scatter correction, and smoothing transform.
-
13. The method of claim 1, wherein the data transform is a ratio comprising a denominator and a numerator and said numerator comprises a baseline correction, normalization, multiplicative scatter correction, smoothing transform, or Kubelka-Munk function, when the denominator comprises baseline correction;
the numerator comprises normalization when the denominator comprises normalization;
the numerator comprises first derivative when the denominator comprises first derivative, the numerator comprises second derivative when the denominator comprises second derivative;
the numerator comprises multiplicative scatter correction when the denominator comprises multiplicative scatter correction;
the numerator comprises Kubelka-Munk function when the denominator comprises Kubelka-Munk function;
the numerator comprises smoothing transform when the denominator comprises smoothing transform;
the numerator comprises Savitsky-Golay first derivative when the denominator comprises Savitsky-Golay first derivative;
the numerator comprises Savitsky-Golay second derivative when the denominator comprises Savitsky-Golay second derivative.
-
14. The method of claim 1, wherein the identifying step includes identifying a best modeling equation as a function of the standard error of prediction (SEP) of the validation data.
-
15. The method of claim 1, wherein the identifying step includes identifying a best modeling equation as a function of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
-
16. The method of claim 1, wherein the identifying step includes identifying a best modeling equation as a function of a weighted average of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
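The error statistics in claims 14 to 16 can be illustrated as follows, reading SEP as the standard error of prediction on the validation data. Conventions for the degrees of freedom vary; this sketch assumes SEE is computed with n - k - 1 degrees of freedom (k being the number of regression terms) and SEP as a plain root-mean-square error. The weights in `weighted_score` are caller-supplied placeholders and are not the figure of merit referred to elsewhere in the claims, whose formula is not reproduced here.

```python
import math

def see(reference, predicted, n_terms):
    """Standard error of estimate on calibration data (n - n_terms - 1 dof)."""
    ss = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(ss / (len(reference) - n_terms - 1))

def sep(reference, predicted):
    """Standard error of prediction on validation data (RMS of residuals)."""
    ss = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(ss / len(reference))

def weighted_score(see_value, sep_value, w_see, w_sep):
    """Weighted average of SEE and SEP; the weights are hypothetical inputs."""
    return (w_see * see_value + w_sep * sep_value) / (w_see + w_sep)
```

A lower score indicates a better candidate, so the modeling equation with the smallest value would be selected under this reading.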
-
17. The method of claim 1, wherein the identifying step includes calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
18. The method of claim 1, wherein the identifying step includes calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
67. The method of claim 1, wherein the data comprises samples generated by biological processes including blood samples used in predicting clinical chemistry parameters such as blood glucose levels.
-
19. A method for generating a modeling equation, comprising the steps of
(a) operating an instrument so as to generate and store a spectral data set of diffuse reflectance, clear transmission, or diffuse transmission spectrum data points over a selected wavelength range, the spectral data set including spectral data for a plurality of samples;
(b) generating and storing a constituent value for each of the plurality of samples, the constituent value being indicative of an amount of a target substance in its corresponding sample;
(c) dividing the spectral data set into a calibration subset and a validation subset;
(d) transforming the spectral data in the calibration subset and the validation subset by applying a plurality of first mathematical functions to the calibration subset and the validation subset to obtain a plurality of transformed validation data subsets and a plurality of transformed calibration data subsets;
(e) resolving each transformed calibration data subset in step (d) by at least one of a second mathematical function to generate a plurality of modeling equations; and
(f) identifying the modeling equation which provides the best correlation between the spectral data in the validation subset and the corresponding constituent values in the validation subset.
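Step (c) does not say how the samples are divided. One plausible scheme, shown here purely as an assumption, ranks the samples by constituent value and sends every third one to the validation subset, so that both subsets span the full concentration range. The function name and the `every` parameter are hypothetical.

```python
def split_by_rank(spectra, values, every=3):
    """Rank samples by constituent value; every `every`-th goes to validation."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    validation_ids = set(order[every - 1::every])
    calibration = [(spectra[i], values[i])
                   for i in range(len(values)) if i not in validation_ids]
    validation = [(spectra[i], values[i])
                  for i in range(len(values)) if i in validation_ids]
    return calibration, validation
```

Each subset keeps the spectrum paired with its constituent value, which the later regression and identification steps both require.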
-
24. The method of claim 19, wherein the identifying step includes calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
25. The method of claim 19, wherein the instrument is a spectrophotometer, a spectral detector receptive of spectra from the spectrophotometer, a data station receptive of transmittance spectra from the detector.
-
26. The method of claim 19, wherein the set of spectral data comprises one of a set of natural product spectroscopic data, process development spectroscopic data, and a raw material spectroscopic data.
-
27. The method of claim 19, wherein step (e) comprises resolving each transformed calibration data sub-set in step (d) by a partial least squares, a principal component regression, and a multiple linear regression analysis to generate a plurality of modeling equations.
-
28. The method of claim 19, wherein the at least one second mathematical function includes one or more of a partial least squares, a principal component regression, a neural network, and a multiple linear regression analysis.
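Of the regression techniques named in claims 27 and 28, partial least squares is the least self-explanatory. The sketch below is a textbook single-response PLS (PLS1, NIPALS-style) written with plain lists; it is offered as an illustration under the assumption of one constituent value per sample and is not taken from the patent. All names are hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1: returns the column means, the y mean, and (w, p, q) per component."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered spectra
    f = [yi - y_mean for yi in y]                               # centered values
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [wj / norm for wj in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(ti * ti for ti in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this component explains from spectra and values.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict a constituent value for one spectrum by sequential deflation."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat
```

With as many components as the spectra have independent directions, the fit coincides with ordinary least squares; using fewer components is what distinguishes partial least squares from multiple linear regression.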
-
29. The method of claim 19, wherein the first set of mathematical functions include performing a normalization of the spectral data, performing a first derivative on the spectral data, performing a second derivative on the spectral data, performing a multiplicative scatter correction on the spectral data, performing a smoothing transform on the spectral data, performing a conversion from reflectance/transmittance to absorbance, performing a Kubelka-Munk function on the spectral data, performing a Savitsky-Golay first derivative, and performing a Savitsky-Golay second derivative.
-
30. The method of claim 29, wherein the first set of mathematical functions are applied singly and two-at-a-time.
-
31. The method of claim 19, wherein the normalization transform is combined with each of the first derivative, second derivative, and smoothing transforms;
the first derivative transform is combined with the normalization and smoothing transforms;
the second derivative transform is combined with the normalization and smoothing transforms;
the multiplicative scatter correction transform is combined with first derivative, second derivative, Kubelka-Munk, and smoothing transforms;
the Kubelka-Munk transform is combined with the normalization, first derivative, second derivative, multiplicative scatter correction, and smoothing transforms;
the smoothing transform is combined with the a, normalization, first derivative, second derivative, multiplicative scatter correction, and Kubelka-Munk transforms; and
the conversion from reflectance/transmittance to absorbance is combined with the normalization, first derivative, second derivative, multiplicative scatter correction, and smoothing transforms.
-
32. A computer executable process, operative to control a computer, stored on a computer readable medium, for analyzing a set of data on a computer readable medium, the set of data including, for each of a plurality of samples, corresponding spectral data and a corresponding constituent value, the process comprising the steps of:
dividing the spectral data into a calibration sub-set of spectral data and a validation sub-set of spectral data;
applying a plurality of data transforms to the spectral data in the validation sub-set and the calibration sub-set;
applying one or more of a partial least squares, a principal component regression, a neural net, or a multiple linear regression analysis on the transformed and untransformed data sets of the spectral data in the calibration sub-set to obtain a plurality of modeling equations;
applying the spectral data in the validation sub-set to each of the plurality of modeling equations to obtain corresponding values;
and processing the values in order to select a best modeling equation for analyzing the spectral data.
(a) a normalization of the spectral data, a smoothing transform; and
(b) a Savitsky-Golay first derivative, or a Savitsky-Golay second derivative of the spectral data.
-
39. The process of claim 32, wherein the data transforms include performing a first derivative of the spectral data, and a normalization, a multiplicative scatter correction, or a smoothing transform on the spectral data.
-
40. The process of claim 32, wherein the processing step includes identifying a best modeling equation as a function of the standard error of prediction (SEP) of the validation data.
-
41. The process of claim 32, wherein the processing step includes identifying a best modeling equation as a function of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
-
42. The process of claim 32, wherein the processing step includes identifying a best modeling equation as a function of a weighted average of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
-
43. The process of claim 32, wherein the processing the values step comprises calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
44. The process of claim 32, wherein the processing the values step comprises calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
45. The process of claim 32, wherein the data transforms include performing a conversion from reflectance/transmittance to absorbance, a normalization, a multiplicative scatter correction, and a smoothing transform on the spectral data.
-
46. The process of claim 32, wherein the data transforms include performing a baseline correction, a normalization, a first derivative, a second derivative, a multiplicative scatter correction, a smoothing transform, a conversion from reflectance/transmittance to absorbance, a Kubelka-Munk function, a ratio, a Savitsky-Golay first derivative, a Savitsky-Golay second derivative, and a mean-centering on the spectral data.
-
47. The process of claim 32, wherein the data transforms are applied singly and two-at-a-time.
-
48. The process of claim 32, wherein the normalization transform is combined with each of the first derivative, second derivative, and smoothing transforms;
the first derivative transform is combined with the normalization and smoothing transforms;
the second derivative transform is combined with the normalization and smoothing transforms;
the multiplicative scatter correction transform is combined with conversion from reflectance/transmittance to absorbance, first derivative, second derivative, Kubelka-Munk function, and smoothing transform;
the Kubelka-Munk function is combined with the normalization, first derivative, second derivative, multiplicative scatter correction, and smoothing transforms;
the smoothing transform is combined with the conversion from reflectance/transmittance to absorbance, normalization, first derivative, second derivative, multiplicative scatter correction, and Kubelka-Munk transforms; and
the conversion from reflectance/transmittance to absorbance is combined with the normalization, first derivative, second derivative, multiplicative scatter correction, and smoothing transforms.
-
49. An automated method for modeling spectral data, the spectral data generated by one of diffuse reflectance, clear transmission, or diffuse transmission, comprising the steps of
accessing a set of spectral data, the set of spectral data including corresponding spectral data for each of a plurality of samples, the spectral data for each of the plurality of samples having associated therewith at least one constituent value, the at least one constituent value being a reference value for a target substance in the sample which is measured by an independent measurement technique;
applying a plurality of data transforms to the set of spectral data to generate, for each sample, a set of transformed and untransformed calibration data;
dividing the set of spectral data, with its associated constituent values, into a calibration sub-set and a validation sub-set;
applying one or more of a partial least squares, a principal component regression, a neural net, or a multiple linear regression analysis on the transformed and untransformed calibration data sub-sets to obtain corresponding modeling equations for predicting the amount of the target substance in a sample;
identifying a best modeling equation as a function of the correlation between the spectral data in the validation sub-set and the corresponding constituent values in the validation sub-set.
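Claim 49 applies the transforms before the data are divided, whereas claim 1 divides first. For transforms that act on each spectrum independently the order makes no difference, but for transforms that use statistics of the data set, such as mean-centering, it can. The sketch below illustrates this with a toy data set; the assumption that the divide-first ordering takes its column means from the calibration sub-set alone is made for the example and is not stated in the claims.

```python
def column_means(spectra):
    """Mean value at each wavelength across a set of spectra."""
    n = len(spectra)
    return [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]

def center(spectra, means):
    """Subtract the given column means from every spectrum."""
    return [[v - m for v, m in zip(s, means)] for s in spectra]

spectra = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
calibration, validation = spectra[:2], spectra[2:]

# Transform first, then divide: means come from the whole data set.
validation_transform_first = center(spectra, column_means(spectra))[2:]

# Divide first, then transform: means come from the calibration sub-set only.
validation_divide_first = center(validation, column_means(calibration))
```

Here the same validation spectrum becomes [2.0, 4.0] under the first ordering and [3.0, 6.0] under the second, so the two orderings can feed different numbers into the identification step.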
(a) a normalization of the spectral data; and
(b) a smoothing transform, a Savitsky-Golay first derivative, or a Savitsky-Golay second derivative of the spectral data.
-
57. The method of claim 49, wherein the data transforms include performing:
(a) a first derivative of the spectral data; and
(b) a normalization, a multiplicative scatter correction, or a smoothing transform on the spectral data.
-
58. The method of claim 49, wherein the data transforms include two or more of performing a baseline correction, performing a normalization of the spectral data, performing a first derivative on the spectral data, performing a second derivative on the spectral data, performing a multiplicative scatter correction on the spectral data, performing a smoothing transform on the spectral data, performing a Kubelka-Munk function on the spectral data, performing a ratio on the spectral data, performing a Savitsky-Golay first derivative, performing a Savitsky-Golay second derivative, performing a mean-centering, and performing a conversion from reflectance/transmittance to absorbance.
-
59. The method of claim 58, wherein the data transforms are applied singly and two-at-a-time.
-
60. The method of claim 50, wherein the baseline correction is combined with each of the normalization, the Kubelka-Munk function, the smoothing transform or conversion from reflectance/transmittance to absorbance;
the normalization is combined with each of the baseline correction, conversion from reflectance/transmittance to absorbance, first derivative, second derivative, Kubelka-Munk function, smoothing transform, Savitsky-Golay first derivative, or Savitsky-Golay second derivative;
the first derivative transform is combined with each of the baseline correction, normalization, smoothing transform, multiplicative scatter correction, Kubelka-Munk function, or conversion from reflectance/transmittance to absorbance;
the second derivative transform is combined with each of the baseline correction, normalization, smoothing transform, the multiplicative scatter correction, or the Kubelka-Munk function;
the multiplicative scatter correction is combined with each of the Kubelka-Munk function, smoothing transform and conversion from reflectance/transmittance to absorbance;
the Kubelka-Munk function is combined with each of multiplicative scatter correction and smoothing transform;
the smoothing transform is combined with each of the baseline correction, conversion from reflectance/transmittance to absorbance, normalization, first derivative, second derivative, multiplicative scatter correction, and Kubelka-Munk function;
the Savitsky-Golay first derivative is combined with each of the baseline correction, normalization, multiplicative scatter correction, the Kubelka-Munk function, the smoothing transform, and conversion from reflectance/transmittance to absorbance;
the Savitsky-Golay second derivative is combined with each of the baseline correction, normalization transform, the multiplicative scatter correction, the Kubelka-Munk function, smoothing transform, and conversion from reflectance/transmittance to absorbance; and
the conversion from reflectance/transmittance to absorbance is combined with each of the baseline correction, normalization, the multiplicative scatter correction, and smoothing transform.
-
61. The method of claim 49, wherein the data transform is a ratio comprising a denominator and a numerator and said numerator comprises a baseline correction, normalization, multiplicative scatter correction, smoothing transform, or Kubelka-Munk function, when the denominator comprises baseline correction;
the numerator comprises normalization when the denominator comprises normalization;
the numerator comprises first derivative when the denominator comprises first derivative, the numerator comprises second derivative when the denominator comprises second derivative;
the numerator comprises multiplicative scatter correction when the denominator comprises multiplicative scatter correction;
the numerator comprises Kubelka-Munk function when the denominator comprises Kubelka-Munk function;
the numerator comprises smoothing transform when the denominator comprises smoothing transform;
the numerator comprises Savitsky-Golay first derivative when the denominator comprises Savitsky-Golay first derivative;
the numerator comprises Savitsky-Golay second derivative when the denominator comprises Savitsky-Golay second derivative.
-
62. The method of claim 49, wherein the identifying step includes identifying a best modeling equation as a function of the standard error of prediction (SEP) of the validation data.
-
63. The method of claim 49, wherein the identifying step includes identifying a best modeling equation as a function of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
-
64. The method of claim 49, wherein the identifying step includes identifying a best modeling equation as a function of a weighted average of the standard error of estimate (SEE) of the calibration data and the standard error of prediction (SEP) of the validation data.
-
65. The method of claim 49, wherein the identifying step includes calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
-
66. The method of claim 49, wherein the identifying step includes calculating a figure of merit (FOM) for each modeling equation, the FOM being defined as:
Specification