Method for analyzing mass spectra
Abstract
A method that analyzes mass spectra using a digital computer is disclosed. The method includes entering into a digital computer a data set obtained from mass spectra from a plurality of samples. Each sample is, or is to be, assigned to a class within a class set having two or more classes, and each class is characterized by a different biological status. A classification model is then formed. The classification model discriminates between the classes in the class set.
102 Claims
-
1. A method that analyzes mass spectra using a digital computer, the method comprising:
-
a) entering into the digital computer a data set obtained from mass spectra from a plurality of samples, wherein each sample is, or is to be assigned to a class within a class set comprising two or more classes, each class characterized by a different biological status, and wherein each mass spectrum comprises data representing signal strength as a function of time-of-flight, mass-to-charge ratio, or a value derived from time-of-flight or mass-to-charge ratio; and
b) forming a classification model which discriminates between the classes in the class set, wherein forming comprises analyzing the data set by executing code that embodies a classification process comprising a recursive partitioning process, which is a classification and regression tree process.
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
-
-
15. The method of claim 1 wherein forming the classification model comprises at least one of identifying features that discriminate between the different biological statuses, and learning.
-
16. The method of claim 1 wherein the classification process is a binary recursive partitioning process.
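For illustration only, binary recursive partitioning of the kind named in claims 1 and 16 can be sketched as below. This is a minimal classification-tree sketch, not the patentee's implementation: the Gini impurity criterion, the midpoint thresholds, the `max_depth` stopping rule, the function names, and the toy intensity values are all assumptions introduced here.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the best
    binary split, or None when no feature has two distinct values."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def grow(rows, labels, max_depth=3):
    """Recursively partition the samples; each leaf stores training class counts."""
    impurity = gini(labels)
    split = best_split(rows, labels) if max_depth > 0 and impurity > 0 else None
    if split is None or split[2] >= impurity:
        return {"counts": dict(Counter(labels))}
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], max_depth - 1),
        "right": grow([r for r, _ in right], [y for _, y in right], max_depth - 1),
    }

def classify(tree, row):
    """Walk from the root to a leaf and return the majority class of that leaf."""
    while "counts" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return max(tree["counts"], key=tree["counts"].get)

# Toy data (made up): signal intensities at two hypothetical m/z values per sample.
rows = [[0.9, 0.2], [0.8, 0.1], [0.2, 0.3], [0.1, 0.2]]
labels = ["diseased", "diseased", "un-diseased", "un-diseased"]
tree = grow(rows, labels)
print(classify(tree, [0.85, 0.15]))
```

Each internal node tests a single feature against a threshold, so every split is binary; the recursion stops when a node is pure, when no split reduces impurity, or when the assumed depth limit is reached.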
-
17. The method of claim 1 further comprising:
c) interrogating the classification model to determine if one or more features discriminate between the different biological statuses.
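One way to picture the interrogation step of claim 17 is to list which features a fitted tree actually splits on. The sketch below assumes a tree stored as nested dictionaries with hypothetical keys `feature`, `threshold`, `left`, `right`, and `counts`; the tree and the m/z labels are invented for illustration and do not come from the patent.

```python
def split_features(tree, depth=0):
    """Return (feature_index, threshold, depth) for every internal node,
    visiting the root first, then the left subtree, then the right subtree."""
    if "counts" in tree:  # leaf node: no split to report
        return []
    here = [(tree["feature"], tree["threshold"], depth)]
    return (here
            + split_features(tree["left"], depth + 1)
            + split_features(tree["right"], depth + 1))

# Hypothetical fitted tree and hypothetical m/z labels for its two features.
mz_labels = [4475.0, 8141.0]
tree = {
    "feature": 1, "threshold": 2.5,
    "left": {"counts": {"un-diseased": 20}},
    "right": {
        "feature": 0, "threshold": 0.4,
        "left": {"counts": {"low grade cancer": 7}},
        "right": {"counts": {"high grade cancer": 9}},
    },
}

splits = split_features(tree)
for feature, threshold, depth in splits:
    print(f"depth {depth}: intensity at m/z {mz_labels[feature]} <= {threshold}")
```

Features that appear in splits, particularly near the root, are the ones this particular tree uses to discriminate between the classes; features that never appear were not needed by it.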
-
18. The method of claim 1 further comprising:
c) repeating a) and b) using a larger plurality of samples.
-
19. The method of claim 1 wherein the mass spectra are derived from a surface enhanced laser desorption/ionization process using a substrate comprising an affinity material, wherein the affinity material comprises antibodies.
-
20. A method for classifying an unknown sample into a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer, wherein the mass spectrum is derived from a surface enhanced laser desorption/ionization process using a substrate comprising an affinity material, wherein the affinity material comprises antibodies; and
b) processing the mass spectrum data using the classification model formed by the method of claim 1 to classify the unknown sample in a class characterized by a biological status.
-
-
21. The method of any of claims 1, 2, and 6-11 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio.
-
22. The method of any of claims 2, and 6-11 wherein the data set is formed by:
-
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
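The five steps recited above (detect, cluster, select, identify, form) can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: signals are taken to be local maxima above a threshold, clustering is a single pass over m/z-sorted peaks with a fixed tolerance, the identified m/z of a cluster is its mean, and a spectrum with no signal in a cluster is given intensity 0.0. All function and parameter names are hypothetical.

```python
def detect_signals(spectrum, min_intensity):
    """Return local maxima with intensity >= min_intensity.
    `spectrum` is a list of (mz, intensity) pairs sorted by mz."""
    peaks = []
    for i in range(1, len(spectrum) - 1):
        mz, inten = spectrum[i]
        if (inten >= min_intensity
                and inten > spectrum[i - 1][1]
                and inten >= spectrum[i + 1][1]):
            peaks.append((mz, inten))
    return peaks

def cluster_signals(all_peaks, tolerance):
    """Group (spectrum_id, mz, intensity) peaks: a peak joins the current
    cluster when its m/z is within `tolerance` of the previous peak's m/z."""
    clusters = []
    for peak in sorted(all_peaks, key=lambda p: p[1]):
        if clusters and peak[1] - clusters[-1][-1][1] <= tolerance:
            clusters[-1].append(peak)
        else:
            clusters.append([peak])
    return clusters

def form_data_set(spectra, min_intensity, tolerance, min_signals, select_intensity):
    """Return (identified m/z values, feature matrix with one row per spectrum)."""
    all_peaks = [(sid, mz, inten)
                 for sid, spec in enumerate(spectra)
                 for mz, inten in detect_signals(spec, min_intensity)]
    clusters = cluster_signals(all_peaks, tolerance)
    # Keep clusters with at least `min_signals` signals above `select_intensity`.
    selected = [c for c in clusters
                if sum(1 for _, _, inten in c if inten > select_intensity) >= min_signals]
    # Identify each selected cluster by the mean m/z of its signals.
    mz_values = [sum(mz for _, mz, _ in c) / len(c) for c in selected]
    data = []
    for sid in range(len(spectra)):
        row = []
        for c in selected:
            intens = [inten for s, _, inten in c if s == sid]
            row.append(max(intens) if intens else 0.0)
        data.append(row)
    return mz_values, data

# Two made-up spectra sharing a signal near m/z 1001; only one has a weak signal at 2001.
spectra = [
    [(1000.0, 0.0), (1001.0, 5.0), (1002.0, 0.0), (2000.0, 0.0), (2001.0, 1.5), (2002.0, 0.0)],
    [(1000.0, 0.0), (1001.5, 4.0), (1003.0, 0.0), (2000.0, 0.0), (2001.0, 0.0), (2002.0, 0.0)],
]
mz_values, data = form_data_set(spectra, min_intensity=1.0, tolerance=2.0,
                                min_signals=2, select_intensity=3.0)
print(mz_values, data)
```

The resulting matrix has one row per sample and one column per identified mass-to-charge ratio, which is the shape a recursive partitioning process would take as input.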
-
-
24. The method of claim 1 wherein the different classes are selected from exposure to a drug, exposure to one of a class of drugs and lack of exposure to a drug or one of a class of drugs.
-
25. The method of claim 1 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio or a value derived from mass-to-charge ratio.
-
26. A method for classifying an unknown sample into a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer; and
b) processing the mass spectrum data using the classification model formed by the method of claim 1 to classify the unknown sample in a class characterized by a biological status.
-
-
27. The method of claim 26 wherein the different biological statuses comprise un-diseased, low grade cancer and high grade cancer.
-
28. The method of claim 26 wherein the class is characterized by exposure to a drug or one of a class of drugs.
-
29. The method of claim 26 wherein the class is characterized by response to a drug.
-
30. The method of claim 26 wherein the class is characterized by a toxicity status.
-
31. A method for estimating the likelihood that an unknown sample is accurately classified as belonging to a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer; and
b) processing the mass spectrum data using the classification model formed by the method of claim 1 to estimate the likelihood that the unknown sample is accurately classified into a class characterized by a biological status.
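A common way to obtain such a likelihood estimate from a classification tree is to report the class proportions among the training samples that reached the same terminal node as the unknown sample. The sketch below assumes that approach; the claim does not say the estimate must be computed this way. The nested-dictionary tree, its keys, and the counts are hypothetical.

```python
def leaf_counts(tree, row):
    """Follow the splits for `row` and return the training class counts of its leaf."""
    while "counts" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["counts"]

def class_likelihoods(tree, row):
    """Estimate class likelihoods as the class proportions within the leaf."""
    counts = leaf_counts(tree, row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Hypothetical tree with one split on feature 0; leaf counts are made up.
tree = {
    "feature": 0, "threshold": 0.5,
    "left": {"counts": {"un-diseased": 18, "diseased": 2}},
    "right": {"counts": {"un-diseased": 1, "diseased": 9}},
}

likelihoods = class_likelihoods(tree, [0.7])
print(likelihoods)
```

With these made-up counts, a sample falling in the right-hand leaf would be assigned to the "diseased" class with an estimated likelihood of 9 in 10; estimates from leaves holding few training samples are correspondingly less reliable.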
-
-
32. A computer readable medium comprising:
-
a) code for entering data obtained from a mass spectrum of an unknown sample into a digital computer; and
b) code for processing the mass spectrum data using the classification model formed by the method of claim 1 to classify the unknown sample in a class characterized by a biological status.
-
-
33. A system comprising:
-
a gas phase ion spectrometer;
a digital computer adapted to process data from the gas phase ion spectrometer; and
the computer readable medium of claim 32 in operative association with the digital computer.
-
-
34. The system of claim 33 wherein the gas phase ion spectrometer is adapted to perform a laser desorption ionization process.
-
35. A computer readable medium comprising:
-
a) code for entering data obtained from a mass spectrum of an unknown sample into a digital computer; and
b) code for processing the mass spectrum data using the classification model formed by the method of claim 1 to estimate the likelihood that the unknown sample is accurately classified into a class characterized by a biological status.
-
-
36. A system comprising:
-
a gas phase ion spectrometer;
a digital computer adapted to process data from the gas phase ion spectrometer; and
the computer readable medium of claim 35 in operative association with the digital computer.
-
-
37. The system of claim 36 wherein the gas phase ion spectrometer is adapted to perform a laser desorption ionization process.
-
23. A method that analyzes mass spectra using a digital computer, the method comprising:
-
a) entering into the digital computer a data set obtained from mass spectra from a plurality of samples, wherein each sample is, or is to be assigned to a class within a class set comprising two or more classes, each class characterized by a different biological status, and wherein each mass spectrum comprises data representing signal strength as a function of time-of-flight, mass-to-charge ratio, or a value derived from time-of-flight or mass-to-charge ratio; and
b) forming a classification model which discriminates between the classes in the class set, wherein forming comprises analyzing the data set by executing code that embodies a classification process comprising a recursive partitioning process, and wherein the method further comprises forming the data set, wherein forming the data set comprises obtaining raw data from the mass spectra and then preprocessing the raw mass spectra data to form the data set.
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
-
-
51. The method of claim 23 wherein forming the classification model comprises at least one of identifying features that discriminate between the different biological statuses, and learning.
-
52. The method of claim 23 wherein the classification process is a binary recursive partitioning process.
-
53. The method of claim 23 further comprising:
c) interrogating the classification model to determine if one or more features discriminate between the different biological statuses.
-
54. The method of claim 23 further comprising:
c) repeating a) and b) using a larger plurality of samples.
-
55. The method of claim 23 wherein the different classes are selected from exposure to a drug, exposure to one of a class of drugs and lack of exposure to a drug or one of a class of drugs.
-
56. The method of claim 23 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio or a value derived from mass-to-charge ratio.
-
57. A method for classifying an unknown sample into a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer; and
b) processing the mass spectrum data using the classification model formed by the method of claim 23 to classify the unknown sample in a class characterized by a biological status.
-
-
58. The method of claim 57 wherein the class is characterized by a disease status.
-
59. The method of claim 57 wherein the different biological statuses comprise un-diseased, low grade cancer and high grade cancer.
-
60. The method of claim 57 wherein the class is characterized by exposure to a drug or one of a class of drugs.
-
61. The method of claim 57 wherein the class is characterized by response to a drug.
-
62. The method of claim 57 wherein the class is characterized by a toxicity status.
-
63. A method for estimating the likelihood that an unknown sample is accurately classified as belonging to a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer; and
b) processing the mass spectrum data using the classification model formed by the method of claim 23 to estimate the likelihood that the unknown sample is accurately classified into a class characterized by a biological status.
-
-
64. A computer readable medium comprising:
-
a) code for entering data obtained from a mass spectrum of an unknown sample into a digital computer; and
b) code for processing the mass spectrum data using the classification model formed by the method of claim 23 to classify the unknown sample in a class characterized by a biological status.
-
-
65. A system comprising:
-
a gas phase ion spectrometer;
a digital computer adapted to process data from the gas phase ion spectrometer; and
the computer readable medium of claim 64 in operative association with the digital computer.
-
-
66. The system of claim 65 wherein the gas phase ion spectrometer is adapted to perform a laser desorption ionization process.
-
67. A computer readable medium comprising:
-
a) code for entering data obtained from a mass spectrum of an unknown sample into a digital computer; and
b) code for processing the mass spectrum data using the classification model formed by the method of claim 23 to estimate the likelihood that the unknown sample is accurately classified into a class characterized by a biological status.
-
-
68. The method of claim 23 wherein the mass spectra are derived from a surface enhanced laser desorption/ionization process using a substrate comprising an affinity material, wherein the affinity material comprises antibodies.
-
69. A method for classifying an unknown sample into a class characterized by a biological status using a digital computer, the method comprising:
-
a) entering data obtained from a mass spectrum of the unknown sample into a digital computer, wherein the mass spectrum is derived from a surface enhanced laser desorption/ionization process using a substrate comprising an affinity material, wherein the affinity material comprises antibodies; and
b) processing the mass spectrum data using the classification model formed by the method of claim 23 to classify the unknown sample in a class characterized by a biological status.
-
-
70. The method of any of claims 23, 38, and 42-47 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio.
-
71. The method of any of claims 38, and 42-47 wherein the data set is formed by:
-
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
-
-
72. A system comprising:
-
a gas phase ion spectrometer;
a digital computer adapted to process data from the gas phase ion spectrometer; and
a computer readable medium in operative association with the digital computer, wherein the computer readable medium comprises a) code for entering a data set derived from mass spectra from a plurality of samples, wherein each sample is, or is to be assigned to a class within a class set of two or more classes, each class characterized by a different biological status, and wherein each mass spectrum comprises data representing signal strength as a function of time-of-flight, mass-to-charge ratio or a value derived from mass-to-charge ratio or time-of-flight, and b) code for forming a classification model using a classification process, the classification process comprising a recursive partitioning process, wherein the classification model discriminates between the classes in the class set.
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
-
-
88. The system of claim 72 wherein the code for forming the classification model comprises code for at least one of identifying features that discriminate between the different biological statuses, and learning.
-
89. The system of claim 72 wherein the classification process is a binary recursive partitioning process.
-
90. The system of claim 72 wherein the computer readable medium further comprises:
code for interrogating the classification model to determine if one or more features discriminate between the different biological statuses.
-
91. The system of claim 72 wherein the computer readable medium further comprises:
code for repeating entering data and for forming the classification model using a larger plurality of samples.
-
92. The system of claim 72 wherein the different classes are selected from exposure to a drug, exposure to one of a class of drugs and lack of exposure to a drug or one of a class of drugs.
-
93. The system of claim 72 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio or a value derived from mass-to-charge ratio.
-
94. The system of claim 72 wherein the computer readable medium further comprises:
-
a) code for entering data obtained from a mass spectrum of an unknown sample into the digital computer; and
b) code for processing the mass spectrum data using the classification model to classify the unknown sample in a class characterized by a biological status.
-
-
95. The system of claim 72 wherein the class is characterized by a disease status.
-
96. The system of claim 72 wherein the different biological statuses comprise un-diseased, low grade cancer and high grade cancer.
-
97. The system of claim 72 wherein the class is characterized by exposure to a drug or one of a class of drugs.
-
98. The system of claim 72 wherein the class is characterized by response to a drug.
-
99. The system of claim 72 wherein the class is characterized by a toxicity status.
-
100. The system of claim 72 wherein the system is adapted to estimate the likelihood that an unknown sample is accurately classified as belonging to a class characterized by a biological status, and wherein the computer readable medium further comprises:
-
a) code for entering data obtained from a mass spectrum of the unknown sample into a digital computer; and
b) code for processing the mass spectrum data using the classification model to estimate the likelihood that the unknown sample is accurately classified into a class characterized by a biological status.
-
-
101. The system of any of claims 72, 75, and 79-84 wherein each mass spectrum comprises data representing signal strength as a function of mass-to-charge ratio.
-
102. The system of any of claims 75, and 79-84 wherein the data set is formed by:
-
detecting signals in the mass spectra, each mass spectrum comprising data representing signal strength as a function of mass-to-charge ratio;
clustering the signals having similar mass-to-charge ratios into signal clusters;
selecting signal clusters having at least a predetermined number of signals with signal intensities above a predetermined value;
identifying the mass-to-charge ratios corresponding to the selected signal clusters; and
forming the data set using signal intensities at the identified mass-to-charge ratios.
-
Specification