Model for spectral and chromatographic data
Abstract
A method and apparatus using a spectral analysis technique are disclosed. In one form of the invention, probabilities are selected to characterize the presence (and in another form, also a quantification of a characteristic) of peaks in an indexed data set for samples that match a reference species, and other probabilities are selected for samples that do not match the reference species. An indexed data set is acquired for a sample, and a determination is made according to techniques exemplified herein as to whether the sample matches or does not match the reference species. When quantification of peak characteristics is undertaken, the model is appropriately expanded, and the analysis accounts for the characteristic model and data. Further techniques are provided to apply the methods and apparatuses to process control, cluster analysis, hypothesis testing, analysis of variance, and other procedures involving multiple comparisons of indexed data.
29 Claims

1. A method of determining whether a sample matches a reference species, the method comprising:
selecting N indices l1, l2, . . . lN of peaks in an indexed data set characterizing the reference species;
selecting a first set of probabilities p1, p2, . . . pN that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample matches the reference species;
selecting a second set of probabilities q1, q2, . . . qN that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample does not match the reference species;
choosing a threshold Kc;
obtaining an indexed observation data set x1, x2, . . . xN, where xj ∈ {0, 1} and xj=1 if and only if a peak is present in the sample at lj;
deciding that the sample matches the reference species if λ ≦ Kc, where [equation defining λ not reproduced]; and
deciding that the sample does not match the reference species if λ > Kc.
Dependent claims: 2, 3, 4, 5, 6
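
The decision rule of claim 1 can be illustrated with a minimal Python sketch. The equation defining λ is not reproduced in this text, so the form used below is an assumption: λ is taken to be the log of the ratio of the non-match likelihood to the match likelihood, with peaks treated as independent Bernoulli events. Claim 26 does call λ a log-likelihood ratio, and the "match if λ ≦ Kc" direction is consistent with this choice, but the exact expression in the patent may differ. The function names and the numeric probabilities are hypothetical.

```python
import math


def log_likelihood_ratio(x, p, q):
    """Assumed form of lambda: log P(x | no match) - log P(x | match).

    x: list of 0/1 peak indicators at the selected indices
    p: peak probabilities when the sample matches the reference species
    q: peak probabilities when the sample does not match
    Peaks are assumed independent (an assumption of this sketch).
    """
    lam = 0.0
    for xj, pj, qj in zip(x, p, q):
        if xj == 1:
            lam += math.log(qj / pj)
        else:
            lam += math.log((1.0 - qj) / (1.0 - pj))
    return lam


def matches_reference(x, p, q, k_c):
    """Decide 'match' when lambda <= Kc, as stated in the claim."""
    return log_likelihood_ratio(x, p, q) <= k_c


# Hypothetical probabilities for three selected peak indices.
p = [0.9, 0.8, 0.95]
q = [0.2, 0.3, 0.1]

print(matches_reference([1, 1, 1], p, q, 0.0))  # all expected peaks present
print(matches_reference([0, 0, 0], p, q, 0.0))  # no expected peaks present
```

Under this assumed sign convention, a sample showing the peaks that the reference species usually produces drives λ negative, so a threshold near zero separates the two cases in the toy data.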

7. A method of determining whether a sample matches a reference species, the method comprising:
selecting N indices l1, l2, . . . lN of peaks in an indexed data set characterizing the reference species;
selecting a first set of probabilities p1, p2, . . . pN that peaks will occur at indices l1, l2, . . . lN of an indexed data set that characterizes the sample when the sample matches the reference species;
selecting a first set of probability density functions gi(yi; θi) that characterize a measurable feature yi of the peak at index li given the presence of a peak at index li of a data set that characterizes the sample when the sample matches the reference species;
selecting a second set of probabilities q1, q2, . . . qN that peaks will occur at indices l1, l2, . . . lN of an indexed data set that characterizes the sample when the sample does not match the reference species;
selecting a second set of probability density functions gi(yi; Ωi) that characterize the measurable feature yi of the peak at index li given the presence of a peak at index li of a data set that characterizes the sample when the sample does not match the reference species;
selecting a threshold Kc;
obtaining an indexed observation data set x1, x2, . . . xN where xi ∈ {0, 1} and xi=1 if and only if a peak is present in the sample at li;
obtaining a feature data set y1, y2, . . . yN; and
deciding that the sample matches the reference species if λ ≦ Kc, where [equation defining λ not reproduced]; and
deciding that the sample does not match the reference species if λ > Kc.
Dependent claims: 8, 9, 10, 11, 12, 13
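
Claim 7 adds a measured feature yi (for example, a peak height) to the presence/absence model. The sketch below shows one way such a statistic could be computed. Two things are assumptions here, not taken from the claim: the form of λ (log of non-match likelihood over match likelihood, independent peaks), and the choice of a normal density for gi, with θi and Ωi each represented as a hypothetical (mean, standard deviation) pair. The claim itself leaves the density family open.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log of the normal density; the normal family is an assumption."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (y - mean) ** 2 / (2.0 * sd * sd)


def feature_llr(x, y, p, q, theta, omega):
    """Assumed lambda combining peak presence and a peak feature.

    theta[i] and omega[i] are (mean, sd) pairs for the match and
    non-match feature densities. The feature contributes only when a
    peak is present (x[i] == 1).
    """
    lam = 0.0
    for xi, yi, pi, qi, th, om in zip(x, y, p, q, theta, omega):
        if xi == 1:
            lam += math.log(qi / pi)
            lam += normal_logpdf(yi, *om) - normal_logpdf(yi, *th)
        else:
            lam += math.log((1.0 - qi) / (1.0 - pi))
    return lam


def matches_reference(x, y, p, q, theta, omega, k_c):
    """Decide 'match' when lambda <= Kc, as stated in the claim."""
    return feature_llr(x, y, p, q, theta, omega) <= k_c


# Hypothetical single-peak example: presence is uninformative (p == q),
# but the feature value sits on the match-model mean.
print(feature_llr([1], [5.0], [0.5], [0.5], [(5.0, 1.0)], [(0.0, 1.0)]))
```

When the two feature densities are identical, the feature terms cancel and the statistic reduces to the presence-only form, which is a quick sanity check on any implementation of this kind.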

14. A method, wherein the status of a process at any point t in time is characterized by an indexed observation data set Xt={x1,t, x2,t, . . . xN,t}, where xj,t ∈ {0, 1} and xj,t=1 if and only if a peak is present at time t in the sample at index lj, the method comprising:
selecting a first set of probabilities p1, p2, . . . pN that peaks will occur at x1,t, x2,t, . . . xN,t, respectively, when the process is operating normally;
selecting a second set of probabilities q1, q2, . . . qN that peaks will occur at x1,t, x2,t, . . . xN,t, respectively, when the process is not operating normally;
acquiring a sequence X1, X2, . . . XT of indexed observation data sets;
intervening in the process when it is determined that Cn equals or exceeds a predetermined value A, where [equation defining Cn not reproduced].
Dependent claims: 15, 16
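
The process-monitoring step of claim 14 can be sketched as a sequential test. The definition of Cn is not reproduced in this text, so the code below substitutes a standard one-sided CUSUM recursion, Cn = max(0, Cn-1 + sn), where sn is an assumed per-observation log-likelihood ratio of "not operating normally" to "operating normally". This is a common scheme for this kind of rule, not necessarily the patent's formula. All names and numbers are hypothetical.

```python
import math


def step_llr(x_t, p, q):
    """Assumed per-time-step statistic: log P(X_t | abnormal) - log P(X_t | normal)."""
    return sum(
        math.log(qj / pj) if xj == 1 else math.log((1.0 - qj) / (1.0 - pj))
        for xj, pj, qj in zip(x_t, p, q)
    )


def first_alarm(sequence, p, q, a):
    """Return the 1-based time index n at which C_n first reaches A, else None.

    Uses the CUSUM recursion C_n = max(0, C_{n-1} + s_n), which is an
    assumption standing in for the claim's unreproduced definition of C_n.
    """
    c = 0.0
    for n, x_t in enumerate(sequence, start=1):
        c = max(0.0, c + step_llr(x_t, p, q))
        if c >= a:
            return n
    return None


# Hypothetical two-peak process: both peaks are usually present when normal.
p = [0.9, 0.9]
q = [0.1, 0.1]
observations = [[1, 1], [1, 1], [0, 0], [0, 0]]
print(first_alarm(observations, p, q, 5.0))
```

Resetting the running sum at zero keeps long stretches of normal operation from masking a later shift, which is the usual reason for choosing a CUSUM over a plain cumulative sum.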

17. A method, wherein the status of a process at any point t in time is characterized by
an indexed observation data set Xt={x1,t, x2,t, . . . xN,t}, where xj,t ∈ {0, 1} and xj,t=1 if and only if a peak is present at time t in the sample at index lj, and
a feature data set Yt={y1,t, y2,t, . . . yN,t}, where if xj,t=0, yj,t=0, and if xj,t=1, yj,t quantifies a feature of the peak at time t in the sample at index lj, the method comprising:
selecting a first set of probabilities p1, p2, . . . pN that peaks will occur at x1,t, x2,t, . . . xN,t, respectively, when the process is operating normally;
selecting a first set of probability density functions gi(yi; θi) that characterize a measurable feature yi of the peak at index li given the presence of a peak at index li of a data set that characterizes the process when it is operating normally;
selecting a second set of probabilities q1, q2, . . . qN that peaks will occur at x1,t, x2,t, . . . xN,t, respectively, when the process is not operating normally;
selecting a second set of probability density functions gi(yi; Ωi) that characterize the measurable feature yi of the peak at index li given the presence of a peak at index li of a data set that characterizes the process when it is operating normally;
acquiring a sequence X1, X2, . . . XT of indexed observation data sets;
acquiring a sequence Y1, Y2, . . . YT of feature data sets;
intervening in the process when it is determined that Cn equals or exceeds a predetermined value A, where [equation defining Cn not reproduced].
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25

26. A system for analyzing a sample in comparison with a reference species, comprising:
a processor;
a memory storing data indicative of:
probabilities p1, p2, . . . pN that peaks will occur at indices l1, l2, . . . lN of an indexed data set that characterizes the sample when the sample matches the reference species;
probabilities q1, q2, . . . qN that peaks will occur at indices l1, l2, . . . lN of an indexed data set that characterizes the sample when the sample does not match the reference species;
a threshold value; and
an indexed sample data set x1, x2, . . . xN characterizing the sample, wherein each xi is a binary value that indicates whether or not a peak is present at index li; and
a computer-readable medium encoded with programming instructions executable by said processor to:
calculate a log-likelihood ratio λ, where [equation defining λ not reproduced];
generate a first signal when λ is less than said threshold value; and
generate a second signal when λ is greater than said threshold value.

27. A method of performing discriminant analysis, the method comprising:
selecting N indices l1, l2, . . . lN of peaks in an indexed data set characterizing a first reference species or a second reference species;
selecting a first set of probabilities p1,1, p2,1, . . . pN,1 that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample matches the first reference species;
selecting a second set of probabilities p1,2, p2,2, . . . pN,2 that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample matches the second reference species;
selecting a third set of probabilities q1,1, q2,1, . . . qN,1 that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample matches a second reference species;
selecting a fourth set of probabilities q1,2, q2,2, . . . qN,2 that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample matches a second reference species;
obtaining an indexed observation data set x1, x2, . . . xN, where xj ∈ {0, 1} and xj=1 if and only if a peak is present in the sample at lj;
calculating [equations defining λ1 and λ2 not reproduced];
deciding that the sample matches the first reference species if λ1 ≦ λ2; and
the sample matches the second reference species if λ1 > λ2.
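
The two-species comparison in claim 27 can be illustrated as follows. The equations for λ1 and λ2 are not reproduced in this text, so the sketch assumes each λk is a log-likelihood ratio of the same kind used for single-species matching: the log of the likelihood under the qk probabilities over the likelihood under the pk probabilities, with independent peaks. Only the final comparison (first species if λ1 ≦ λ2, otherwise second) comes from the claim. The function names and probabilities are hypothetical.

```python
import math


def llr(x, p, q):
    """Assumed lambda_k: log-likelihood under q minus log-likelihood under p."""
    lam = 0.0
    for xj, pj, qj in zip(x, p, q):
        if xj == 1:
            lam += math.log(qj / pj)
        else:
            lam += math.log((1.0 - qj) / (1.0 - pj))
    return lam


def classify(x, p1, q1, p2, q2):
    """Return 1 if lambda_1 <= lambda_2, else 2, following the claim's rule."""
    lam1 = llr(x, p1, q1)
    lam2 = llr(x, p2, q2)
    return 1 if lam1 <= lam2 else 2


# Hypothetical species with opposite peak patterns at two indices.
p1, q1 = [0.9, 0.1], [0.5, 0.5]
p2, q2 = [0.1, 0.9], [0.5, 0.5]

print(classify([1, 0], p1, q1, p2, q2))  # pattern typical of species 1
print(classify([0, 1], p1, q1, p2, q2))  # pattern typical of species 2
```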

28. A method of performing a cluster analysis of M samples, comprising:
selecting N indices l1, l2, . . . lN of possible peak locations in indexed data sets characterizing the M samples;
obtaining indexed data sets Xi={x1,i, x2,i, . . . xN,i}; i=1, 2, . . . M, each data set corresponding to a different sample, wherein xj,i ∈ {0, 1} and xj,i=1 if and only if a peak exists in the data set for sample i at index lj; and
defining P groups of samples by selecting a first array of probabilities pk,i; k=1, 2, . . . P; i=1, 2, . . . N that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes a sample when the sample is in group k;
selecting a second array of probabilities qk,i; k=1, 2, . . . P; i=1, 2, . . . N that peaks will occur at indices l1, l2, . . . lN, respectively, of an indexed data set that characterizes the sample when the sample is not in group k; and
selecting gj ∈ {1, 2, . . . P}; j=1, 2, . . . M, where sample j is in group gj;
wherein pk,i, qk,i, and gj are selected to maximize [expression not reproduced].
Dependent claims: 29

Specification