Bioinformatic Approach to Disease Diagnosis

US 20080064118A1
Filed: 09/08/2007
Published: 03/13/2008
Est. Priority Date: 09/08/2006
Status: Active Grant

First Claim

1. A method for constructing a multivariate predictive model for diagnosing a disease for which a plurality of test methods are individually inadequate, said method comprising:

(a) performing a panel of laboratory tests for diagnosing said disease on a test population comprising a statistically significant sample of individuals with at least one objective sign of disease and a statistically significant control sample of healthy individuals or persons with cross-reacting medical conditions;

(b) generating a score function from a linear combination of said test panel results, said linear combination expressed as β^T Y, wherein D is the disease; Y₁, . . . , Y_k is a set of K diagnostic tests for D; Y is a vector of diagnostic test results {Y₁, . . . , Y_k}; D′ = not D; β is a vector of coefficients {β₁, . . . , β_k} for Y; and β^T is the transpose of β;

(c) performing a receiver operating characteristic (ROC) regression or alternative regression technique of the score function, wherein the test panel is selected and β coefficients are calculated simultaneously to maximize the area under the curve (AUC) of the empiric ROC as approximated by:

$AUC(β) = \frac{1}{n^{D} \cdot n^{H}} \sum_{i \in D,\, j \in H} I(β^{T} Y_{i} > β^{T} Y_{j}),$

wherein I is a sigmoid function, N = the number of study subjects, n^D is the number of patients with disease D, n^H is the number of healthy controls, n^D + n^H = N; i = 1, . . . , n^D, i ∈ D are patients with disease; j = 1, . . . , n^H, j ∈ H are healthy controls;

(d) calculating for each individual the pre-test odds of disease; generating a diagnostic likelihood ratio of disease by determining the frequency of each individual's test score in said diseased population relative to said control population; and multiplying said pretest odds by said likelihood ratio to determine the post-test odds of disease for each individual;

(e) converting a set of posttest odds into posttest probabilities for each methodology and creating an ROC curve for each methodology by altering its respective post-test probability cutoff value;

(f) comparing the ROC areas generated by one or more regression techniques to determine an optimal methodology, comprising the tests to be included in an optimum test panel and the weight to be assigned each test score alone or in combination;

(g) dichotomizing the optimal methodology by finding that point on the final ROC graph tangent to a line with a slope of (1−p)·C/p·B, where p is the population prevalence of disease, B is the regret associated with failing to treat patients with disease and C is the regret associated with treating a patient without disease; thereby generating a posttest probability cutoff value; and

(h) displaying the optimum test panel for disease diagnosis, the weight each individual test score is to be assigned alone or in combination, and the cutoff value against which positive or negative diagnoses are to be made.
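As an illustration of steps (b) and (c) only, the following minimal Python sketch evaluates the empiric AUC of a linear score β^T Y by pairwise comparison, together with a sigmoid-smoothed version of the same average. It is not the patented procedure: the function names, the smoothing width `sigma`, and the sample values are assumptions introduced here, and no optimization over β or test selection is attempted.

```python
import math


def score(beta, y):
    """Linear score beta^T y for one subject's vector of test results."""
    return sum(b * v for b, v in zip(beta, y))


def sigmoid(x):
    """Numerically stable S(x) = 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def empirical_auc(beta, diseased, healthy):
    """Share of (diseased, healthy) pairs where the diseased score is higher."""
    wins = sum(
        1
        for y_i in diseased
        for y_j in healthy
        if score(beta, y_i) > score(beta, y_j)
    )
    return wins / (len(diseased) * len(healthy))


def smoothed_auc(beta, diseased, healthy, sigma=0.1):
    """Same pairwise average, with the comparison replaced by S((s_i - s_j) / sigma)."""
    total = sum(
        sigmoid((score(beta, y_i) - score(beta, y_j)) / sigma)
        for y_i in diseased
        for y_j in healthy
    )
    return total / (len(diseased) * len(healthy))


# Hypothetical results for two tests (illustrative numbers only).
diseased = [(2.0, 1.0), (1.5, 0.5), (0.4, 0.9)]
healthy = [(0.5, 0.2), (1.0, 0.1), (0.3, 0.8)]

auc_first_test_only = empirical_auc((1.0, 0.0), diseased, healthy)  # 7 of 9 pairs
auc_equal_weights = empirical_auc((1.0, 1.0), diseased, healthy)    # 9 of 9 pairs
```

In this toy data, weighting both tests equally separates the groups better than using the first test alone, which is the kind of comparison the AUC criterion is meant to capture. The smoothed version is differentiable in β, which is one reason a sigmoid approximation is convenient for numerical maximization.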

Abstract

Methods for constructing multivariate predictive models for diagnosing diseases for which test methods are individually inadequate, including:

(a) performing laboratory tests on a statistically significant test population of individuals;

(b) generating a score function from a linear combination of test panel results;

(c) performing a receiver operating characteristic (ROC) regression or alternative regression technique of the score function using those tests and β coefficients calculated simultaneously to maximize the area under the curve (AUC) of the function and chosen simultaneously to generate the largest area below that portion of the ROC curve for the (1−t₀) quantile of individuals without disease, where t₀ represents the maximum acceptable false-positive rate;

(d) calculating individual pre-test disease odds; generating a diagnostic likelihood ratio of disease by determining the frequency of each individual's test score in the diseased population relative to the control population; and multiplying pre-test odds by the likelihood ratio to determine individual post-test disease odds;

(e) converting a set of posttest odds into posttest probabilities for each potential multivariate methodology and creating an ROC curve for each methodology by altering posttest probability cutoff values;

(f) comparing partial ROC areas generated by one or more regression techniques to determine the optimal methodology; and

(g) dichotomizing the optimal methodology by finding that point on the ROC curve tangent to a line with slope (1−p)·C/p·B, where p is population prevalence of disease, B is regret associated with failing to treat patients with disease and C is regret associated with treating a patient without disease.
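The arithmetic in steps (d), (e) and (g) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the function names and sample numbers are hypothetical, the slope is read as ((1−p)/p)·(C/B), and the tangent point is located by maximizing TPR − slope·FPR over a list of ROC points, which coincides with tangency when the ROC curve is concave.

```python
def odds_from_probability(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def probability_from_odds(odds):
    """Convert odds into a probability, odds / (1 + odds)."""
    return odds / (1.0 + odds)


def posttest_probability(pretest_probability, likelihood_ratio):
    """Post-test odds = pre-test odds * likelihood ratio, then back to a probability."""
    posttest_odds = odds_from_probability(pretest_probability) * likelihood_ratio
    return probability_from_odds(posttest_odds)


def tangent_slope(prevalence, regret_missed_disease, regret_overtreatment):
    """Slope ((1 - p) / p) * (C / B); this grouping of the terms is an assumption."""
    return (1.0 - prevalence) * regret_overtreatment / (prevalence * regret_missed_disease)


def best_operating_point(roc_points, slope):
    """Pick the (FPR, TPR) point that maximizes TPR - slope * FPR."""
    return max(roc_points, key=lambda point: point[1] - slope * point[0])


# Hypothetical numbers: 20% pre-test probability and a likelihood ratio of 8.
example_posttest = posttest_probability(0.2, 8.0)  # odds 0.25 * 8 = 2, i.e. 2/3

# Hypothetical ROC points as (false-positive rate, true-positive rate).
roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (1.0, 1.0)]
lenient_point = best_operating_point(roc, 1.0)
strict_point = best_operating_point(roc, 3.0)
```

A steeper slope (low prevalence, or a high regret for treating patients without disease) pushes the chosen point toward lower false-positive rates; in the sample above the choice moves from (0.3, 0.85) to (0.1, 0.6) as the slope rises from 1 to 3.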

25 Claims

  - 2. The method of claim 1, wherein t₀ is the maximum false-positive rate desired by a physician interpreting the tests and is a multiple of 1/n^H, and the β coefficients and test panel are chosen simultaneously through partial ROC regression, so that the largest area below the partial ROC curve for the (1−t₀) quantile of individuals without D is generated, wherein β^T Y_j > c and the survival function of patients without disease with a score of c, S_H(c), is equal to t₀.
  - 3. The method of claims 1 and 2, wherein the ROC curve is smoothed using the sigmoid function S(x) = 1/[1+exp(−x)], wherein bias is decreased in estimating x values close to zero by introducing a series of positive numbers σ_n, satisfying the condition that σ_n approaches zero as n approaches infinity, such that S_n(x) = S(x/σ_n), wherein the optimal β is determined using the sigmoid approximation as the sigmoid maximum rank correlation estimator:

    $β(optimal) = \arg\max \left\{ R_{n}(β) = \frac{1}{n^{D} \cdot n^{H}} \sum_{i \in D,\, j \in H} S_{n}[β^{T}(Y_{i} - Y_{j})] \right\},$

    wherein a LASSO tuning parameter, L₁ constraint ≦ u, is determined using a V-fold cross validation technique.
  - 4. The method of claim 1, wherein the optimized score function β^T Y generates a score, c_i, for each patient i with D, and c_j, for each control patient j, wherein the likelihood ratios for scores c_i and c_j, P(c_i/D)/P(c_i/D′) and P(c_j/D)/P(c_j/D′), respectively, are monotone increasing in patients with D.
  - 5. The method of claims 1, 2 and 4, wherein when there is insufficient data to determine the pretest risk that a patient has a disease D and a laboratory reports the likelihood ratio and cutoff value for that patient's test results directly to the physician, the cutoff value for the likelihood ratio is determined by observing the likelihood ratio resulting in 99% specificity in a control population of patients, and likelihood ratios that exceed the cutoff value thus derived indicate that there is a high probability of disease D.
  - 6. The method of claim 1, wherein said disease is Lyme Disease (LD).
  - 7. The method of claim 1, wherein the pretest risk of D is calculated using an individual's clinical signs and symptoms.
  - 8. The method of claim 1, wherein the pretest risk of D is calculated using the distribution of disease manifestation in the population from which said individuals are selected.
  - 9. The method of claim 1, wherein the posterior odds of D are calculated by multiplying the pretest odds of D by the likelihood ratio associated with the score generated by the patient's test results; and where the posterior odds of D are converted into the posttest probability of D by calculating odds/[1+odds].
  - 10. The method of claim 1, wherein an ROC regression approximation is performed in step (c) and is selected from logistic regression, log-likelihood regression, linear regression, or discriminant techniques.
  - 11. The method of claim 1, further comprising the step of substituting at least a portion of said optimal methodology in another multivariate regression technique using less optimum methodology.
  - 12. The method of claim 11, wherein said multivariate regression technique is selected from logistic regression, log-likelihood regression, linear regression, or discriminant techniques.
  - 13. The method of claim 1, wherein said disease is selected from a connective tissue disease diagnosed by a plurality of tests, Rocky Mountain Spotted Fever, Babesia microti or Anaplasma granulocytophilia.
  - 14. The method of claim 1, wherein the disease is Lupus erythematosus and the ARA diagnostic criteria for Lupus erythematosus are used to determine the pretest probability of disease.
  - 15. A diagnostic test panel for diagnosing a disease for which a plurality of test methods are individually inadequate comprising laboratory tests selected by the method of claim 1.
  - 16. The diagnostic test panel of claim 15, wherein said disease is Lyme Disease and said test panel includes a plurality of tests, one or more of which are selected from the group consisting of a test for the V1sE1 IgG antibody, a test for the C6 IgG antibody, a test for the pepC10 IgM antibody and a test for the BmpA peptide.
  - 17. A kit comprising a diagnostic test panel for diagnosing a disease for which a plurality of test methods are individually inadequate and software comprising code embodying a computer-based method with which weighted results from said test panel are scored against a cutoff value to provide a diagnosis, wherein the laboratory tests of said test panel, the weight to be assigned to each test or combination thereof and cutoff value above which individuals tested have said disease are determined by the method of claim 1.
  - 18. The kit of claim 17, wherein said disease is Lyme Disease and said test panel comprises a plurality of tests, one or more of which are selected from the group consisting of a test for the V1sE1 IgG antibody, a test for the C6 IgG antibody, a test for the pepC10 IgM antibody and a test for the BmpA.
  - 19. A computer based method for diagnosing a disease for which a plurality of test methods are individually inadequate, said method comprising combining weighted scores from a panel of laboratory test results, comparing the combined weighted results to a cutoff value and displaying a diagnosis based on said comparison to said cutoff value, wherein said laboratory tests, the weighting assigned thereto and cutoff value above which individuals tested have said disease are determined by the method of claim 1.
  - 20. The computer-based method of claim 19, wherein said disease is Lyme Disease and said test panel comprises a plurality of tests, one or more of which are selected from the group consisting of a test for the V1sE1 IgG antibody, a test for the C6 IgG antibody, a test for the pepC10 IgM antibody and a test for the BmpA peptide.
  - 22. The method of claim 1 wherein the test is a Western blot and the disease is Lyme disease.
  - 23. A method of diagnosing a disease for which a plurality of tests are individually inadequate, comprising testing a biological sample from a patient suspected of having said disease and using the kit of claim 17 and reporting to said patient's physician the scored test results and cutoff value.
  - 24. The method of claim 23, wherein said disease is Lyme Disease.
  - 25. The method of claim 24, wherein the test panel of said kit comprises a plurality of tests, one or more of which are selected from the group consisting of a test for the V1sE1 IgG antibody, a test for the C6 IgG antibody, a test for the pepC10 IgM antibody and a test for the BmpA peptide.
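Claim 2 restricts attention to the part of the ROC curve with false-positive rate at most t₀. A minimal Python sketch of one common empirical estimate of that partial area follows. It assumes untied scores and a t₀ that is a multiple of 1/n^H, as the claim states; the function name and sample scores are hypothetical, and the sketch only evaluates the area for given scores rather than fitting β.

```python
def partial_auc(diseased_scores, healthy_scores, t0):
    """Empirical area under the ROC curve for false-positive rates in [0, t0].

    Only the healthy controls whose scores fall above the (1 - t0) quantile,
    i.e. the top t0 * n_H controls, enter the pairwise comparison.
    """
    n_d = len(diseased_scores)
    n_h = len(healthy_scores)
    m = round(t0 * n_h)  # number of healthy controls allowed above the threshold c
    top_healthy = sorted(healthy_scores, reverse=True)[:m]
    wins = sum(1 for s_i in diseased_scores for s_j in top_healthy if s_i > s_j)
    return wins / (n_d * n_h)


# Hypothetical scores (illustrative numbers only).
diseased_scores = [0.9, 0.8, 0.35, 0.2]
healthy_scores = [0.1, 0.3, 0.4, 0.5, 0.6]

pauc_at_40_percent = partial_auc(diseased_scores, healthy_scores, 0.4)  # 4 / 20
full_auc = partial_auc(diseased_scores, healthy_scores, 1.0)            # 13 / 20
```

With t₀ = 1 the estimate reduces to the ordinary empiric AUC, and a score that separates the groups perfectly attains the maximum possible value, t₀.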

21. A diagnostic test panel for Lyme Disease comprising a plurality of tests, one or more of which are selected from the group consisting of a test for the V1sE1 IgG antibody, a test for the C6 IgG antibody, a test for the pepC10 IgM antibody and a test for the BmpA peptide.
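Claim 5 above derives a likelihood-ratio cutoff from the value that yields 99% specificity in a control population. The sketch below shows one way such a cutoff could be read off a list of control likelihood ratios. It is an assumption-laden illustration: the function names and the synthetic control values are hypothetical, and ties at the cutoff are counted as negative, so the observed specificity can exceed the target.

```python
import math


def cutoff_for_specificity(control_lrs, specificity=0.99):
    """Smallest observed control value c with at least `specificity` of controls <= c.

    Values strictly greater than c are then treated as positive.
    """
    ordered = sorted(control_lrs)
    n = len(ordered)
    # Round before ceil so floating-point noise (e.g. 98.9999999) does not shift k.
    k = math.ceil(round(specificity * n, 9))
    k = min(max(k, 1), n)
    return ordered[k - 1]


def observed_specificity(control_lrs, cutoff):
    """Fraction of controls at or below the cutoff (called negative)."""
    return sum(1 for lr in control_lrs if lr <= cutoff) / len(control_lrs)


# Synthetic control likelihood ratios 0.1, 0.2, ..., 10.0 (illustrative only).
control_lrs = [i / 10 for i in range(1, 101)]
cutoff = cutoff_for_specificity(control_lrs)
achieved = observed_specificity(control_lrs, cutoff)
```

For the 100 synthetic controls above, the cutoff is the 99th ordered value, so exactly one control exceeds it.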

Current Assignee: New York Medical College (Touro College & University System)
Original Assignee: Richard Porwancher
Inventors: Porwancher, Richard
Granted Patent: US 8,005,627 B2
US Class Current: 436/513
CPC Class Codes:
G16B 20/00   ICT specially adapted for f...
G16B 40/00   ICT specially adapted for b...
G16H 50/20   for computer-aided diagnosi...
G16H 50/50   for simulation or modelling...
Y02A 90/10   Information and communicati...
