Data analysis methods for locating entities of interest within large, multivariable datasets

US 8,131,473 B1
Filed: 05/20/2005
Issued: 03/06/2012
Est. Priority Date: 05/20/2004
Status: Active Grant

Abstract

The present invention provides data analysis methods for the rapid location of subsets of large, multivariable biological datasets that are of most interest for further analysis, for the investigation of molecular modes of action of biological phenomena of interest, and for the identification of sets of data points that best distinguish between experimental groups in larger datasets as putative biomarkers. While existing methods for analyzing large biological datasets generally provide too much information to the user, or not enough, the methods of the present invention entail taking user input on what kinds of trends are of interest and then finding results that match the designated trend. In such manner, the methods of the invention allow a user to quickly pinpoint the subset of data of most interest without a concomitant loss of a large percentage of relevant information, as is typical with standard methods. The methods of the invention allow for identification of molecular entities that are involved in a biological phenomenon of interest, entities that may have otherwise gone undiscovered in a large, multivariable dataset.

17 Claims

1. A method for locating a subset within an experimental set of biological data points that is of most interest for further analysis, the method comprising:
- a) obtaining a set of data points associated with a biological phenomenon of interest, wherein the set of data points comprises a baseline and one or more experimental groups;
  
  b) designating a trend of interest in the set of data points that is associated with the biological phenomenon of interest, wherein the trend indicates a relationship between the data points and an independent variable,
  
  c) developing a mathematical model of the trend, the mathematical model being a function of the independent variable and wherein the mathematical model models the trend with respect to the independent variable;
  
  d) testing each data point in the set for adherence to the mathematical model, wherein the data points adhering to the model are identified as being members of the subset of most interest for further analysis; and
  
  e) providing identification of the members of the subset in a user-readable format,
  
  wherein all of the steps b), c), d), and e) are performed on a suitably-programmed computer.
- Dependent claims (2, 3, 4, 5, 6, 12, 13, 14)
  - 2. The method of claim 1 wherein the designated trend of interest is not evident in the set of data points as a whole.
  - 3. The method of claim 2 wherein the designated trend of interest is one previously observed for the biological phenomenon of interest.
  - 4. The method of claim 2 wherein the designated trend of interest is one that is expected for the biological phenomenon of interest.
  - 5. The method of claim 1 wherein the set of data points is selected from the group consisting of biochemical profiling data, gene expression profiling data, protein expression profiling data, and tissue feature data.
  - 6. The method of claim 1 wherein the set of data points are biochemical profiling data points and the biological phenomenon of interest is liver toxicity.
  - 12. The method of claim 1 wherein designating a trend relative to an independent variable comprises designating a trend relative to at least one of time and an input variable.
  - 13. The method of claim 12 wherein the input variable comprises an amount of exposure to or dose of a chemical entity.
  - 14. The method of claim 1 wherein developing a mathematical model of the trend comprises modeling the trend as a linear or quadratic function.
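
As an illustration only, steps b) through e) of claim 1, with the linear model option of claim 14 and the dose variable of claim 13, might be sketched as follows. This is not the patentee's implementation. The sketch assumes that each "data point" is one measured entity's response profile across the baseline and experimental groups, that the designated trend is "increases linearly with dose", and that adherence means a positive least-squares slope with R² at or above a chosen cutoff. The names `linear_fit`, `adheres_to_trend`, `locate_subset`, the `min_r2` threshold, and the sample values are all hypothetical.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("independent variable must take at least two distinct values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return a, b, r2


def adheres_to_trend(xs, ys, min_r2=0.9):
    """Step d): assumed adherence test, a rising linear trend that fits well."""
    _, slope, r2 = linear_fit(xs, ys)
    return slope > 0 and r2 >= min_r2


def locate_subset(doses, profiles, min_r2=0.9):
    """Return the names of the entities whose profiles adhere to the trend."""
    return sorted(
        name for name, ys in profiles.items()
        if adheres_to_trend(doses, ys, min_r2)
    )


# Step a): made-up data; dose 0.0 plays the role of the baseline group.
doses = [0.0, 1.0, 2.0, 4.0]
profiles = {
    "analyte_A": [1.0, 2.1, 2.9, 5.2],  # rises with dose
    "analyte_B": [3.0, 2.8, 3.1, 2.9],  # essentially flat
    "analyte_C": [5.0, 4.0, 3.1, 1.0],  # falls with dose
}

# Step e): report the subset in a user-readable form.
subset = locate_subset(doses, profiles)
print("Entities adhering to the designated trend:", subset)
```

A quadratic model, the other option named in claim 14, would replace `linear_fit` with a second-order fit; the thresholding logic could stay the same.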

7. A method for investigating the molecular mode of action of a biological phenomenon of interest, the method comprising:
- a) obtaining a set of data points associated with a biological phenomenon of interest using biochemical profiling, gene expression profiling, or protein expression profiling, wherein the set of data points comprises a baseline and one or more experimental groups;
  
  b) designating a trend of interest in the set of data points that is associated with the biological phenomenon of interest, wherein the trend indicates a relationship between the data points and an independent variable;
  
  c) developing a mathematical model of the trend, the mathematical model being a function of the independent variable and wherein the mathematical model models the trend with respect to the independent variable;
  
  d) testing each data point in the set for adherence to the mathematical model;
  
  e) identifying, from a plurality of possible metabolic pathways, one or more metabolic pathways to which the data points that adhere to the model belong, wherein the mode of action of the phenomenon of interest affects the identified metabolic pathways; and
  
  f) providing identification of the identified metabolic pathways in a user-readable format,
  
  wherein all of the steps b), c), d), e), and f) are performed on a suitably-programmed computer.
- Dependent claims (8, 9, 10, 11, 15, 16, 17)
  - 8. The method of claim 7 wherein the designated trend of interest is not evident in the set of data points as a whole.
  - 9. The method of claim 8 wherein the designated trend of interest is one previously observed for the biological phenomenon of interest.
  - 10. The method of claim 8 wherein the designated trend of interest is one that is expected for the biological phenomenon of interest.
  - 11. The method of claim 7 wherein the set of data points is obtained using biochemical profiling and the biological phenomenon of interest is liver toxicity.
  - 15. The method of claim 7 wherein designating a trend relative to an independent variable comprises designating a trend relative to at least one of time and an input variable.
  - 16. The method of claim 15 wherein the input variable comprises an amount of exposure to or dose of a chemical entity.
  - 17. The method of claim 7 wherein developing a mathematical model of the trend comprises modeling the trend as a linear or quadratic function.
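
Steps e) and f) of claim 7 could be sketched, under stated assumptions, as a lookup from adhering entities to the pathways they belong to. The claim does not say how pathway membership is determined or how pathways are ranked, so the membership table, the hit-count ranking, and the names `identify_pathways`, `pathway_membership`, and `min_hits` below are hypothetical choices made only for illustration.

```python
from collections import Counter


def identify_pathways(adhering, pathway_membership, min_hits=1):
    """Count, per pathway, how many adhering entities belong to it.

    Returns (pathway, hits) pairs sorted by descending hits, then by name.
    Entities missing from the membership table contribute nothing.
    """
    counts = Counter(
        pathway
        for entity in adhering
        for pathway in pathway_membership.get(entity, ())
    )
    return sorted(
        ((pathway, hits) for pathway, hits in counts.items() if hits >= min_hits),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical membership table: entity -> pathways it participates in.
pathway_membership = {
    "analyte_A": ["pathway_1", "pathway_2"],
    "analyte_D": ["pathway_1"],
    "analyte_E": ["pathway_3"],
}

# Entities assumed to have passed the adherence test of step d).
adhering = ["analyte_A", "analyte_D"]

# Step f): report the identified pathways in a user-readable form.
ranked = identify_pathways(adhering, pathway_membership)
for pathway, hits in ranked:
    print(f"{pathway}: {hits} adhering entities")
```

Raising `min_hits` would restrict the report to pathways supported by several adhering entities, which is one plausible way to reduce chance matches.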

Current Assignee: Metabolon Incorporated
Original Assignee: Metabolon Incorporated
Inventors: Coffin, Marie; Allen, Keith D.; Higgins, Alan J.; Bullard, Brian R.
Primary Examiner(s): Lin, Jerry
Application Number: US11/133,953
Time in Patent Office: 2,482 Days
Field of Search: 702/19
US Class Current: 702/19
CPC Class Codes:
G16B 40/00 ICT specially adapted for b...
G16B 5/00 ICT specially adapted for m...
