Multiple-channel bias removal methods with little dependence on population size
Abstract
Methods, systems and computer-readable media for removing labeling-bias factors affecting data from two or more data sources after single-source biasing factors have been removed to the extent possible. Respective data points from the data sources are considered in combination to generate a population of data points. The population is subdivided into portions and, for each portion, the data points are sorted relative to the values of all other data points in that portion. A function is then generated for each portion from its sorted data points. For each portion, a value representative of the highest population density of data points within that portion is identified. The identified values are fitted to a predetermined curve, and the values of all data points are adjusted relative to the fitted values.
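As an illustration only, the steps summarized in the abstract could be sketched in Python as follows. The sketch assumes two channels of positive signal values, portions formed by binning on mean log2 intensity, a narrowest-window estimate of the densest value, and a straight line as the predetermined curve. None of these choices, and none of the function names, are taken from the patent text.

```python
import math

def densest_value(sorted_vals, window_frac=0.5):
    """Midpoint of the narrowest window holding window_frac of the sorted values.

    Assumption: this stands in for "a value representative of highest
    population density"; the source does not prescribe this estimator.
    """
    n = len(sorted_vals)
    k = min(n, max(2, math.ceil(n * window_frac)))
    start = min(range(n - k + 1),
                key=lambda s: sorted_vals[s + k - 1] - sorted_vals[s])
    return 0.5 * (sorted_vals[start] + sorted_vals[start + k - 1])

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def remove_labeling_bias(ch1, ch2, n_portions=4):
    """Return bias-adjusted log2 ratios, in the original order of the inputs.

    Requires at least n_portions data points so that no portion is empty.
    """
    # Combine the two sources: mean log intensity (a) and log ratio (m).
    a = [0.5 * (math.log2(x) + math.log2(y)) for x, y in zip(ch1, ch2)]
    m = [math.log2(x) - math.log2(y) for x, y in zip(ch1, ch2)]
    # Subdivide the combined population into equal-sized intensity portions.
    order = sorted(range(len(a)), key=a.__getitem__)
    n = len(order)
    centers, modes = [], []
    for p in range(n_portions):
        idx = order[p * n // n_portions:(p + 1) * n // n_portions]
        centers.append(sum(a[i] for i in idx) / len(idx))
        # Sort within the portion, then locate its densest value.
        modes.append(densest_value(sorted(m[i] for i in idx)))
    # Fit the per-portion densest values to the (assumed linear) curve.
    slope, intercept = fit_line(centers, modes)
    # Adjust every data point relative to the fitted curve.
    return [mi - (slope * ai + intercept) for ai, mi in zip(a, m)]
```

Because each portion contributes only its densest value, a small number of data points with genuinely different ratios has little influence on the fitted curve in this sketch, and the per-portion estimate does not depend on how many points the portion contains beyond the window size.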
31 Claims
1. A method of removing labeling-bias factors affecting data from two or more data sources, said method comprising the steps of:
subdividing combined respective data points from the two or more data sources into portions of the population of combined data points;
for each portion, sorting the data points that are members of that portion according to relative values of the data points that are members;
for each portion, generating a function from the sorted data points belonging to that portion;
for each function, identifying a value representative of highest population density of data points within the respective portion;
fitting the values representative of highest population densities within the portions, respectively to a predetermined curve; and
adjusting values of all data points relative to the fitted values.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 26
18. A method of dye-normalizing array data obtained from at least two channels, said method comprising the steps of:
subdividing a combined population of signal values from the at least two channels into portions of the overall combined population of signal values;
for each portion, sorting the signal values that are members of that portion according to relative values of the signal values that are members;
for each portion, generating a function from the sorted signal values belonging to that portion;
for each function, identifying a value representative of highest population density of signal values within the portion;
fitting the values representative of highest population densities within the portions, respectively to a predetermined curve; and
adjusting values of all signal values relative to the fitted values based upon the adjustments made to fit the fitted values.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
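For the two-channel case, one hypothetical way to carry a fitted correction back to the raw signal values is to split it evenly between the channels, so that the log ratio is corrected while each feature's geometric mean intensity is unchanged. The sketch below assumes a straight-line fit of log2 ratio against mean log2 intensity. The `slope` and `intercept` parameters and the even split are illustrative assumptions, not details stated in the claim.

```python
import math

def apply_dye_correction(red, green, slope, intercept):
    """Adjust both channels so the fitted log2-ratio bias is removed.

    Assumption: the bias at mean log2 intensity a is slope * a + intercept,
    and half of it is removed from each channel.
    """
    out_red, out_green = [], []
    for r, g in zip(red, green):
        a = 0.5 * (math.log2(r) + math.log2(g))  # mean log2 intensity
        bias = slope * a + intercept             # fitted log2-ratio bias
        half = 2.0 ** (bias / 2.0)               # half the bias, linear scale
        out_red.append(r / half)
        out_green.append(g * half)
    return out_red, out_green
```

Splitting the correction this way keeps the product of the two signals fixed for every feature, so any later analysis that depends on overall intensity sees the same values as before the adjustment.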
27. A system for removing labeling-bias factors affecting data from two or more data sources, said system comprising:
a feature for considering respective data points from said sources in combination;
a feature for subdividing the combined respective data points into portions of the overall population of combined data points;
for each portion, a feature for sorting the data points that are members of that portion according to relative values of the data points that are members;
for each portion, a feature for generating a function from the sorted data points belonging to that portion;
for each function, a feature for identifying a value representative of highest population density of data points within the respective portion;
a feature for fitting the values representative of highest population densities within the portions, respectively to a predetermined curve; and
a feature for adjusting values of all data points relative to the fitted values.
Dependent claims: 28, 29, 30
31. A computer readable medium carrying one or more sequences of instructions for removing labeling-bias factors affecting data from two or more data sources, wherein execution of one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
subdividing combined respective data points into portions of the overall population of combined data points;
for each portion, sorting the data points that are members of that portion according to relative values of the data points that are members;
for each portion, generating a function from the sorted data points belonging to that portion;
for each function, identifying a value representative of highest population density of data points within the respective portion;
fitting the values representative of highest population densities within the portions, respectively to a predetermined curve; and
adjusting values of all data points relative to the fitted values.
Specification