Methods for identifying suitable nucleic acid normalization probe sequences for use in nucleic acid arrays
First Claim
1. A method of identifying a sequence of a nucleic acid that is suitable for use as a substrate surface immobilized normalization probe, said method comprising:
(a) identifying a plurality of candidate probe sequences for a target nucleic acid based on at least one selection criterion;
(b) empirically evaluating each of said candidate probe sequences under a plurality of different experimental sets to obtain a collection of empirical data values for each of said candidate nucleic acid probe sequences for each of said plurality of different experimental sets;
(c) clustering said candidate probe sequences into one or more groups of candidate probe sequences based on each candidate probe sequence's collection of empirical data values, wherein each of said one or more groups exhibits substantially the same performance across said plurality of experimental sets;
(d) evaluating any remaining non-clustering probes for candidate probe sequences that satisfy a signal intensity threshold and exhibit substantially no variation in signal under said plurality of different experimental sets to identify any candidate probe sequences of said plurality that are suitable for use as a substrate surface immobilized normalization probe.
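Steps (c) and (d) can be illustrated with a minimal sketch. It assumes steps (a) and (b) have already produced one signal value per candidate probe per experimental set. The claim does not name a clustering method, a similarity measure, or numeric thresholds, so everything specific here is an assumption made for illustration: Pearson correlation with single-linkage grouping stands in for the clustering, the coefficient of variation stands in for "substantially no variation", and the function names, parameter values, and signal data are hypothetical.

```python
import math
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two profiles; 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def cluster_probes(profiles, min_corr=0.9):
    """Step (c), sketched: group probes whose profiles correlate at or above
    min_corr (assumed threshold), using single linkage via union-find."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_corr:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def normalization_candidates(profiles, min_corr=0.9,
                             min_signal=1000.0, max_cv=0.1):
    """Step (d), sketched: keep non-clustering probes whose mean signal meets
    min_signal and whose coefficient of variation stays at or below max_cv.
    Both thresholds are assumed values, not taken from the claim."""
    singletons = [g[0] for g in cluster_probes(profiles, min_corr)
                  if len(g) == 1]
    selected = []
    for name in singletons:
        values = profiles[name]
        avg = mean(values)
        cv = pstdev(values) / avg if avg > 0 else math.inf
        if avg >= min_signal and cv <= max_cv:
            selected.append(name)
    return selected


# Hypothetical signals: one value per experimental set for each candidate probe.
profiles = {
    "probe_A": [1200, 2400, 600, 4800],   # varies across sets
    "probe_B": [1000, 2100, 500, 4100],   # co-varies with probe_A
    "probe_C": [3000, 3050, 2950, 3010],  # strong and nearly flat
    "probe_D": [110, 100, 105, 95],       # flat but weak
    "probe_E": [5000, 1000, 1500, 4000],  # strong but variable
}

print(normalization_candidates(profiles))
```

With this toy data, probe_A and probe_B form a group and are set aside, probe_D fails the intensity threshold, probe_E fails the variation threshold, and only probe_C is returned. A real evaluation would likely work on log-transformed or ratio data and many more experimental sets; the raw values here just keep the arithmetic easy to follow.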
Abstract
Methods of identifying a sequence of a probe, e.g., a biopolymeric probe, such as a nucleic acid, that is suitable for use as a surface immobilized normalization probe on a nucleic acid array are provided. A feature of the subject methods is that a set of computationally determined initial candidate sequences are empirically evaluated to obtain functional data that is then employed to evaluate the candidate sequences for suitability as normalization probes. Sequences identified as suitable for use as normalization probes according to the subject methods are ones that do not cluster with other probes of the candidate set, exhibit high signal intensity and exhibit substantially no differential expression across a large number of samples. The subject invention also includes algorithms for performing the subject methods recorded on a computer readable medium, as well as computational analysis systems that include the same. Also provided are nucleic acid arrays produced with probes having sequences identified by the subject methods, as well as methods for using the same.
11 Citations
25 Claims
Specification