Computer-aided probability base calling for arrays of nucleic acid probes on chips
Abstract
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to calculate probabilities for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes on biological chips. Additionally, information from multiple experiments is utilized to improve the accuracy of calling unknown bases.
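The probability calculation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate base has one probe whose fluorescence intensity is normally distributed with a known mean and standard deviation, and it scores each base by the chance that its probe is brighter than every other probe. The function names, the pairwise-product approximation, and the normalization step are assumptions of this sketch, not details taken from the patent.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def base_probabilities(means, stds):
    """Return a normalized probability per candidate base.

    means, stds: dicts mapping a base ("A", "C", "G", "T") to the mean and
    standard deviation of its probe intensity (hypothetical inputs; all
    standard deviations must be positive).
    """
    scores = {}
    for base in means:
        score = 1.0
        for other in means:
            if other == base:
                continue
            # Assumed model: the difference of two independent normal
            # intensities is normal, so P(I_base > I_other) is a CDF value.
            spread = math.sqrt(stds[base] ** 2 + stds[other] ** 2)
            score *= normal_cdf((means[base] - means[other]) / spread)
        scores[base] = score
    total = sum(scores.values())
    # Normalize so the four scores can be read as probabilities.
    return {base: score / total for base, score in scores.items()}


probs = base_probabilities(
    {"A": 1000.0, "C": 200.0, "G": 180.0, "T": 220.0},
    {"A": 50.0, "C": 50.0, "G": 50.0, "T": 50.0},
)
print(max(probs, key=probs.get))
```

Treating the pairwise comparisons as independent is a simplification; a fuller model would integrate over the joint distribution of all probe intensities.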
77 Citations
22 Claims
1. A computer program product that calls an unknown base in a sample nucleic acid sequence, comprising:
computer code that defines a set of potential base calls, said base calls including at least one of A, C, G, T(U), deletion, insertion, and a plurality of ambiguous calls;
computer code that inputs a plurality of measurements for hybridization between a plurality of probes and said sample nucleic acid sequence or nucleic acid derived from said sample nucleic acid sequence;
computer code that determines a plurality of probabilities, each of said probabilities reflecting the likelihood that one of the potential base calls is correct, said probabilities being calculated according to a distribution model, said hybridization measurements and sequences of said plurality of probes;
computer code that calls the unknown base according to the probabilities of said potential base calls; and
a computer readable medium that stores the computer codes.
computer code that produces a sum of the highest probability and a next highest probability;
computer code that compares the sum to the probability threshold; and
computer code that calls the unknown base according to the nucleic acid probes associated with the highest and next highest probabilities if the sum exceeds the probability threshold.
11. The computer program product of claim 1, wherein the sum indicates a confidence that the unknown base is called correctly if the sum exceeds the probability threshold.
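The threshold steps recited above (sum the two highest probabilities, compare the sum to a threshold, and call an ambiguous base from the two associated probes) might be sketched as follows. The single-base check that runs first, the fallback to "N", and the threshold value of 0.9 are assumptions made for illustration; the two-base letters are the standard IUPAC ambiguity codes.

```python
# Standard IUPAC codes for two-base ambiguities.
AMBIGUITY = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_base(probs, threshold=0.9):
    """Call a base from per-base probabilities (hypothetical helper).

    probs: dict mapping "A", "C", "G", "T" to probabilities.
    threshold: assumed probability threshold; 0.9 is illustrative only.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    (best, p_best), (second, p_second) = ranked[0], ranked[1]
    # Assumption: a single base is called when it clears the threshold alone.
    if p_best > threshold:
        return best
    # Sum of the highest and next highest probabilities versus the threshold.
    if p_best + p_second > threshold:
        return AMBIGUITY[frozenset((best, second))]
    # Assumption: no confident call is reported as "N".
    return "N"


print(call_base({"A": 0.95, "C": 0.03, "G": 0.01, "T": 0.01}))  # A
print(call_base({"A": 0.50, "C": 0.03, "G": 0.45, "T": 0.02}))  # R
print(call_base({"A": 0.30, "C": 0.30, "G": 0.20, "T": 0.20}))  # N
```

In this sketch the sum of the two probabilities doubles as a confidence value for the ambiguous call, which matches the wording of the claim above.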
12. An apparatus that calls an unknown base in a sample nucleic acid sequence, comprising:
means for defining a set of potential base calls, said base calls including at least one of A, C, G, T(U), deletion, insertion, and a plurality of ambiguous calls;
means for inputting a plurality of measurements for hybridization between a plurality of probes and said sample nucleic acid sequence or nucleic acid derived from said sample nucleic acid sequence;
means for determining a plurality of probabilities, each of said probabilities reflecting the likelihood that one of the potential base calls is correct, said probabilities being calculated according to a distribution model, said hybridization measurements and sequences of said plurality of probes; and
means for calling the unknown base according to the probabilities of said potential base calls.
means for producing a sum of the highest probability and a next highest probability;
means for comparing the sum to the probability threshold; and
means for calling the unknown base according to the nucleic acid probes associated with the highest and next highest probabilities if the sum exceeds the probability threshold.
22. The apparatus of claim 12, wherein the sum indicates a confidence that the unknown base is called correctly if the sum exceeds the probability threshold.
Specification