Computer-aided probability base calling for arrays of nucleic acid probes on chips
Abstract
A computer system for analyzing nucleic acid sequences is provided. The computer system is used to calculate probabilities for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes on biological chips. Additionally, information from multiple experiments is utilized to improve the accuracy of calling unknown bases.
34 Claims
1. In a computer system, a method of calling an unknown base in a sample nucleic acid sequence, the method comprising the steps of:
defining a set of potential base calls, said base calls including at least one of A, C, G, T(U), deletion, insertion, and a plurality of ambiguous calls;
inputting a plurality of measurements for hybridization between a plurality of probes and said sample nucleic acid sequence or nucleic acid derived from said sample nucleic acid sequence;
determining a plurality of probabilities, each of said probabilities reflecting the likelihood that one of the potential base calls is correct, said probabilities being calculated according to a distribution model, said hybridization measurements and sequences of said plurality of probes; and
calling the unknown base according to the probabilities of said potential base calls.
Dependent claims: 2 to 16.
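The last step of claim 1 can be illustrated with a minimal sketch that turns per-base probabilities into a call drawn from a set that includes ambiguous calls. The IUPAC ambiguity codes are standard, but the function name `call_base`, the `threshold` parameter, and the rule of adding the next most probable base until the threshold is reached are assumptions made for illustration; the claim does not specify them. Deletion and insertion calls are left out for brevity.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("AC"): "M", frozenset("GT"): "K",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(probs, threshold=0.9):
    """Return the least ambiguous code whose summed probability reaches threshold.

    probs maps each of "A", "C", "G", "T" to a probability; the values are
    assumed to sum to 1. The threshold is a hypothetical tuning parameter.
    """
    ranked = sorted(probs, key=probs.get, reverse=True)
    chosen, total = [], 0.0
    for base in ranked:
        # Widen the call one base at a time, most probable first.
        chosen.append(base)
        total += probs[base]
        if total >= threshold:
            break
    return IUPAC[frozenset(chosen)], total


if __name__ == "__main__":
    print(call_base({"A": 0.95, "C": 0.03, "G": 0.01, "T": 0.01}))
    print(call_base({"A": 0.55, "C": 0.40, "G": 0.03, "T": 0.02}))
```

With this rule a clear winner yields a single-base call, while two bases with comparable probabilities yield a two-base ambiguity code such as M (A or C).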
17. A computer program that calls an unknown base in a sample nucleic acid sequence, comprising:
code that receives as input a plurality of hybridization probe intensities, each of the probe intensities corresponding to a nucleic acid probe;
code that determines for each of the plurality of probe intensities a probability that the corresponding nucleic acid probe best hybridizes with the sample nucleic acid sequence; and
code that calls the unknown base according to the nucleic acid probe with the highest associated probability;
wherein the codes are stored on a tangible medium.
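One way to picture the per-probe probability in claim 17 is a Monte Carlo estimate of how often each probe would give the strongest signal. The sketch below is not taken from the claim: the normal intensity model, the independence of probes, the function names, and the convention of labelling each probe by the base it would call are all assumptions.

```python
import random


def best_probe_probabilities(stats, draws=20000, seed=0):
    """Estimate, for each probe, the probability that it gives the strongest signal.

    stats maps a probe label to (mean, standard deviation) of its measured
    intensity. Intensities are assumed independent and normally distributed.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = dict.fromkeys(stats, 0)
    for _ in range(draws):
        sample = {p: rng.gauss(mu, sd) for p, (mu, sd) in stats.items()}
        wins[max(sample, key=sample.get)] += 1
    return {p: n / draws for p, n in wins.items()}


def call_from_probes(stats):
    """Call the base of the probe with the highest estimated probability."""
    probs = best_probe_probabilities(stats)
    best = max(probs, key=probs.get)
    return best, probs[best]


if __name__ == "__main__":
    # Hypothetical intensities: the probe calling "A" is clearly brightest.
    stats = {"A": (1000, 50), "C": (200, 50), "G": (180, 50), "T": (220, 50)}
    print(call_from_probes(stats))
```

Sampling is used here only because it is easy to follow; a closed-form calculation under the same model would serve equally well.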
18. In a computer system, a method of calling an unknown base in a sample nucleic acid sequence, the method comprising the steps of:
inputting a plurality of base calls for the unknown base, each of the base calls having an associated probability which represents a confidence that the unknown base is called correctly;
selecting a base call that has a highest associated probability; and
calling the unknown base according to the selected base call.
Dependent claims: 19, 20.
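The selection step of claim 18 amounts to taking a maximum over base calls. A minimal sketch follows, assuming each call arrives as a hypothetical `(base, probability)` tuple, one per experiment; the tuple layout and the function name are illustrative only.

```python
def select_call(calls):
    """Pick the (base, probability) pair with the highest probability.

    calls is a list of (base, probability) tuples, one per experiment.
    Ties go to the earliest entry, which is an arbitrary choice.
    """
    if not calls:
        raise ValueError("at least one base call is required")
    return max(calls, key=lambda call: call[1])


if __name__ == "__main__":
    print(select_call([("A", 0.80), ("C", 0.95), ("A", 0.70)]))
```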
21. A computer program that calls an unknown base in a sample nucleic acid sequence, comprising:
code that receives as input a plurality of base calls for the unknown base, each of the base calls having an associated probability which represents a confidence that the unknown base is called correctly;
code that selects a base call that has a highest associated probability; and
code that calls the unknown base according to the selected base call;
wherein the codes are stored on a tangible medium.
22. In a computer system, a method of calling an unknown base in a sample nucleic acid sequence, the method comprising the steps of:
inputting a plurality of probabilities for each possible base for the unknown base, each of the probabilities representing a probability that the unknown base is an associated base;
producing a product of probabilities for each possible base, each product being associated with a possible base; and
calling the unknown base according to a base associated with a highest product.
Dependent claims: 23, 24.
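The product rule of claim 22 can be sketched in a few lines. The input layout (one dict of per-base probabilities per experiment) and the name `call_by_product` are assumptions made for this example.

```python
import math


def call_by_product(experiments):
    """Multiply per-base probabilities across experiments and pick the largest.

    experiments is a list of dicts, each mapping "A", "C", "G", "T" to the
    probability assigned to that base by one experiment.
    """
    bases = experiments[0].keys()
    products = {b: math.prod(e[b] for e in experiments) for b in bases}
    best = max(products, key=products.get)
    return best, products


if __name__ == "__main__":
    experiments = [
        {"A": 0.6, "C": 0.3, "G": 0.05, "T": 0.05},
        {"A": 0.4, "C": 0.5, "G": 0.05, "T": 0.05},
        {"A": 0.7, "C": 0.2, "G": 0.05, "T": 0.05},
    ]
    best, products = call_by_product(experiments)
    print(best, products)
```

In this hypothetical data the second experiment favors C on its own, yet the product over all three experiments favors A. The products are not renormalized, and with many experiments summing logarithms would be a numerically safer equivalent.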
25. A computer program that calls an unknown base in a sample nucleic acid sequence, comprising:
code that receives as input a plurality of probabilities for each possible base for the unknown base, each of the probabilities representing a probability that the unknown base is an associated base;
code that produces a product of probabilities for each possible base, each product being associated with a possible base; and
code that calls the unknown base according to a base associated with a highest product;
wherein the codes are stored on a tangible medium.
26. In a computer system, a method of calling an unknown base in a sample nucleic acid sequence, the method comprising the steps of:
inputting a first base call for the unknown base, the first base call determined from a first nucleic acid probe that is equivalent to a portion of the sample nucleic acid sequence including the unknown base;
inputting a second base call for the unknown base, the second base call determined from a second nucleic acid probe that is complementary to a portion of the sample nucleic acid sequence including the unknown base;
selecting one of the first or second nucleic acid probes that has a base at an interrogation position which has a high probability of producing correct base calls; and
calling the unknown base according to the selected one of the first or second nucleic acid probes.
Dependent claims: 27, 28.
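The probe selection in claim 26 can be sketched as a lookup against per-base reliabilities. Everything concrete below is hypothetical: the reliability values are invented for illustration, and the sketch assumes both calls have already been expressed on the same strand.

```python
# Hypothetical per-base reliabilities: the fraction of past calls that were
# correct when a probe carried this base at its interrogation position.
RELIABILITY = {"A": 0.97, "C": 0.99, "G": 0.98, "T": 0.94}


def select_strand_call(first, second, reliability=RELIABILITY):
    """Choose between the calls made from two probes of opposite strands.

    Each argument is a (call, interrogation_base) tuple: the call made for
    the unknown base and the base the probe carries at its interrogation
    position. Both calls are assumed to be expressed on the same strand.
    """
    chosen = max((first, second), key=lambda item: reliability[item[1]])
    return chosen[0]


if __name__ == "__main__":
    # The first probe carries T, the second carries C; C is more reliable
    # in the hypothetical table, so the second probe's call is used.
    print(select_strand_call(("A", "T"), ("G", "C")))
```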
29. A computer program that calls an unknown base in a sample nucleic acid sequence, comprising:
code that receives as input first and second base calls for the unknown base, the first base call determined from a first nucleic acid probe that is equivalent to a portion of the sample nucleic acid sequence including the unknown base and the second base call determined from a second nucleic acid probe that is complementary to a portion of the sample nucleic acid sequence including the unknown base;
code that selects one of the first or second nucleic acid probes that has a base at an interrogation position which has a high probability of producing correct base calls; and
code that calls the unknown base according to the selected one of the first or second nucleic acid probes;
wherein the codes are stored on a tangible medium.
30. In a computer system, a method for determining whether a target nucleic acid hybridizes better to one of two probes comprising the steps of:
obtaining a plurality of first signals from a first hybridization between said target nucleic acid and a first probe;
obtaining a plurality of second signals from a second hybridization between said target nucleic acid and a second probe; and
calculating a probability that said first signals are stronger than said second signals according to a signal distribution model.
Dependent claims: 31 to 34.
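Claim 30 leaves the signal distribution model open. As one possible instance, the sketch below assumes independent, normally distributed signals, in which case the difference of two signals is itself normal and the probability follows from the standard normal CDF. The function name and the use of sample mean and sample standard deviation as model parameters are assumptions.

```python
import math
from statistics import NormalDist, mean, stdev


def prob_first_stronger(first, second):
    """Probability that a signal from the first probe exceeds one from the second.

    first and second are lists of repeated intensity measurements (at least
    two each). Signals are assumed independent and normally distributed, so
    their difference is normal with mean m1 - m2 and variance s1**2 + s2**2.
    """
    m1, m2 = mean(first), mean(second)
    s1, s2 = stdev(first), stdev(second)
    spread = math.hypot(s1, s2)  # sqrt(s1**2 + s2**2)
    if spread == 0:
        # Degenerate case: no measured variation in either set of signals.
        return 1.0 if m1 > m2 else 0.0 if m1 < m2 else 0.5
    return NormalDist().cdf((m1 - m2) / spread)


if __name__ == "__main__":
    # Hypothetical repeated measurements for two probes.
    first = [1000, 1040, 960, 1010]
    second = [400, 420, 380, 410]
    print(prob_first_stronger(first, second))
```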
Specification