Method and system for determining the accuracy of DNA base identifications
Abstract
A method for determining the quality of predicted nucleotide base identifications by receiving training data sets of predicted base identifications; defining subsets within the training data sets; comparing the predicted base identifications with actual base identifications within each subset; determining one or more sampling characteristics for each subset; and determining quality characterizations based on the comparison and the determined sampling characteristics.
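The first steps named in the abstract (defining subsets, then comparing predicted with actual identifications inside each subset) can be sketched in Python. This is only an illustration under stated assumptions, not the patented implementation: the `Call` record, the idea of subsetting by a binned numeric `feature`, and the `bin_width` value are all hypothetical and do not come from the patent text.

```python
from collections import defaultdict
from typing import NamedTuple


class Call(NamedTuple):
    predicted: str   # base reported by the base caller
    actual: str      # known reference base for the training data
    feature: float   # hypothetical per-call feature used to define subsets


def subset_error_counts(calls, bin_width=0.25):
    """Return {subset_id: (n_errors, n_calls)} for feature-binned subsets."""
    counts = defaultdict(lambda: [0, 0])
    for call in calls:
        subset_id = int(call.feature // bin_width)              # define the subset
        counts[subset_id][0] += call.predicted != call.actual   # compare identities
        counts[subset_id][1] += 1                               # track sample size
    return {key: tuple(value) for key, value in counts.items()}


training = [
    Call("A", "A", 0.10), Call("C", "C", 0.20), Call("G", "T", 0.05),
    Call("T", "T", 0.60), Call("A", "A", 0.70),
]
counts = subset_error_counts(training)
```

Keeping the call count next to the error count matters because the later steps need the sample size of each subset, not just its observed error rate.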
34 Claims
1. A method in a computer for determining the quality of predicted DNA base identifications, the method comprising:
receiving a training data set, the training data set comprising a plurality of predicted DNA base identifications;
defining a group of subsets;
comparing the predicted DNA base identifications with actual DNA base identifications for training data within each subset of the group;
determining a sampling characteristic for each subset of the group based on training data within the respective subset; and
determining a quality characterization for predicted DNA base identifications within at least one subset of the group based on the comparison and determined sampling characteristics;
wherein the sampling characteristic comprises a confidence value comprising a binomial proportion confidence interval value.
Dependent claims: 2-17.
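The claim requires the sampling characteristic to include a binomial proportion confidence interval value but does not say which interval is used. As one possible reading, the sketch below uses the Wilson score interval on a subset's error count and turns the interval's upper bound into a Phred-style score. Both the choice of the Wilson interval and the Phred-style conversion are assumptions made for illustration; the function names are hypothetical.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval (lower, upper) for the proportion errors / n."""
    if n == 0:
        return 0.0, 1.0  # no training data: the error rate is unconstrained
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def conservative_quality(errors, n, z=1.96):
    """Phred-style score from the interval's upper bound (assumed mapping)."""
    _, upper = wilson_interval(errors, n, z)
    return -10 * math.log10(upper)


# Same observed error rate (1%), different amounts of training data.
small = conservative_quality(1, 100)
large = conservative_quality(100, 10_000)
```

With this mapping, a sparsely sampled subset receives a lower score than a well sampled subset with the same observed error rate, which is one way a sampling characteristic can temper a quality characterization.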
18. A system performed on a processor for determining the quality of DNA base identifications, the system comprising:
a processor;
a predicted identity input component configured to receive a plurality of predicted DNA base identifications associated with a training data set;
a subset generator configured to define a group of subsets;
an identity comparison component configured to compare the predicted DNA base identifications with actual DNA base identifications for training data within each subset of the group;
a sampling determination component configured to determine a sampling characteristic for each subset of the group based on training data within the respective subset; and
a quality characterization determination component configured to determine a quality characterization for predicted DNA base identifications within at least one subset of the group based on the comparison and determined sampling characteristic;
wherein the sampling characteristic comprises a confidence value comprising a binomial proportion confidence interval value.
Dependent claims: 19-34.
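The component structure of the system claim can be pictured as a small object whose parts are pluggable functions. The sketch below is a hypothetical arrangement, not the claimed system: `QualitySystem`, the `Record` layout, and the lambdas are invented for illustration, and the normal-approximation (Wald) half-width is used only as the simplest stand-in for a binomial proportion confidence interval value.

```python
import math
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Tuple

# (predicted base, actual base, hypothetical feature value)
Record = Tuple[str, str, float]


@dataclass
class QualitySystem:
    subset_of: Callable[[Record], Hashable]       # subset generator
    is_error: Callable[[Record], bool]            # identity comparison component
    sampling: Callable[[int, int], float]         # sampling determination component
    quality: Callable[[int, int, float], float]   # quality characterization component

    def run(self, records: Iterable[Record]):
        tallies = {}
        for rec in records:                       # predicted identity input
            key = self.subset_of(rec)
            errors, n = tallies.get(key, (0, 0))
            tallies[key] = (errors + self.is_error(rec), n + 1)
        return {
            key: self.quality(e, n, self.sampling(e, n))
            for key, (e, n) in tallies.items()
        }


def wald_half_width(errors, n, z=1.96):
    """Half-width of the Wald interval; collapses to 0 when errors == 0."""
    p = errors / n
    return z * math.sqrt(p * (1 - p) / n)


system = QualitySystem(
    subset_of=lambda rec: rec[2] >= 0.5,                # two subsets by feature
    is_error=lambda rec: rec[0] != rec[1],
    sampling=wald_half_width,
    quality=lambda e, n, half: min(1.0, e / n + half),  # upper error bound
)
result = system.run([
    ("A", "A", 0.9), ("C", "G", 0.2), ("T", "T", 0.3),
    ("G", "G", 0.1), ("A", "C", 0.4),
])
```

Passing the components in as functions keeps each claimed element separately replaceable; for example, the Wald half-width could be swapped for a different interval without touching the tallying loop.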
Specification