Method and apparatus for approximate matching of DNA sequences
First Claim
1. A computer-implemented method for determining whether a DNA query sequence is an approximate match to a DNA sequence within a library of DNA sequences, the method comprising:
streaming the DNA sequences of the library through programmable logic that has been loaded with a key, wherein the key corresponds to a DNA query sequence; and
comparing the streaming DNA sequences with the key using the programmable logic to thereby identify any approximate matches that exist between the key and the streaming DNA sequences, wherein the comparing step comprises:
continuously computing a correlation coefficient between the key and a sliding window of the streaming DNA sequences using the programmable logic, and
judging each computed correlation coefficient against a threshold value to thereby identify an approximate match between the key and the streaming DNA sequences.
Abstract
A method and device are disclosed for an associative and approximate, analog or digital scanning of databases that allows for the asynchronous accessing of data from a mass storage medium. The invention includes providing dedicated analog and digital circuitry and decision logic at the mass storage medium level for determining a key identifying the data of interest, continuously comparing the key to a signal generated from a reading of the data from the mass storage medium with an approximate or exact matching circuit to determine a pattern match, determining a correlation value between the key and the data as it is read in a continuous fashion, and determining a match based upon a preselected threshold value for the correlation value. The pattern matching technique eliminates any need to compare data based on its intrinsic structure or value, and instead is based on an analog or digital pattern. The key and data may be either analog or digital. This device and method may be provided as part of a stand-alone computer system, embodied in a network attached storage device, or can otherwise be provided as part of a computer LAN or WAN.
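The scan described in the abstract can be illustrated in software. The following is a minimal sketch, not the disclosed circuitry: it assumes the data arrives one residue at a time, and it assumes the correlation value is simply the fraction of window positions that agree with the key, which this section does not specify. The function name `scan_stream` and its parameters are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def scan_stream(
    key: str, stream: Iterable[str], threshold: float
) -> Iterator[Tuple[int, float]]:
    """Yield (start_offset, score) for each window whose score meets the threshold.

    The score is an assumed stand-in for the correlation value: the fraction
    of positions at which the window and the key hold the same residue.
    """
    if not key:
        raise ValueError("key must be non-empty")
    window = deque(maxlen=len(key))  # sliding window over the incoming data
    for position, residue in enumerate(stream):
        window.append(residue)  # the oldest residue drops out automatically
        if len(window) < len(key):
            continue  # window not yet full
        matches = sum(1 for a, b in zip(window, key) if a == b)
        score = matches / len(key)
        if score >= threshold:
            yield position - len(key) + 1, score


# Usage: a threshold of 1.0 asks for exact matches, lower values admit
# approximate ones.
hits = list(scan_stream("ACGT", "TTACGTAAACCT", threshold=0.75))
```

Because the window is updated as each residue arrives, the data never has to be held in memory as a whole, which mirrors the continuous comparison the abstract describes.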
24 Claims
1. A computer-implemented method for determining whether a DNA query sequence is an approximate match to a DNA sequence within a library of DNA sequences, the method comprising:
streaming the DNA sequences of the library through programmable logic that has been loaded with a key, wherein the key corresponds to a DNA query sequence; and
comparing the streaming DNA sequences with the key using the programmable logic to thereby identify any approximate matches that exist between the key and the streaming DNA sequences, wherein the comparing step comprises:
continuously computing a correlation coefficient between the key and a sliding window of the streaming DNA sequences using the programmable logic, and
judging each computed correlation coefficient against a threshold value to thereby identify an approximate match between the key and the streaming DNA sequences.
Dependent claims: 2, 3, 4, 5
6. A computer-implemented method for determining whether a DNA query sequence is an approximate match to a DNA sequence within a library of DNA sequences, the method comprising:
streaming the DNA sequences of the library through programmable logic that has been loaded with a key, wherein the key corresponds to a DNA query sequence;
comparing the streaming DNA sequences with the key using the programmable logic to thereby identify any approximate matches that exist between the key and the streaming DNA sequences based on an adjustable threshold; and
adjusting the threshold to control a degree of approximate matches which is identified as a result of the comparing step such that a forgivable number of residue mismatches may exist between the key and a window of the streaming DNA sequences while still qualifying as an approximate match.
Dependent claims: 7, 8, 9, 10, 11, 12, 13
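The effect of the adjustable threshold in claim 6 can be illustrated with a short software sketch. It is an illustration only: the claim recites programmable logic, whereas this code runs on a general-purpose processor, and it assumes the threshold is expressed directly as a maximum number of forgiven residue mismatches. The class `ApproximateMatcher` and its methods are hypothetical names.

```python
from collections import deque
from typing import Iterable, Iterator


class ApproximateMatcher:
    """Hypothetical software stand-in for the matching logic, not the claimed hardware."""

    def __init__(self, key: str, max_mismatches: int = 0) -> None:
        if not key:
            raise ValueError("key must be non-empty")
        self.key = key
        self.set_max_mismatches(max_mismatches)

    def set_max_mismatches(self, max_mismatches: int) -> None:
        """Adjust the threshold: how many residue mismatches are forgiven."""
        if not 0 <= max_mismatches <= len(self.key):
            raise ValueError("max_mismatches must be between 0 and len(key)")
        self.max_mismatches = max_mismatches

    def scan(self, stream: Iterable[str]) -> Iterator[int]:
        """Yield the start offset of every window within the mismatch budget."""
        window = deque(maxlen=len(self.key))
        for position, residue in enumerate(stream):
            window.append(residue)
            if len(window) < len(self.key):
                continue  # window not yet full
            mismatches = sum(1 for a, b in zip(window, self.key) if a != b)
            if mismatches <= self.max_mismatches:
                yield position - len(self.key) + 1


# Usage: the same key and library give more hits once the threshold is relaxed.
library = "CCGATTACACCGATCACATT"
matcher = ApproximateMatcher("GATTACA")   # no mismatches forgiven
exact = list(matcher.scan(library))
matcher.set_max_mismatches(1)             # forgive one residue mismatch
relaxed = list(matcher.scan(library))
```

With no mismatches forgiven, only the window at offset 2 qualifies. Forgiving one mismatch also admits the window at offset 11, which differs from the key in a single residue.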
14. An apparatus for determining whether a DNA query sequence is an approximate match to a DNA sequence that is stored by a mass storage medium as part of a library of DNA sequences, the apparatus comprising:
an approximate matching unit in communication with a mass storage medium, the approximate matching unit comprising programmable logic, the programmable logic being configured to (1) store a key, the key corresponding to a DNA query sequence, (2) receive a stream of DNA sequences from the mass storage medium, (3) continuously compute a correlation coefficient between the key and a sliding window of the received DNA sequences, and (4) judge the computed correlation coefficients against a threshold value to thereby identify whether any approximate matches exist between the key and the received DNA sequences.
Dependent claims: 15, 16
17. An apparatus for determining whether a DNA query sequence is an approximate match to a DNA sequence that is stored by a mass storage medium as part of a library of DNA sequences, the apparatus comprising:
an approximate matching unit in communication with a mass storage medium, the approximate matching unit comprising programmable logic, the programmable logic having a key loaded thereon, wherein the key corresponds to a DNA query sequence, the approximate matching unit being configured to (1) stream the DNA sequences of the library through the programmable logic, (2) compare the streaming DNA sequences with the key using the programmable logic to thereby identify any approximate matches that exist between the key and the streaming DNA sequences based on an adjustable threshold, and (3) adjust the threshold to control a degree of approximate matches which is identified as a result of the comparison operation such that a forgivable number of residue mismatches may exist between the key and a window of the streaming DNA sequences while still qualifying as an approximate match.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification