System, method, device, and computer program product for extraction, gathering, manipulation, and analysis of peak data from an automated sequencer
Abstract
A system, method, device, and computer program product to extract and gather peak information from an automated sequencer of bioinformatics into a peak database, and to manipulate and analyze the peak information within the database.
24 Claims
1. A method for high throughput analysis of data sets generally described by sets of peaks, each set of peaks having been extracted from an electrophoregram profile j of a biological sample k which has been amplified for a particular sequence of nucleotide, and in each set of peaks, the ith peak is characterized by a nucleotide length L_{i,j,k} and an area A_{i,j,k}, the method comprising using bioinformatics tools comprising a computer to extract and smooth peak data sets according to parameter files and store them in data files, wherein smoothing comprises steps of:
- for each peak of a set of peaks, calculating a Euclidean division using the integer 3 of L_{i,j,k} − λ_j with the remainder being assigned to an element of {−1, 0, 1}, wherein λ_j is a theoretical length of the amplified sequence of nucleotide, and
- if the mean of remainders is superior to a first predefined threshold, shifting all peaks of the set of peaks by −1 nucleotide length, and if the mean of remainders is inferior to a second predefined threshold, shifting all peaks of the set of peaks by +1 nucleotide length.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
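The smoothing steps of claim 1 can be sketched in Python as follows. This is only an illustrative reading, not the patented implementation. It assumes integer nucleotide lengths, takes the divided quantity to be the offset between each peak length and the theoretical length λ_j, and uses hypothetical function names and threshold values (0.5 and −0.5), none of which come from the claim.

```python
from statistics import mean


def centered_remainder(length, theoretical_length):
    """Remainder of (length - theoretical_length) divided by 3, mapped to {-1, 0, 1}."""
    r = (length - theoretical_length) % 3  # Python's % yields 0, 1 or 2 here
    return -1 if r == 2 else r             # a remainder of 2 is equivalent to -1 modulo 3


def smooth_peak_lengths(lengths, theoretical_length, upper=0.5, lower=-0.5):
    """Shift every peak of one set by -1, 0 or +1 nucleotide.

    `upper` and `lower` stand in for the first and second predefined
    thresholds; their default values are assumptions for illustration.
    """
    remainders = [centered_remainder(L, theoretical_length) for L in lengths]
    m = mean(remainders)
    if m > upper:
        shift = -1   # peaks sit one nucleotide too long on average
    elif m < lower:
        shift = 1    # peaks sit one nucleotide too short on average
    else:
        shift = 0    # peaks already aligned on the 3-nucleotide grid
    return [L + shift for L in lengths]
```

For example, with a theoretical length of 100, the set of lengths 101, 104, 107, 110 has every remainder equal to 1, so under these assumed thresholds all four peaks would be shifted by −1 nucleotide.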
17. A method for high throughput analysis of data sets generally described by sets of peaks, each set of peaks having been extracted from an electrophoregram profile j of a DNA sample k which has been amplified for a particular sequence of nucleotide, and in each set of peaks, the ith peak is characterized by a nucleotide length L_{i,j,k} and an area A_{i,j,k}, wherein said method comprises using bioinformatics tools comprising a computer to extract and smooth peak data sets according to parameter files and store them in data files, wherein extracting comprises the steps of:
- for a plurality of data files (PICTfiles), each data file storing one set of peaks, generating an associated parameter file (CGEL parameter file) storing, for each data file, an order parameter (mNewOrder),
- reading successively the data files following the order parameters (mNewOrder) stored in the parameter file (CGEL parameter file),
- for each data file being read, extracting the nucleotide length L_{i,j,k} and area A_{i,j,k} of the peaks of the set of peaks stored in the data file, and
- generating a raw data file (data0) gathering all sets of peaks ordered according to the order parameters (mNewOrder).
Dependent claims: 18, 19, 20, 21
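The extraction steps of claim 17 can be illustrated with a short Python sketch. The actual PICT and CGEL file formats are not described in the claim, so everything concrete here is an assumption: each data file is taken to hold one peak per line as tab-separated length and area, the parameter file is modeled as a plain dictionary mapping each data file path to its order parameter, and the function names are hypothetical.

```python
import csv
from pathlib import Path


def read_peaks(path):
    """Read (length, area) pairs from one data file.

    Assumed layout: one peak per line, 'length<TAB>area'.
    Real PICT files may be structured differently.
    """
    with open(path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        return [(int(row[0]), float(row[1])) for row in rows if row]


def gather_peaks(order_by_file, out_path):
    """Read data files in ascending order parameter and write one raw file.

    `order_by_file` plays the role of the parameter file: it maps each
    data file path to its order parameter (mNewOrder in the claim).
    """
    ordered = sorted(order_by_file, key=order_by_file.get)
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for name in ordered:
            for length, area in read_peaks(name):
                # One output row per peak: source file, length, area.
                writer.writerow([Path(name).name, length, area])
    return ordered
```

Keeping the order parameters separate from the data files means the gathered output can be reordered by editing only the mapping, without touching the peak files themselves.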
22. A method for high throughput analysis of data sets characterized by sets of peaks extracted from an electrophoregram profile comprising:
- extracting and smoothing peak data sets using a computer according to parameter files and storing them in data files, wherein smoothing comprises: for each peak of a set of peaks, calculating a Euclidean division using the integer 3 of L_{i,j,k} − λ_j with the remainder being assigned to an element of {−1, 0, 1}, wherein λ_j is a theoretical length of the amplified sequence of nucleotide, and if the mean of remainders is superior to a first predefined threshold, shifting all peaks of the set of peaks by −1 nucleotide length, and if the mean of remainders is inferior to a second predefined threshold, shifting all peaks of the set of peaks by +1 nucleotide length;
- wherein said data sets are characterized by sets of peaks extracted from an electrophoregram profile j of a biological sample k which has been amplified for a particular sequence of nucleotide, and in each set of peaks, the ith peak is characterized by a nucleotide length L_{i,j,k} and an area A_{i,j,k}.
Dependent claims: 23, 24
Specification