Program for microarray design and analysis
Abstract
The invention relates to computer-based systems and methods for the design, comparison and analysis of genetic and proteomic databases. In a particular embodiment, the recited systems and methods have been implemented in a computer tool called ARROGANT. In its analysis mode, ARROGANT is a comprehensive tool for annotating large gene and protein collections. ARROGANT takes in a large collection of sequence identifiers and associates each identifier with information collected from many sources, such as sequence annotations, pathways, homology, polymorphisms and artifacts. Annotating a large assembly of genes simultaneously makes the collection of genomic/EST sequences truly informative.
24 Claims
1. A computer-based system for creating a targeted collection of sequences from a dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences and linked to corresponding annotations, the system comprising:
a) a search function which searches the annotations of the dataset according to a user-defined criterion and outputs a first subset of the dataset restricted by the criterion;
b) a redundancy reducing function which compares the first subset with a first database correlating the sequence identifiers of the first subset with syngeneic biopolymers and outputs a second subset of the dataset having reduced unique, natural complex biopolymer redundancy relative to the first subset;
c) a selection function which applies to the second subset a user-defined selection parameter and outputs a third subset restricted relative to the second subset by the parameter; and
d) a tabulation function which creates and outputs the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the third subset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 21, 22, 23, 24
13. A computer-based method for creating a targeted collection of sequences from a dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences and linked to corresponding annotations, the method comprising computer-implemented steps of:
a) searching with a computer the annotations of the dataset according to a user-defined criterion and outputting a first subset of the dataset restricted by the criterion;
b) comparing with the computer the first subset with a database correlating the sequence identifiers of the first subset with syngeneic biopolymers and outputting a second subset of the dataset having reduced unique, natural complex biopolymer redundancy relative to the first subset;
c) applying to the second subset a user-defined selection parameter and outputting a third subset restricted relative to the second subset by the parameter; and
d) creating and outputting the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the third subset.
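The four steps recited above (search, redundancy reduction, selection, tabulation) can be pictured as a small filtering pipeline. The following is only a minimal sketch under stated assumptions, not the ARROGANT implementation: the sequence identifiers, the annotation fields (`description`, `length`), the cluster table standing in for the database of syngeneic biopolymers, and every function name are hypothetical. Keeping the longest sequence per cluster is likewise an illustrative choice that the claims do not prescribe.

```python
# Hypothetical toy dataset: sequence identifier -> linked annotation record.
DATASET = {
    "AA001": {"description": "protein kinase C alpha", "length": 850},
    "AA002": {"description": "protein kinase C alpha, 3' EST", "length": 420},
    "AA003": {"description": "tyrosine kinase receptor", "length": 1200},
    "AA004": {"description": "ribosomal protein L5", "length": 600},
}

# Hypothetical database correlating identifiers with the same gene (cluster).
CLUSTERS = {"AA001": "G1", "AA002": "G1", "AA003": "G2", "AA004": "G3"}


def search(dataset, keyword):
    """Step a): keep records whose annotation text contains the keyword."""
    keyword = keyword.lower()
    return {
        sid: rec
        for sid, rec in dataset.items()
        if keyword in rec["description"].lower()
    }


def reduce_redundancy(subset, clusters):
    """Step b): keep one identifier per cluster (assumed rule: the longest)."""
    best = {}
    for sid, rec in subset.items():
        cluster = clusters.get(sid, sid)  # unclustered identifiers stand alone
        if cluster not in best or rec["length"] > subset[best[cluster]]["length"]:
            best[cluster] = sid
    return {sid: subset[sid] for sid in best.values()}


def select(subset, predicate):
    """Step c): apply a user-defined selection parameter."""
    return {sid: rec for sid, rec in subset.items() if predicate(rec)}


def tabulate(subset, sort_key="id"):
    """Step d): build a table (list of rows) sortable by a chosen column."""
    rows = [{"id": sid, **rec} for sid, rec in subset.items()]
    return sorted(rows, key=lambda row: row[sort_key])


first = search(DATASET, "kinase")
second = reduce_redundancy(first, CLUSTERS)
third = select(second, lambda rec: rec["length"] >= 800)
table = tabulate(third)
```

In this toy run, `AA001` and `AA002` belong to the same hypothetical cluster, so only one of them survives step b) before the length filter and the final sort are applied.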
14. A computer-based system for creating a targeted collection of sequences from a plurality of datasets comprising sequence identifiers corresponding to natural complex biopolymer sequences, the system comprising:
a) a merge and redundancy reducing function which compares the datasets with a database correlating the sequence identifiers with syngeneic biopolymers and creates a subset of the sum of the datasets having reduced unique, natural complex biopolymer redundancy relative to the sum; and
b) a tabulation function which creates and outputs the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the subset.
Dependent claims: 15, 16
17. A computer-based method for creating a targeted collection of sequences from a plurality of datasets comprising sequence identifiers corresponding to natural complex biopolymer sequences, the method comprising computer-implemented steps of:
a) comparing the datasets with a database correlating the sequence identifiers with syngeneic biopolymers and creating a subset of the sum of the datasets having reduced unique, natural complex biopolymer redundancy relative to the sum; and
b) creating and outputting the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the subset.
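The merge-and-reduce step for several datasets can be illustrated in the same spirit. This sketch assumes each dataset is simply a list of sequence identifiers and that a hypothetical cluster table plays the role of the database correlating identifiers with syngeneic biopolymers. The "first identifier seen wins" rule and the names `merge_and_reduce`, `LIST_A`, `LIST_B` and `CLUSTER_DB` are illustrative assumptions, not taken from the patent.

```python
# Two hypothetical identifier lists, e.g. from separate array designs.
LIST_A = ["AA001", "AA003"]
LIST_B = ["AA002", "AA003", "AA004"]

# Hypothetical database: sequence identifier -> gene cluster.
CLUSTER_DB = {"AA001": "G1", "AA002": "G1", "AA003": "G2", "AA004": "G3"}


def merge_and_reduce(datasets, clusters):
    """Merge identifier lists, keeping one representative per gene cluster."""
    representatives = {}
    for dataset in datasets:
        for sid in dataset:
            cluster = clusters.get(sid, sid)  # unknown identifiers stand alone
            # Assumed rule: the first identifier seen for a cluster is kept.
            representatives.setdefault(cluster, sid)
    # Return the table rows sorted by sequence identifier.
    return sorted(representatives.values())


merged = merge_and_reduce([LIST_A, LIST_B], CLUSTER_DB)
```

Here the sum of the two lists holds five entries, while `merged` keeps one identifier for each of the three hypothetical clusters.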
18. A computer-based system for creating a targeted collection of sequences from a dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences and linked to corresponding first annotations, the system comprising:
a) an integration function which merges the dataset with a database comprising second annotations attributable to and correlated with at least a subset of the sequence identifiers or sequences of the dataset and which links the second annotations to the corresponding sequence identifiers of the subset; and
b) a tabulation function which creates and outputs the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the subset and the second annotations.
Dependent claims: 19
20. A computer-based method for creating a targeted collection of sequences from a dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences and linked to corresponding first annotations, the method comprising computer-implemented steps of:
a) merging the dataset with a database comprising second annotations attributable to and correlated with at least a subset of the sequence identifiers or sequences of the dataset and linking the second annotations to the corresponding sequence identifiers of the subset; and
b) creating and outputting the targeted collection of sequences in the form of a data table comprising, configurable by and sortable by the sequence identifiers of the subset and the second annotations.
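The integration step amounts to joining the dataset's first annotations with second annotations from another source, keyed on the sequence identifier. The sketch below is an assumption-laden illustration only: the field names (`description`, `pathway`), the sample records and the helper names `integrate` and `sort_by` are hypothetical, and it tabulates only the subset of identifiers for which a second annotation exists.

```python
# Hypothetical dataset with first annotations.
FIRST = {
    "AA001": {"description": "protein kinase C alpha"},
    "AA003": {"description": "tyrosine kinase receptor"},
    "AA004": {"description": "ribosomal protein L5"},
}

# Hypothetical second-annotation database covering a subset of identifiers.
SECOND = {
    "AA001": {"pathway": "signal transduction"},
    "AA004": {"pathway": "translation"},
}


def integrate(dataset, second_annotations):
    """Link second annotations to the identifiers they are correlated with."""
    rows = []
    for sid, first_annotation in dataset.items():
        if sid in second_annotations:  # only the annotated subset is tabulated
            rows.append({"id": sid, **first_annotation, **second_annotations[sid]})
    return rows


def sort_by(rows, column):
    """Sort the table by any column, including a second-annotation column."""
    return sorted(rows, key=lambda row: row[column])


annotated = integrate(FIRST, SECOND)
by_pathway = sort_by(annotated, "pathway")
```

A design that kept unannotated identifiers with empty second-annotation cells would be an equally plausible reading. The inner-join behavior here is simply the shorter one to show.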
Specification