Methods of populating data structures for use in evolutionary simulations
First Claim
1. A computer program product comprising a non-transitory computer readable medium, on which is stored instructions for:
i) encoding two or more biological molecules into initial character strings to provide a collection of two or more different initial character strings wherein each of said biological molecules comprises at least ten subunits;
ii) selecting at least two substrings from said initial character strings;
iii) concatenating said substrings to form one or more product strings about the same length as one or more of the initial character strings;
iv) determining sequence identities of at least one of the product strings relative to at least one initial character string;
v) selecting one or more product biological molecules for synthesis, wherein the one or more product biological molecules are encoded by one or more of the product strings;
vi) adding additional initial character strings to the collection of the two or more different initial character strings, wherein the additional initial character strings encode segments of a subset of the one or more product biological molecules selected in v); and
vii) repeating operations ii)-v) using the collection of initial character strings, which now contains the added additional initial character strings.
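For illustration only, operations i) through vii) might be sketched in Python as below. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`encode`, `recombine`, `identity`, `evolve`), the single-crossover way of choosing substrings, the ungapped position-wise identity measure, the "keep the highest-identity products" selection rule, and the fixed segment length are all hypothetical choices made here, since the claim does not specify them.

```python
import random

def encode(molecule):
    """Step i: encode a molecule as a character string (assumed one-letter codes)."""
    seq = molecule.upper()
    if len(seq) < 10:
        raise ValueError("each molecule must comprise at least ten subunits")
    return seq

def recombine(strings, rng):
    """Steps ii-iii: pick a prefix of one string and a suffix of another, then concatenate."""
    a, b = rng.sample(strings, 2)
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]  # same length as b, i.e. about the length of a parent

def identity(x, y):
    """Step iv: fraction of matching positions over the shorter string (no alignment)."""
    n = min(len(x), len(y))
    return sum(x[i] == y[i] for i in range(n)) / n

def evolve(molecules, rounds=3, products_per_round=20, keep=2, segment_len=10, seed=0):
    rng = random.Random(seed)
    pool = [encode(m) for m in molecules]
    selected = []
    for _ in range(rounds):  # step vii: repeat with the enlarged collection
        products = [recombine(pool, rng) for _ in range(products_per_round)]
        # Score each product by its best identity to any string in the collection.
        scored = [(max(identity(p, s) for s in pool), p) for p in products]
        scored.sort(reverse=True)
        # Step v: hypothetical rule, keep the top-scoring products "for synthesis".
        selected = [p for _, p in scored[:keep]]
        # Step vi: add a segment of each selected product back to the collection.
        for p in selected:
            start = rng.randint(0, len(p) - segment_len)
            pool.append(p[start:start + segment_len])
    return pool, selected

pool, selected = evolve(["MKTAYIAKQRQISFVKSHFSRQ", "MKSAYLAKQRNISFVRSHFTRE"])
```

With the defaults above, the collection grows by `keep` segments per round. In a real workflow the selection in step v would presumably depend on criteria outside this sketch, such as predicted or measured properties of the encoded molecules.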
Abstract
This invention provides novel methods of populating data structures for use in evolutionary modeling. In particular, this invention provides methods of populating a data structure with a plurality of character strings. The methods involve encoding two or more biological molecules into character strings to provide a collection of two or more different initial character strings; selecting at least two substrings from the pool of character strings; concatenating the substrings to form one or more product strings about the same length as one or more of the initial character strings; adding the product strings to a collection of strings; and optionally repeating this process using one or more of the product strings as an initial string in the collection of initial character strings.
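The population step summarized in the abstract can be pictured with a short Python sketch. It assumes a plain list as the data structure, a single random crossover point as the way substrings are chosen, and a target collection size as the stopping condition; none of these details, nor the name `populate`, come from the abstract itself.

```python
import random

def populate(initial_strings, target_size, seed=0, max_attempts=10_000):
    """Grow a collection of strings by recombining strings already in it."""
    rng = random.Random(seed)
    collection = list(dict.fromkeys(initial_strings))  # drop duplicates, keep order
    if len(collection) < 2:
        raise ValueError("need at least two different initial strings")
    attempts = 0
    while len(collection) < target_size and attempts < max_attempts:
        attempts += 1
        # Products added earlier can be drawn here as "initial" strings.
        a, b = rng.sample(collection, 2)
        cut = rng.randint(1, min(len(a), len(b)) - 1)  # assumes every string has length >= 2
        product = a[:cut] + b[cut:]  # same length as b
        if product not in collection:
            collection.append(product)
    return collection

library = populate(["AAAAAAAAAA", "BBBBBBBBBB"], target_size=8)
```

Because product strings are appended to the same list that parents are drawn from, later iterations can recombine earlier products, which corresponds to the optional repetition described in the abstract.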
39 Claims
2. A method of identifying molecules represented by concatenated strings, said method comprising:
i) encoding two or more biological molecules into a data structure of initial character strings to provide a collection of two or more different initial character strings, wherein each of the two or more biological molecules comprises at least 10 subunits;
ii) selecting at least two substrings from said initial character strings;
iii) concatenating the at least two selected substrings to form one or more product strings; and
iv) obtaining one or more product biological molecules, wherein the one or more product biological molecules are encoded by one or more of the product strings having greater than a predefined value of sequence identity with at least one initial string.
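The identity condition in operation iv) of claim 2 can be illustrated with a small Python filter. This is a sketch under assumptions: sequence identity is taken here as the percentage of matching positions between equal-length strings without alignment or gaps, and the 70% default threshold and the function names are hypothetical, since the claim only refers to "a predefined value".

```python
def percent_identity(product, reference):
    """Percentage of positions at which two equal-length strings agree."""
    if len(product) != len(reference) or not product:
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(product, reference))
    return 100.0 * matches / len(product)

def passes_threshold(product, initial_strings, threshold=70.0):
    """True if the product exceeds the threshold against at least one initial string."""
    return any(
        len(s) == len(product) and percent_identity(product, s) > threshold
        for s in initial_strings
    )

def filter_products(products, initial_strings, threshold=70.0):
    """Keep only the product strings that satisfy the identity condition."""
    return [p for p in products if passes_threshold(p, initial_strings, threshold)]

initial = ["ACDEFGHIKL", "MNPQRSTVWY"]
products = ["ACDEFGHIKY", "ACDEFRSTVW"]
kept = filter_products(products, initial)
```

Here the first product matches the first initial string at 9 of 10 positions and is kept, while the second reaches at most 50% identity with either initial string and is dropped. An alignment-based identity measure would be a natural substitute when strings differ in length.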
Specification