Synthetic genes
First Claim
1. A synthetic gene encoding a polypeptide segment that corresponds to a reference polypeptide segment encoded by a naturally occurring gene, wherein the polypeptide segment-encoding sequence of the synthetic gene is different from the polypeptide segment-encoding sequence of said naturally occurring gene, wherein a) said polypeptide segment-encoding sequence of said synthetic gene is less than about 90% identical to said polypeptide segment-encoding sequence of said naturally occurring gene, and/or b) said polypeptide segment-encoding sequence of said synthetic gene comprises at least one unique restriction site that is not present or is not unique in the polypeptide segment-encoding sequence of said naturally occurring gene, and/or c) said polypeptide segment-encoding sequence of said synthetic gene is free from at least one restriction site that is present in the polypeptide segment-encoding sequence of said naturally occurring gene.
Abstract
The invention provides strategies, methods, vectors, reagents, and systems for production of synthetic genes, production of libraries of such genes, and manipulation and characterization of the genes and corresponding encoded polypeptides. In one aspect, the synthetic genes can encode polyketide synthase polypeptides and facilitate production of therapeutically or commercially important polyketide compounds.
60 Citations
64 Claims
-
1. A synthetic gene encoding a polypeptide segment that corresponds to a reference polypeptide segment encoded by a naturally occurring gene, wherein the polypeptide segment-encoding sequence of the synthetic gene is different from the polypeptide segment-encoding sequence of said naturally occurring gene, wherein
a) said polypeptide segment-encoding sequence of said synthetic gene is less than about 90% identical to said polypeptide segment-encoding sequence of said naturally occurring gene, and/or b) said polypeptide segment-encoding sequence of said synthetic gene comprises at least one unique restriction site that is not present or is not unique in the polypeptide segment-encoding sequence of said naturally occurring gene, and/or c) said polypeptide segment-encoding sequence of said synthetic gene is free from at least one restriction site that is present in the polypeptide segment-encoding sequence of said naturally occurring gene.
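The three alternative conditions of claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the position-by-position identity measure (which assumes the two coding sequences have equal length, as with a purely synonymous recoding), the exact 90% cutoff standing in for "about 90%", and the example sequences are all assumptions made for illustration. Only palindromic recognition sequences are used, so searching one strand is sufficient.

```python
def percent_identity(a: str, b: str) -> float:
    """Position-by-position identity of two equal-length coding sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def count_site(seq: str, site: str) -> int:
    """Count (possibly overlapping) occurrences of a recognition sequence."""
    count, start = 0, seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def claim1_conditions(synthetic, natural, sites, threshold=90.0):
    """Evaluate conditions a), b) and c) for a pair of coding sequences.

    `sites` maps enzyme names to palindromic recognition sequences.
    `threshold` is an assumed stand-in for "about 90%".
    """
    a = percent_identity(synthetic, natural) < threshold
    # b) a site that is unique in the synthetic sequence but absent or
    #    non-unique in the natural sequence
    b = any(count_site(synthetic, s) == 1 and count_site(natural, s) != 1
            for s in sites.values())
    # c) a site present in the natural sequence but absent from the synthetic one
    c = any(count_site(natural, s) > 0 and count_site(synthetic, s) == 0
            for s in sites.values())
    return {"a": a, "b": b, "c": c}


# Hypothetical example: both sequences encode Met-Glu-Phe-Lys.
natural = "ATGGAATTCAAA"    # contains one EcoRI site (GAATTC)
synthetic = "ATGGAGTTTAAG"  # synonymous recoding without the EcoRI site
sites = {"EcoRI": "GAATTC", "KpnI": "GGTACC"}
result = claim1_conditions(synthetic, natural, sites)
```

In this example the sequences are 75% identical, so conditions a) and c) hold while b) does not.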
-
7. A synthetic gene encoding a polypeptide segment that corresponds to a reference polypeptide segment encoded by a naturally occurring PKS gene, wherein the polypeptide segment-encoding sequence of the synthetic gene is different from the polypeptide segment-encoding sequence of said naturally occurring PKS gene and comprises at least two of:
-
a) a Spe I site near the sequence encoding the amino-terminus of the module;
b) a Mfe I site near the sequence encoding the amino-terminus of a KS domain;
c) a Kpn I site near the sequence encoding the carboxy-terminus of a KS domain;
d) a Msc I site near the sequence encoding the amino-terminus of an AT domain;
e) a Pst I site near the sequence encoding the carboxy-terminus of an AT domain;
f) a BsrB I site near the sequence encoding the amino-terminus of an ER domain;
g) an Age I site near the sequence encoding the amino-terminus of a KR domain;
h) an Xba I site near the sequence encoding the amino-terminus of an ACP domain.
-
16. A gene library comprising a plurality of different PKS module-encoding genes, wherein the module-encoding genes in the library have at least one restriction site in common, said restriction site is found no more than one time in each module, and the modules encoded in said library correspond to modules from five or more different polyketide synthase proteins.
-
22. A cloning vector comprising, in the order shown,
a) SM4-SIS-SM2-R1 or b) L-SIS-SM2-R1 where SIS is a synthon insertion site, SM2 is a sequence encoding a first selectable marker, SM4 is a sequence encoding a second selectable marker different from the first, R1 is a recognition site for a restriction enzyme, and L is a recognition site for a different restriction enzyme.
-
27. A vector comprising
a) SM4-2S1-Sy1-2S2-SM2-R1 or b) L-2S1-Sy2-2S2-SM2-R1 where 2S1 is a recognition site for a first Type IIS restriction enzyme, where 2S2 is a recognition site for a different Type IIS restriction enzyme, and Sy is a synthon coding region.
-
30. A composition comprising a cognate pair of vectors, wherein said cognate pairs are:
-
a) a first vector comprising SM4-2S1-Sy1-2S2-SM2-R1 digested with a Type IIS restriction enzyme that recognizes 2S2, and
a second vector comprising SM5-2S3-Sy2-2S4-SM3-R1 digested with a Type IIS restriction enzyme that recognizes 2S3;
or b) a first vector comprising L-2S1-Sy1-2S2-SM2-R1 digested with a Type IIS restriction enzyme that recognizes 2S2, and
a second vector comprising L′-2S3-Sy2-2S4-SM3-R1 digested with a Type IIS restriction enzyme that recognizes 2S3;
wherein SM1, SM2, SM3, SM4 are sequences encoding different selection markers, R1 is a recognition site for a restriction enzyme, L and L′ are recognition sites that are the same or different, and each different from R1; 2S1, 2S2, 2S3, and 2S4 are recognition sites for Type IIS restriction enzymes, wherein 2S1 and 2S2 are not the same, 2S3 and 2S4 are not the same, and digestion of the first vector with 2S2 and the second vector with 2S3 results in compatible ends.
-
-
33. A vector comprising a first selectable marker, a restriction site (R1) recognized by a first restriction enzyme, and a synthon coding region flanked by a restriction site recognized by a first Type IIS restriction enzyme and a restriction site recognized by a second Type IIS restriction enzyme
wherein digestion of the vector with said first restriction enzyme and said first Type IIS restriction enzyme produces a fragment comprising said first selectable marker and said synthon coding region, and digestion of the vector with said first restriction enzyme and said second Type IIS restriction enzyme produces a fragment comprising said synthon coding region and not comprising said first selectable marker.
-
34. A method for joining a series of DNA units using a vector pair comprising
a) providing a first set of DNA units, each in a first-type selectable vector comprising a first selectable marker and providing a second set of DNA units, each in a second-type selectable vector comprising a second selectable marker different from the first, wherein said first-type and second-type selectable vectors can be selected based on the different selectable markers;
b) recombinantly joining a DNA unit from the first set with an adjacent DNA unit from the second set to generate a first-type selectable vector comprising a third DNA unit, and obtaining a desired clone by selecting for the first selectable marker;
c) recombinantly joining the third DNA unit with an adjacent DNA unit from the second set to generate a first-type selectable vector comprising a fourth DNA unit, and obtaining a desired clone by selecting for the first selectable marker, or recombinantly joining the third DNA unit with an adjacent DNA unit from the second series to generate a second-type selectable vector comprising a fourth DNA unit, and obtaining a desired clone by selecting for the second selectable marker.
-
38. A method for joining several DNA units in sequence, said method comprising
a) carrying out a first round of stitching comprising ligating an acceptor vector fragment comprising a first synthon SA0, a ligatable end LA0 at the junction end of synthon SA0 and an adjacent synthon SD0, and another ligatable end la0, and a donor vector fragment comprising a second synthon SD0, a ligatable end LD0 at the junction end of synthon SD0 and synthon SA0, wherein LD0 and LA0 are compatible, another ligatable end ld0, wherein ld0 and la0 are compatible, and a selectable marker, wherein LA0 and LD0 are ligated and la0 and ld0 are ligated, thereby joining said first and second synthons, and thereby generating a first vector comprising synthon coding sequence S1;
b) selecting for said first vector by selecting for the selectable marker in (a); and
c) carrying out a number n additional rounds of stitching, wherein n is an integer from 1 to 20, wherein Sn is the synthon coding sequence generated by joining synthons in the previous round of stitching, and wherein each round n of stitching comprises:
1) designating said first or a subsequent vector as either an acceptor vector An or a donor vector Dn;
2) digesting acceptor vector An with restriction enzymes to produce an acceptor vector fragment comprising a synthon coding sequence Sn, a ligatable end LAn at the junction end of synthon Sn and an adjacent synthon SDn+1, and another ligatable end lan; and
ligating the acceptor vector fragment to a donor vector fragment comprising synthon SDn+1, a ligatable end LDn+1 at the junction end of synthon SDn+1 and synthon Sn, wherein LAn and LDn+1 are compatible, another ligatable end ldn+1, wherein lan and ldn+1 are compatible, and a selectable marker, wherein LAn and LDn+1 are ligated and lan and ldn+1 are ligated, thereby generating a subsequent vector, or
digesting donor vector Dn with restriction enzymes to produce a donor vector fragment comprising a synthon coding sequence Sn, a ligatable end LDn at the junction end of synthon Sn and an adjacent synthon SAn+1, another ligatable end ldn, and a selectable marker; and
ligating the donor vector fragment to an acceptor vector fragment comprising synthon SAn+1, a ligatable end LAn+1 at the junction end of synthon SAn+1 and synthon Sn, and another ligatable end lan+1, wherein LAn+1 and LDn are compatible and are ligated and lan+1 and ldn are compatible and are ligated, thereby generating a subsequent vector;
d) selecting the subsequent vector by selecting for the selectable marker of said donor vector fragment of step (c);
e) repeating steps (c) and (d) n−1 times thereby producing a multisynthon.
-
-
44. A method for making a synthetic gene encoding a PKS module, comprising
(i) producing a plurality of DNA units by assembly PCR, wherein each DNA unit encodes a portion of said PKS module; (ii) combining said plurality of DNA units in a predetermined sequence to produce a PKS module-encoding gene.
-
46. A method for identifying restriction enzyme recognition sites useful for design of synthetic genes, comprising the steps of
obtaining amino acid sequences for a plurality of functionally related polypeptide segments;
reverse-translating said amino acid sequences to produce multiple polypeptide segment-encoding nucleic acid sequences for each polypeptide segment;
identifying restriction enzyme recognition sites that are found in at least one polypeptide segment-encoding nucleic acid sequence of at least about 50% of said polypeptide segments.
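The site-identification method of claim 46 can be sketched roughly as follows. This Python example is an illustration under stated assumptions, not the patent's procedure: instead of listing every reverse translation, it enumerates synonymous codons over short sliding windows and asks whether any encoding of a segment can contain the site. The standard genetic code, the helper names `can_encode` and `common_sites`, the exact 0.5 cutoff standing in for "at least about 50%", and the example peptides are all assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def can_encode(protein: str, site: str) -> bool:
    """True if at least one reverse translation of `protein` contains `site`."""
    # Number of codons a site can span in the least favourable reading frame.
    window = (len(site) + 4) // 3
    for i in range(max(1, len(protein) - window + 1)):
        choices = [AA_TO_CODONS[aa] for aa in protein[i:i + window]]
        if any(site in "".join(combo) for combo in product(*choices)):
            return True
    return False


def common_sites(segments, sites, min_fraction=0.5):
    """Sites that can be encoded in at least `min_fraction` of the segments."""
    found = {}
    for name, site in sites.items():
        fraction = sum(can_encode(p, site) for p in segments) / len(segments)
        if fraction >= min_fraction:
            found[name] = fraction
    return found


# Hypothetical example segments (not real PKS sequences).
segments = ["MEFK", "AEFG", "MKVL"]
sites = {"EcoRI": "GAATTC", "KpnI": "GGTACC"}
shared = common_sites(segments, sites)
```

Here EcoRI can be placed in two of the three example segments (the Glu-Phe dipeptide can be written GAA TTC), while KpnI fits none of them, so only EcoRI passes the cutoff.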
-
-
49. A method for high throughput synthesis of a plurality of different DNA units comprising different polypeptide encoding sequences comprising:
for each DNA unit, performing polymerase chain reaction (PCR) amplification of a plurality of overlapping oligonucleotides to generate a DNA unit encoding a polypeptide segment and adding UDG-containing linkers to the 5′ and 3′ ends of the DNA unit by PCR amplification, thereby generating a linkered DNA unit, wherein the same UDG-containing linkers are added to said different DNA units.
-
51. A method for designing a synthetic gene, the method comprising the steps of:
-
providing a reference amino acid sequence;
reverse translating the amino acid sequence to a randomized nucleotide sequence which encodes the amino acid sequence using a random selection of codons which have been, optionally, optimized for a codon preference of a host organism;
providing one or more parameters for positions of restriction sites on a sequence of the synthetic gene;
removing occurrences of one or more selected restriction sites from the randomized nucleotide sequence; and
inserting one or more selected restriction sites at selected positions in the randomized nucleotide sequence to generate a sequence of the synthetic gene.
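The design steps of claim 51 can be sketched in Python as follows. This is a minimal illustration, not the patent's software. It assumes the standard genetic code and uniform random codon choice (no host codon-preference weighting), treats "positions" as codon indices, and uses hypothetical function names and example inputs throughout. As an added safeguard that the claim does not recite, the insertion step rejects codon choices that would reintroduce a removed site.

```python
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def translate(dna):
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reverse_translate(protein, rng):
    """Pick one random synonymous codon per residue."""
    return [rng.choice(AA_TO_CODONS[aa]) for aa in protein]


def remove_sites(codons, sites, rng, max_tries=1000):
    """Re-draw codons overlapping unwanted sites until none remain."""
    codons = list(codons)
    for _ in range(max_tries):
        seq = "".join(codons)
        hits = [(seq.find(s), s) for s in sites if s in seq]
        if not hits:
            return codons
        pos, site = hits[0]
        idx = rng.randrange(pos // 3, (pos + len(site) - 1) // 3 + 1)
        codons[idx] = rng.choice(AA_TO_CODONS[CODON_TO_AA[codons[idx]]])
    raise RuntimeError("could not remove all unwanted sites")


def insert_site(codons, site, codon_index, forbidden=()):
    """Recode the 3-codon window at `codon_index` so it contains `site`."""
    window = codons[codon_index:codon_index + 3]
    choices = [AA_TO_CODONS[CODON_TO_AA[c]] for c in window]
    for combo in product(*choices):
        candidate = (codons[:codon_index] + list(combo)
                     + codons[codon_index + 3:])
        full = "".join(candidate)
        if site in "".join(combo) and not any(f in full for f in forbidden):
            return candidate
    raise ValueError("site cannot be encoded at this position")


def design_gene(protein, remove, insert, seed=0):
    """remove: unwanted sites; insert: (site, codon_index) pairs."""
    rng = random.Random(seed)
    codons = reverse_translate(protein, rng)
    codons = remove_sites(codons, remove, rng)
    for site, idx in insert:
        codons = insert_site(codons, site, idx, forbidden=remove)
    return "".join(codons)


# Hypothetical example: remove KpnI (GGTACC), add EcoRI (GAATTC) at codon 2.
gene = design_gene("MKEFGTLS", remove=["GGTACC"], insert=[("GAATTC", 2)])
```

The resulting sequence still translates to the reference peptide, contains the EcoRI site across the Glu-Phe codons, and is free of KpnI. Supporting a host codon preference would amount to replacing the uniform `rng.choice` with a weighted draw.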
-
-
57. A system for designing a synthetic gene, including a computer processor configured to:
-
provide a reference amino acid sequence;
reverse translate the amino acid sequence to a randomized nucleotide sequence which encodes the amino acid sequence using a random selection of codons which have been, optionally, optimized for a codon preference of a host organism;
provide one or more parameters for positions of restriction sites on a sequence of the synthetic gene;
remove occurrences of one or more selected restriction sites from the randomized nucleotide sequence;
insert one or more selected restriction sites at selected positions in the randomized nucleotide sequence to generate a sequence of the synthetic gene; and
generate a set of overlapping oligonucleotide sequences which together comprise a sequence of the synthetic gene.
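The final step of claim 57, generating overlapping oligonucleotides that together make up the gene sequence, might look like the following Python sketch. The oligo length, overlap, the alternating top/bottom strand layout and the function names are assumptions chosen for illustration; the claim itself does not specify them.

```python
def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overlapping_oligos(seq: str, length: int = 40, overlap: int = 20):
    """Tile `seq` with oligos that overlap their neighbours by `overlap` nt.

    Even-numbered oligos are top strand; odd-numbered oligos are returned as
    the reverse complement (bottom strand), an assumed layout for assembly PCR.
    """
    if not 0 < overlap < length:
        raise ValueError("overlap must be positive and shorter than length")
    step = length - overlap
    oligos, start = [], 0
    while True:
        chunk = seq[start:start + length]
        oligos.append(chunk if len(oligos) % 2 == 0
                      else reverse_complement(chunk))
        if start + length >= len(seq):
            break
        start += step
    return oligos


def reassemble(oligos, overlap: int = 20) -> str:
    """Rebuild the top strand from the tiled oligos (sanity check only)."""
    top = [o if i % 2 == 0 else reverse_complement(o)
           for i, o in enumerate(oligos)]
    return top[0] + "".join(o[overlap:] for o in top[1:])


# Hypothetical 100-nt design sequence.
design = "ATGGCTAGCAAGGAGTTCGGTACTCTGTCC" * 3 + "GATTACAGAT"
oligos = overlapping_oligos(design, length=40, overlap=20)
```

With these settings the 100-nt example is covered by four 40-mers, and `reassemble(oligos)` returns the original sequence.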
-
-
58. A computer readable storage medium containing computer executable code for designing a synthetic gene by instructing a computer to operate as follows:
-
provide a reference amino acid sequence;
reverse translate the amino acid sequence to a randomized nucleotide sequence which encodes the amino acid sequence using a random selection of codons which have been, optionally, optimized for a codon preference of a host organism;
provide one or more parameters for positions of restriction sites on a sequence of the synthetic gene;
remove occurrences of one or more selected restriction sites from the randomized nucleotide sequence;
insert one or more selected restriction sites at selected positions in the randomized nucleotide sequence to generate a sequence of the synthetic gene; and
generate a set of overlapping oligonucleotide sequences which together comprise a sequence of the synthetic gene.
-
-
59. A method for analyzing a nucleotide sequence of a synthon, the method comprising:
-
providing a sequence of a synthetic gene, wherein the synthetic gene is divided into a plurality of synthons;
providing sequences of a plurality of synthon samples wherein each synthon of the plurality of synthons is cloned in a vector;
providing a sequence of the vector without an insert;
eliminating vector sequences from the sequence of the cloned synthon;
constructing a contig map of sequences of the plurality of synthons;
aligning the contig map of sequences with the sequence of the synthetic gene; and
identifying a measure of alignment for each of the plurality of synthons.
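A heavily simplified Python sketch of the analysis steps in claim 59 is shown below. It assumes each clone is read as one clean sequence with the insert at a known position in the empty vector, skips contig construction entirely, and uses the similarity ratio from Python's `difflib.SequenceMatcher` as a stand-in for the "measure of alignment". The function names and example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def strip_vector(clone: str, vector: str, insert_pos: int) -> str:
    """Remove the flanking vector sequence from a cloned-synthon read.

    `insert_pos` is the assumed position in the empty vector where the
    synthon was inserted.
    """
    left, right = vector[:insert_pos], vector[insert_pos:]
    if (len(clone) < len(vector) or not clone.startswith(left)
            or not clone.endswith(right)):
        raise ValueError("read does not carry the expected vector flanks")
    return clone[len(left):len(clone) - len(right)]


def alignment_scores(gene, synthons, clones, vector, insert_pos):
    """Score each cloned synthon against its expected slice of the gene.

    synthons: name -> (start, end) coordinates within `gene`
    clones:   name -> sequence of the cloned synthon including vector
    """
    scores = {}
    for name, (start, end) in synthons.items():
        expected = gene[start:end]
        observed = strip_vector(clones[name], vector, insert_pos)
        matcher = SequenceMatcher(None, expected, observed, autojunk=False)
        scores[name] = matcher.ratio()
    return scores


# Hypothetical example: an 18-nt gene split into two 9-nt synthons.
vector = "GGGGCCCC"
gene = "ATGAAACCCGGGTTTTAA"
synthons = {"s1": (0, 9), "s2": (9, 18)}
clones = {
    "s1": "GGGG" + "ATGAAACCC" + "CCCC",  # perfect clone
    "s2": "GGGG" + "GGGTTTTAT" + "CCCC",  # one substitution at the last base
}
scores = alignment_scores(gene, synthons, clones, vector, insert_pos=4)
```

A score of 1.0 indicates a clone that matches its expected synthon exactly; the second clone scores lower because of the substitution.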
-
-
61. A system for high through-put synthesis of synthetic genes comprising:
-
at least one source microwell plate containing oligonucleotides for assembly PCR;
a source for an assembly PCR amplification mixture;
a source for LIC extension primer mixture;
at least one PCR microwell plate for amplification of oligonucleotides;
a liquid handling device which
retrieves a plurality of predetermined sets of oligonucleotides from the source microwell plate(s);
combines the predetermined sets and the amplification mixture in wells of the at least one PCR microwell plate;
retrieves LIC extension primer mixture; and
combines the LIC extension primer mixture and amplicons in a well of the at least one PCR microwell plate; and
a heat source for PCR amplification configured to accept the at least one PCR microwell plate.
-
-
63. An open reading frame vector having a structure selected from
a) Internal type: 4-[7-*]-[*-8]-3;
b) Left-edge type: 4-[7-1]-[*-8]-3; and
c) Right-edge type: 4-[7-*]-[6-8]-3;
wherein 7 and 8 are recognition sites for Type IIS restriction enzymes which cut to produce compatible overhangs “*”;
1 and 6 are Type II restriction enzyme sites that are optionally present; and
3 and 4 are recognition sites for restriction enzymes with 8-basepair recognition sites.
Specification