Rationally Designed, Synthetic Antibody Libraries and Uses Therefor
Abstract
The present invention overcomes the inadequacies inherent in the known methods for generating libraries of antibody-encoding polynucleotides by specifically designing the libraries with directed sequence and length diversity. The libraries are designed to reflect the preimmune repertoire naturally created by the human immune system and are based on rational design informed by examination of publicly available databases of human antibody sequences.
72 Claims
-
1. A library of synthetic polynucleotides, wherein said polynucleotides encode at least 10⁶ unique antibody CDRH3 amino acid sequences comprising:
-
(i) an N1 amino acid sequence of 0 to about 3 amino acids, wherein each amino acid of the N1 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N1 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; (ii) a human CDRH3 DH amino acid sequence, N- and C-terminal truncations thereof, or a sequence of at least about 80% identity to any of them; (iii) an N2 amino acid sequence of 0 to about 3 amino acids, wherein each amino acid of the N2 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N2 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; and (iv) a human CDRH3 H3-JH amino acid sequence, N-terminal truncations thereof, or a sequence of at least about 80% identity to any of them. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 40, 41, 42, 43, 44, 45, 46, 47, 60, 67, 68, 70, 71, 72)
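Elements (ii) and (iv) refer to N- and C-terminal truncations of a segment. As a minimal sketch of what enumerating such truncations could look like, the Python below treats a segment as a plain string of one-letter residue codes. The function names, the `min_len` parameter, and the three-residue segment `GYS` are hypothetical and do not come from the claims; `AEYFQH` is one of the H3-JH sequences recited in claim 26.

```python
def n_terminal_truncations(seq, min_len=1):
    """Return seq followed by each version with residues removed from the N-terminus."""
    return [seq[i:] for i in range(len(seq) - min_len + 1)]


def n_and_c_terminal_truncations(seq, min_len=1):
    """Return every contiguous fragment of seq (residues removed from either end)."""
    fragments = []
    for start in range(len(seq)):
        # Walk the C-terminal end inward, longest fragment first.
        for end in range(len(seq), start + min_len - 1, -1):
            fragments.append(seq[start:end])
    # dict.fromkeys drops duplicates while preserving order.
    return list(dict.fromkeys(fragments))


# H3-JH segment taken from the list recited in claim 26.
jh_variants = n_terminal_truncations("AEYFQH")

# "GYS" is a hypothetical stand-in for a DH segment, used only for illustration.
dh_variants = n_and_c_terminal_truncations("GYS")
```

Under these assumptions, `jh_variants` reproduces the run `AEYFQH, EYFQH, YFQH, FQH, QH, H` that appears in the H3-JH list of claim 26.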
-
-
23. A library of synthetic polynucleotides, wherein said library has a theoretical total diversity of N unique CDRH3 sequences, wherein N is about 10⁶ to about 10¹⁵; and
wherein the physical realization of the theoretical total CDRH3 diversity has a size of at least about 3N, thereby providing a probability of at least about 95% that any individual CDRH3 sequence contained within the theoretical total diversity of the library is present in the actual library.
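The link between a physical size of 3N and a probability of about 95% can be checked numerically. The sketch below assumes that each physical library member is an independent, uniform draw from the N theoretical sequences; that sampling model is an assumption made here for illustration and is not stated in the claim. Under it, the chance that a given sequence is absent after 3N draws is (1 - 1/N)^(3N), which is close to e^(-3), or roughly 5%. The function names are hypothetical.

```python
import math


def coverage_probability(oversampling):
    """Large-N approximation: probability a given sequence appears at least once
    when the physical library holds oversampling * N uniform, independent draws."""
    return 1.0 - math.exp(-oversampling)


def exact_coverage_probability(n_theoretical, n_physical):
    """Same probability without the large-N approximation."""
    return 1.0 - (1.0 - 1.0 / n_theoretical) ** n_physical


approx = coverage_probability(3)
exact = exact_coverage_probability(10**6, 3 * 10**6)
print(f"approximate: {approx:.4f}, exact for N = 10^6: {exact:.4f}")
```

Both values come out just above 0.95 under the stated sampling assumption, which is consistent with the threshold recited in the claim.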
-
24. A library of synthetic polynucleotides, wherein said polynucleotides encode at least about 10⁶ unique antibody CDRH3 amino acid sequences comprising:
-
(i) an N1 amino acid sequence of 0 to about 3 amino acids, wherein; (a) the most N-terminal N1 amino acid, if present, is selected from a group consisting of R, G, P, L, S, A, V, K, I, Q, T and D; (b) the second most N-terminal N1 amino acid, if present, is selected from a group consisting of G, P, R, S, L, V, E, A, D, I, T and K; and (c) the third most N-terminal N1 amino acid, if present, is selected from the group consisting of G, R, P, S, L, A, V, T, E, D, K and F; (ii) a human CDRH3 DH amino acid sequence, N- and C-terminal truncations thereof, or a sequence of at least about 80% identity to any of them; (iii) an N2 amino acid sequence of 0 to about 3 amino acids, wherein; (a) the most N-terminal N2 amino acid, if present, is selected from a group consisting of G, P, R, L, S, A, T, V, E, D, F and H; (b) the second most N-terminal N2 amino acid, if present, is selected from a group consisting of G, P, R, S, T, L, A, V, E, Y, D and K; and (c) the third most N-terminal N2 amino acid, if present, is selected from the group consisting of G, P, S, R, L, A, T, V, D, E, W and Q; and (iv) a human CDRH3 H3-JH amino acid sequence, N-terminal truncations thereof, or a sequence of at least about 80% identity to any of them. - View Dependent Claims (25)
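The position-specific residue sets of claim 24 lend themselves to a simple membership check. The sketch below is one possible way to test whether a candidate N1 or N2 segment satisfies those sets. The residue strings are transcribed from the claim; the names `N1_ALLOWED`, `N2_ALLOWED`, and `is_allowed_segment` are hypothetical, and reading "0 to about 3" as a hard limit of three residues is a simplifying assumption.

```python
# Allowed residues per position, most N-terminal position first (from claim 24).
N1_ALLOWED = ("RGPLSAVKIQTD", "GPRSLVEADITK", "GRPSLAVTEDKF")
N2_ALLOWED = ("GPRLSATVEDFH", "GPRSTLAVEYDK", "GPSRLATVDEWQ")


def is_allowed_segment(segment, allowed_by_position):
    """True if segment has at most len(allowed_by_position) residues and every
    residue is in the allowed set for its position. An empty segment passes,
    matching the '0 amino acids' case."""
    if len(segment) > len(allowed_by_position):
        return False  # assumption: treat "about 3" as a hard upper limit
    return all(aa in allowed for aa, allowed in zip(segment, allowed_by_position))


checks = {
    "N1 ''": is_allowed_segment("", N1_ALLOWED),
    "N1 RGG": is_allowed_segment("RGG", N1_ALLOWED),
    "N1 GGW": is_allowed_segment("GGW", N1_ALLOWED),  # W is not listed for N1 position 3
    "N2 GGW": is_allowed_segment("GGW", N2_ALLOWED),  # W is listed for N2 position 3
}
```

The same segment can pass for N2 and fail for N1, since the two sets differ at several positions.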
-
-
26. A library of synthetic polynucleotides, wherein said polynucleotides encode at least about 10⁶ unique antibody CDRH3 amino acid sequences that are at least about 80% identical to an amino acid sequence represented by the following formula:
-
[X]-[N1]-[DH]-[N2]-[H3-JH], wherein; (i) X is any amino acid residue or no amino acid residue; (ii) N1 is an amino acid sequence selected from the group consisting of G, P, R, A, S, L, T, V, GG, GP, GR, GA, GS, GL, GT, GV, PG, RG, AG, SG, LG, TG, VG, PP, PR, PA, PS, PL, PT, PV, RP, AP, SP, LP, TP, VP, GGG, GPG, GRG, GAG, GSG, GLG, GTG, GVG, PGG, RGG, AGG, SGG, LGG, TGG, VGG, GGP, GGR, GGA, GGS, GGL, GGT, GGV, D, E, F, H, I, K, M, Q, W, Y, AR, AS, AT, AY, DL, DT, EA, EK, FH, FS, HL, HW, IS, KV, LD, LE, LR, LS, LT, NR, NT, QE, QL, QT, RA, RD, RE, RF, RH, RL, RR, RS, RV, SA, SD, SE, SF, SI, SK, SL, SQ, SR, SS, ST, SV, TA, TR, TS, TT, TW, VD, VS, WS, YS, AAE, AYH, DTL, EKR, ISR, NTP, PKS, PRP, PTA, PTQ, REL, RPL, SAA, SAL, SGL, SSE, TGL, WGT, and combinations thereof; (iii) DH is an amino acid sequence selected from the group consisting of all possible reading frames that do not include a stop codon encoded by IGHD1-1, IGHD1-20, IGHD1-26, IGHD1-7, IGHD2-15, IGHD2-2, IGHD2-21, IGHD2-8, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3-3, IGHD3-9, IGHD4-17, IGHD4-23, IGHD4-4, IGHD-4-11, IGHD5-12, IGHD5-24, IGHD5-5, IGHD-5-18, IGHD6-13, IGHD6-19, IGHD6-25, IGHD6-6, and IGHD7-27, and N- and C-terminal truncations thereof, (iv) N2 is an amino acid sequence selected from the group consisting of G, P, R, A, S, L, T, V, GG, GP, GR, GA, GS, GL, GT, GV, PG, RG, AG, SG, LG, TG, VG, PP, PR, PA, PS, PL, PT, PV, RP, AP, SP, LP, TP, VP, GGG, GPG, GRG, GAG, GSG, GLG, GTG, GVG, PGG, RGG, AGG, SGG, LGG, TGG, VGG, GGP, GGR, GGA, GGS, GGL, GGT, GGV, D, E, F, H, I, K, M, Q, W, Y, AR, AS, AT, AY, DL, DT, EA, EK, FH, FS, HL, HW, IS, KV, LD, LE, LR, LS, LT, NR, NT, QE, QL, QT, RA, RD, RE, RF, RH, RL, RR, RS, RV, SA, SD, SE, SF, SI, SK, SL, SQ, SR, SS, ST, SV, TA, TR, TS, TT, TW, VD, VS, WS, YS, AAE, AYH, DTL, EKR, ISR, NTP, PKS, PRP, PTA, PTQ, REL, RPL, SAA, SAL, SGL, SSE, TGL, WGT, and combinations thereof, and (v) H3-JH is an amino acid sequence selected from the group consisting of AEYFQH, EYFQH, YFQH, FQH, QH, H, YWYFDL, WYFDL, YFDL, FDL, DL, L, AFDV, FDV, DV, V, YFDY, FDY, DY, Y, NWFDS, WFDS, FDS, DS, S, YYYYYGMDV, YYYYGMDV, YYYGMDV, YYGMDV, YGMDV, GMDV, and MDV, or a sequence of at least 80% identity to any of them. - View Dependent Claims (27)
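The formula [X]-[N1]-[DH]-[N2]-[H3-JH] describes a combinatorial concatenation of segments. A minimal sketch of that idea, assuming each segment pool is a list of one-letter residue strings, is shown below. The N1, N2, and H3-JH entries are small subsets of the lists recited in claim 26; the two DH strings and the choice of `A` for X are hypothetical placeholders, since the claim names DH gene segments without listing their translated sequences. The function name `assemble_cdrh3` is also hypothetical.

```python
from itertools import product
from math import prod


def assemble_cdrh3(x, n1, dh, n2, h3_jh):
    """Concatenate one choice from each pool, in formula order, and return the
    set of unique resulting amino acid sequences."""
    return {"".join(parts) for parts in product(x, n1, dh, n2, h3_jh)}


X = ["", "A"]                   # "" models "no amino acid residue"; "A" is an arbitrary example
N1 = ["G", "P", "GG"]           # subset of the N1 list in claim 26
DH = ["YSSGW", "DSSGY"]         # hypothetical placeholders, not taken from the claims
N2 = ["R", "SG"]                # subset of the N2 list in claim 26
H3_JH = ["YFDY", "FDY", "DY"]   # subset of the H3-JH list in claim 26

library = assemble_cdrh3(X, N1, DH, N2, H3_JH)
theoretical = prod(len(pool) for pool in (X, N1, DH, N2, H3_JH))
print(len(library), "unique sequences from", theoretical, "segment combinations")
```

Returning a set matters because different segment combinations can, in general, concatenate to the same amino acid sequence, so the number of unique sequences may be smaller than the number of combinations.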
-
-
28. A library of synthetic polynucleotides, wherein said library consists essentially of a plurality of polynucleotides encoding CDRH3 amino acid sequences that are at least about 80% identical to an amino acid sequence represented by the following formula:
-
[X]-[N1]-[DH]-[N2]-[H3-JH], wherein; (i) X is any amino acid residue or no amino acid residue; (ii) N1 is an amino acid sequence selected from the group consisting of G, P, R, A, S, L, T, V, GG, GP, GR, GA, GS, GL, GT, GV, PG, RG, AG, SG, LG, TG, VG, PP, PR, PA, PS, PL, PT, PV, RP, AP, SP, LP, TP, VP, GGG, GPG, GRG, GAG, GSG, GLG, GTG, GVG, PGG, RGG, AGG, SGG, LGG, TGG, VGG, GGP, GGR, GGA, GGS, GGL, GGT, GGV, D, E, F, H, I, K, M, Q, W, Y, AR, AS, AT, AY, DL, DT, EA, EK, FH, FS, HL, HW, IS, KV, LD, LE, LR, LS, LT, NR, NT, QE, QL, QT, RA, RD, RE, RF, RH, RL, RR, RS, RV, SA, SD, SE, SF, SI, SK, SL, SQ, SR, SS, ST, SV, TA, TR, TS, TT, TW, VD, VS, WS, YS, AAE, AYH, DTL, EKR, ISR, NTP, PKS, PRP, PTA, PTQ, REL, RPL, SAA, SAL, SGL, SSE, TGL, WGT, and combinations thereof, (iii) DH is an amino acid sequence selected from the group consisting of all possible reading frames that do not include a stop codon encoded by IGHD1-1, IGHD1-20, IGHD1-26, IGHD1-7, IGHD2-15, IGHD2-2, IGHD2-21, IGHD2-8, IGHD3-10, IGHD3-16, IGHD3-22, IGHD3-3, IGHD3-9, IGHD4-17, IGHD4-23, IGHD4-4, IGHD-4-11, IGHD5-12, IGHD5-24, IGHD5-5, IGHD-5-18, IGHD6-13, IGHD6-19, IGHD6-25, IGHD6-6, and IGHD7-27, and N- and C-terminal truncations thereof, (iv) N2 is an amino acid sequence selected from the group consisting of G, P, R, A, S, L, T, V, GG, GP, GR, GA, GS, GL, GT, GV, PG, RG, AG, SG, LG, TG, VG, PP, PR, PA, PS, PL, PT, PV, RP, AP, SP, LP, TP, VP, GGG, GPG, GRG, GAG, GSG, GLG, GTG, GVG, PGG, RGG, AGG, SGG, LGG, TGG, VGG, GGP, GGR, GGA, GGS, GGL, GGT, GGV, D, E, F, H, I, K, M, Q, W, Y, AR, AS, AT, AY, DL, DT, EA, EK, FH, FS, HL, HW, IS, KV, LD, LE, LR, LS, LT, NR, NT, QE, QL, QT, RA, RD, RE, RF, RH, RL, RR, RS, RV, SA, SD, SE, SF, SI, SK, SL, SQ, SR, SS, ST, SV, TA, TR, TS, TT, TW, VD, VS, WS, YS, AAE, AYH, DTL, EKR, ISR, NTP, PKS, PRP, PTA, PTQ, REL, RPL, SAA, SAL, SGL, SSE, TGL, WGT, and combinations thereof, and (v) H3-JH is an amino acid sequence selected from the group consisting of AEYFQH, EYFQH, YFQH, FQH, QH, H, YWYFDL, WYFDL, YFDL, FDL, DL, L, AFDV, FDV, DV, V, YFDY, FDY, DY, Y, NWFDS, WFDS, FDS, DS, S, YYYYYGMDV, YYYYGMDV, YYYGMDV, YYGMDV, YGMDV, GMDV, and MDV, or a sequence of at least 80% identity to any of them. - View Dependent Claims (29)
-
-
30. A library of synthetic polynucleotides, wherein said polynucleotides encode one or more antibody heavy chain amino acid sequences, and wherein the unique CDRH3 amino acid sequences of the heavy chain comprise:
-
(i) an N1 amino acid sequence of 0 to about 3 amino acids, wherein each amino acid of the N1 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N1 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; (ii) a human CDRH3 DH amino acid sequence, N- and C-terminal truncations thereof, or a sequence of at least about 80% identity to any of them; (iii) an N2 amino acid sequence of 0 to about 3 amino acids, wherein each amino acid of the N2 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N2 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; and (iv) a human CDRH3 H3-JH amino acid sequence, N-terminal truncations thereof, or a sequence of at least about 80% identity to any of them. - View Dependent Claims (31)
-
- 32. A library of synthetic polynucleotides, wherein said polynucleotides encode a plurality of antibody VKCDR3 amino acid sequences comprising about 1 to about 10 of the amino acids found at Kabat positions 89, 90, 91, 92, 93, 94, 95, 95A, 96, and 97, in selected VKCDR3 amino acid sequences derived from a particular IGKV or IGKJ germline sequence.
-
35. A library of synthetic polynucleotides, wherein said polynucleotides encode a plurality of unique antibody VKCDR3 amino acid sequences that are of at least about 80% identity to an amino acid sequence represented by the following formula:
-
[VK_Chassis]-[L3-VK]-[X]-[JK*], wherein;(i) VK_Chassis is an amino acid sequence selected from the group consisting of about Kabat amino acid 1 to about Kabat amino acid 88 encoded by IGKV1-05, IGKV1-06, IGKV1-08, IGKV1-09, IGKV1-12, IGKV1-13, IGKV1-16, IGKV1-17, IGKV1-27, IGKV1-33, IGKV1-37, IGKV1-39, IGKV1D-16, IGKV1D-17, IGKV1D-43, IGKV1D-8, IGKV2-24, IGKV2-28, IGKV2-29, IGKV2-30, IGKV2-40, IGKV2D-26, IGKV2D-29, IGKV2D-30, IGKV3-11, IGKV3-15, IGKV3-20, IGKV3D-07, IGKV3D-11, IGKV3D-20, IGKV4-1, IGKV5-2, IGKV6-21, and IGKV6D-41, or a sequence of at least about 80% identity to any of them; (ii) L3-VK is the portion of the VKCDR3 encoded by the IGKV gene segment; and (iii) X is any amino acid residue; and (iv) JK* is an amino acid sequence selected from the group consisting of amino acid sequences encoded by IGJK1, IGJK2, IGJK3, IGJK4, and IGJK5, wherein the first amino acid residue of each amino acid sequence is not present. - View Dependent Claims (36, 37)
-
-
38. A library of synthetic polynucleotides, wherein said polynucleotides encode a plurality of Vλ CDR3 amino acid sequences that are of at least about 80% identity to an amino acid sequence represented by the following formula:
-
[Vλ_Chassis]-[L3-Vλ]-[Jλ], wherein; (i) Vλ_Chassis is an amino acid sequence selected from the group consisting of about Kabat amino acid 1 to about Kabat amino acid 88 encoded by IGλV1-36, IGλV1-40, IGλV1-44, IGλV1-47, IGλV1-51, IGλV10-54, IGλV2-11, IGλV2-14, IGλV2-18, IGλV2-23, IGλV2-8, IGλV3-1, IGλV3-10, IGλV3-12, IGλV3-16, IGλV3-19, IGλV3-21, IGλV3-25, IGλV3-27, IGλV3-9, IGλV4-3, IGλV4-60, IGλV4-69, IGλV5-39, IGλV5-45, IGλV6-57, IGλV7-43, IGλV7-46, IGλV8-61, IGλV9-49, and IGλV10-54, or a sequence of at least about 80% identity to any of them; (ii) L3-Vλ is the portion of the Vλ CDR3 encoded by the IGλV segment; and (iii) Jλ is an amino acid sequence selected from the group consisting of amino acid sequences encoded by IGλJ1-01, IGλJ2-01, IGλJ3-01, IGλJ3-02, IGλJ6-01, IGλJ7-01, and IGλJ7-02, and wherein the first amino acid residue of each sequence may or may not be deleted. - View Dependent Claims (39)
-
48. A library of synthetic polynucleotides encoding a plurality of antibody CDRH3 amino acid sequences, wherein the percent occurrence within the central loop of the CDRH3 amino acid sequences of at least one of the following i-i+1 pairs in the library is within the ranges specified below:
-
Tyr-Tyr in an amount from about 2.5% to about 6.5%; Ser-Gly in an amount from about 2.5% to about 4.5%; Ser-Ser in an amount from about 2% to about 4%; Gly-Ser in an amount from about 1.5% to about 4%; Tyr-Ser in an amount from about 0.75% to about 2%; Tyr-Gly in an amount from about 0.75% to about 2%; and Ser-Tyr in an amount from about 0.75% to about 2%. - View Dependent Claims (49, 50, 51)
-
-
52. A library of synthetic polynucleotides encoding a plurality of antibody CDRH3 amino acid sequences, wherein the percent occurrence within the central loop of the CDRH3 amino acid sequences of at least one of the following i-i+2 pairs in the library is within the ranges specified below:
-
Tyr-Tyr in an amount from about 2.5% to about 4.5%; Gly-Tyr in an amount from about 2.5% to about 5.5%; Ser-Tyr in an amount from about 2% to about 4%; Tyr-Ser in an amount from about 1.75% to about 3.75%; Ser-Gly in an amount from about 2% to about 3.5%; Ser-Ser in an amount from about 1.5% to about 3%; Gly-Ser in an amount from about 1.5% to about 3%; and Tyr-Gly in an amount from about 1% to about 2%. - View Dependent Claims (53, 54, 55)
-
-
56. A library of synthetic polynucleotides encoding a plurality of antibody CDRH3 amino acid sequences, wherein the percent occurrence within the central loop of the CDRH3 amino acid sequences of at least one of the following i-i+3 pairs in the library is within the ranges specified below:
-
Gly-Tyr in an amount from about 2.5% to about 6.5%; Ser-Tyr in an amount from about 1% to about 5%; Tyr-Ser in an amount from about 2% to about 4%; Ser-Ser in an amount from about 1% to about 3%; Gly-Ser in an amount from about 2% to about 5%; and Tyr-Tyr in an amount from about 0.75% to about 2%. - View Dependent Claims (57, 58, 59)
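Claims 48, 52, and 56 all rest on the same measurement: how often a given residue pair occurs at positions i and i+k (k = 1, 2, or 3) within the central loop. One way such a percentage could be computed is sketched below. The sketch assumes the caller supplies sequences already trimmed to the central loop, because this section does not define where the central loop begins and ends. It also assumes the percentage is taken over all i to i+k position pairs in the input and treats the "about" bounds as hard inclusive limits. All names are hypothetical; the ranges in `CLAIM_48_RANGES` are transcribed from claim 48.

```python
from collections import Counter


def pair_percentages(loops, k):
    """Percent occurrence of each (residue at i, residue at i+k) pair, taken
    over all such position pairs in the supplied loop sequences."""
    counts = Counter()
    total = 0
    for loop in loops:
        for i in range(len(loop) - k):
            counts[(loop[i], loop[i + k])] += 1
            total += 1
    if total == 0:
        return {}
    return {pair: 100.0 * n / total for pair, n in counts.items()}


# i to i+1 ranges from claim 48, in percent, keyed by one-letter residue codes.
CLAIM_48_RANGES = {
    ("Y", "Y"): (2.5, 6.5), ("S", "G"): (2.5, 4.5), ("S", "S"): (2.0, 4.0),
    ("G", "S"): (1.5, 4.0), ("Y", "S"): (0.75, 2.0), ("Y", "G"): (0.75, 2.0),
    ("S", "Y"): (0.75, 2.0),
}


def pairs_in_range(percentages, ranges):
    """Return the pairs whose observed percentage lies inside its range."""
    return [pair for pair, (low, high) in ranges.items()
            if low <= percentages.get(pair, 0.0) <= high]


toy_loops = ["YYSG", "SGSY"]   # toy input, far too small to be representative
adjacent = pair_percentages(toy_loops, k=1)
matches = pairs_in_range(adjacent, CLAIM_48_RANGES)
```

Calling `pair_percentages` with `k=2` or `k=3` gives the quantities that would be compared against the ranges in claims 52 and 56; a real evaluation would need a full library rather than the toy input used here.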
-
-
61. A method of preparing a library of synthetic polynucleotides encoding a plurality of antibody CDRH3 amino acid sequences, the method comprising:
-
(i) providing polynucleotide sequences encoding; (a) one or more N1 amino acid sequences of 0 to about 3 amino acids, wherein each amino acid of the N1 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N1 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; (b) one or more human CDRH3 DH amino acid sequences, N- and C-terminal truncations thereof, or a sequence of at least about 80% identity to any of them; (c) one or more N2 amino acid sequences of 0 to about 3 amino acids, wherein each amino acid of the N1 amino acid sequence is among the 12 most frequently occurring amino acids at the corresponding position in N2 amino acid sequences of CDRH3 amino acid sequences that are functionally expressed by human B cells; and (d) one or more human CDRH3 H3-JH amino acid sequences, N-terminal truncations thereof, or a sequence of at least about 80% identity to any of them; and (ii) assembling the polynucleotide sequences to produce a library of synthetic polynucleotides encoding a plurality of human antibody CDRH3 amino acid sequences represented by the following formula;
[N1]-[DH]-[N2]-[H3-JH]. - View Dependent Claims (62, 63, 64, 65, 66)
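The assembly in step (ii) implies a simple upper bound on how many segment combinations a library of this form can contain. As a rough sketch, if N1 and N2 each have 0 to 3 residues with 12 choices per position, the number of possible segments is 1 + 12 + 12² + 12³. The code below works through that arithmetic. Treating "about 3" as exactly 3 is an assumption, the DH count of 200 is a hypothetical placeholder, and the H3-JH count of 32 is the number of sequences listed in claims 26 and 28. The function names are hypothetical.

```python
def count_n_segments(choices_per_position=12, max_len=3):
    """Number of distinct segments of length 0..max_len when each position
    offers choices_per_position residues (the empty segment counts once)."""
    return sum(choices_per_position ** length for length in range(max_len + 1))


def segment_combinations(n1_count, dh_count, n2_count, h3_jh_count):
    """Upper bound on library size: one choice per pool, multiplied together."""
    return n1_count * dh_count * n2_count * h3_jh_count


n_options = count_n_segments()
# 200 is a hypothetical DH pool size; 32 is the H3-JH list length in claims 26 and 28.
upper_bound = segment_combinations(n_options, 200, n_options, 32)
print(f"{n_options} N-segment options, up to {upper_bound:.2e} combinations")
```

This figure counts segment combinations, not unique amino acid sequences, so it is only an upper bound; distinct combinations can produce the same concatenated sequence.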
-
Specification