Method of determining clonotypes and clonotype profiles
First Claim
1. A method of determining clonotypes of an immune repertoire, said method comprising the steps:
obtaining from a subject a sample comprising T-cells and/or B-cells;
spatially isolating on a solid surface individual molecules of recombined nucleic acids encoding T cell receptor molecules or immunoglobulin molecules from said T-cells and/or B-cells of said sample;
sequencing by synthesis using reversibly terminated labeled nucleotides said spatially isolated individual molecules of recombined nucleic acids to generate a set of sequence reads, said molecules of recombined nucleic acids each having a V region, an NDN region and a J region, wherein clonotypes are formed from said set of sequence reads by;
(a) constructing from sequence reads encompassing at least a portion of an NDN region a sequence tree having leaves representing candidate clonotypes, each leaf and its corresponding candidate clonotype having a frequency;
(b) coalescing with a highest frequency candidate clonotype any lesser frequency candidate clonotype whenever a lesser frequency of said lesser frequency candidate clonotype is below a predetermined frequency value and a sequence difference therebetween is below a predetermined difference value to form a clonotype having a sequence of said highest frequency candidate clonotype and having associated sequence reads summed from said highest frequency candidate clonotype and said lesser frequency candidate clonotype;
(c) removing leaves corresponding to said coalesced candidate clonotypes from said sequence tree; and
(d) repeating steps (b) and (c) until a highest frequency of a lesser frequency candidate clonotype is below a predetermined stopping value, thereby determining clonotypes from said sample.
Abstract
The invention is directed to methods for determining clonotypes and clonotype profiles in assays for analyzing immune repertoires by high throughput nucleic acid sequencing of somatically recombined immune molecules. In one aspect, the invention comprises generating a clonotype profile from an individual by generating sequence reads from a sample of recombined immune molecules; forming from the sequence reads a sequence tree representing candidate clonotypes, each having a frequency; and coalescing with a highest frequency candidate clonotype any lesser frequency candidate clonotype whenever such lesser frequency is below a predetermined value and a sequence difference therebetween is below a predetermined value, to form a clonotype. After such coalescence, the coalesced candidate clonotypes are removed from the sequence tree and the process is repeated. This approach permits rapid and efficient differentiation of candidate clonotypes with genuine sequence differences from those with experimental or measurement errors, such as sequencing errors.
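The coalescing loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `coalesce`, `max_freq`, `max_diff` and `stop_count` are hypothetical, the "predetermined frequency value" is assumed here to be a ratio relative to the dominant candidate, the sequence difference is assumed to be a Hamming distance over equal-length reads, and a plain counter stands in for the sequence tree.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Mismatch count; sequences of unequal length are treated as fully different."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def coalesce(reads, max_freq=0.05, max_diff=2, stop_count=2):
    """Greedy coalescence of candidate clonotypes (illustrative only).

    max_freq   -- assumed threshold on (lesser count / dominant count)
    max_diff   -- assumed threshold on Hamming distance
    stop_count -- assumed stopping value on the highest remaining count
    """
    candidates = Counter(reads)  # candidate clonotype -> read count
    clonotypes = {}
    while candidates:
        # Pick the highest frequency candidate still present.
        top, top_count = max(candidates.items(), key=lambda kv: (kv[1], kv[0]))
        if top_count < stop_count:
            break  # remaining candidates are too rare to seed a clonotype
        del candidates[top]
        total = top_count
        for seq, count in list(candidates.items()):
            # Merge rare, near-identical candidates into the dominant one.
            if count / top_count < max_freq and hamming(top, seq) < max_diff:
                total += count
                del candidates[seq]  # coalesced candidates are removed
        clonotypes[top] = total
    return clonotypes


reads = ["ACGT"] * 100 + ["ACGA"] * 2 + ["TTTT"] * 50 + ["GGGG"]
print(coalesce(reads))
```

In this toy input the two `ACGA` reads are absorbed into `ACGT` as likely sequencing errors, while `TTTT` is frequent enough to remain a separate clonotype.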
144 Citations
45 Claims
1. A method of determining clonotypes of an immune repertoire, said method comprising the steps:
obtaining from a subject a sample comprising T-cells and/or B-cells;
spatially isolating on a solid surface individual molecules of recombined nucleic acids encoding T cell receptor molecules or immunoglobulin molecules from said T-cells and/or B-cells of said sample;
sequencing by synthesis using reversibly terminated labeled nucleotides said spatially isolated individual molecules of recombined nucleic acids to generate a set of sequence reads, said molecules of recombined nucleic acids each having a V region, an NDN region and a J region, wherein clonotypes are formed from said set of sequence reads by;
(a) constructing from sequence reads encompassing at least a portion of an NDN region a sequence tree having leaves representing candidate clonotypes, each leaf and its corresponding candidate clonotype having a frequency;
(b) coalescing with a highest frequency candidate clonotype any lesser frequency candidate clonotype whenever a lesser frequency of said lesser frequency candidate clonotype is below a predetermined frequency value and a sequence difference therebetween is below a predetermined difference value to form a clonotype having a sequence of said highest frequency candidate clonotype and having associated sequence reads summed from said highest frequency candidate clonotype and said lesser frequency candidate clonotype;
(c) removing leaves corresponding to said coalesced candidate clonotypes from said sequence tree; and
(d) repeating steps (b) and (c) until a highest frequency of a lesser frequency candidate clonotype is below a predetermined stopping value, thereby determining clonotypes from said sample.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 30, 31, 32, 33, 34, 35
16. A method of determining clonotypes of an immune repertoire, said method comprising the steps:
obtaining from a subject a sample comprising T-cells and/or B-cells;
spatially isolating on a solid surface individual molecules of recombined nucleic acids encoding T cell receptor molecules or immunoglobulin molecules from said T-cells and/or B-cells;
sequencing by synthesis using reversibly terminated labeled nucleotides said spatially isolated individual molecules of recombined nucleic acids to generate a set of sequence reads, said molecules of recombined nucleic acids each having portions of a V region, an NDN region and a J region wherein clonotypes are formed from said set of sequence reads by;
(a) constructing from sequence reads encompassing portions of NDN regions a sequence tree having leaves representing candidate clonotypes, each leaf and its corresponding candidate clonotype having a frequency;
(b) selecting a highest frequency candidate clonotype and identifying all lesser frequency candidate clonotype having a sequence difference therewith less than a predetermined difference value to form a coalescence subset;
(c) coalescing with said highest frequency candidate clonotype any lesser frequency candidate clonotype in said coalescence subset whenever a lesser frequency of said lesser frequency candidate clonotype is below a predetermined frequency value to form a clonotype having a sequence of said highest frequency candidate clonotype and having associated sequence reads summed from said highest frequency candidate clonotype and said lesser frequency candidate clonotype;
(d) removing leaves corresponding to said coalesced candidate clonotypes from said sequence tree; and
(e) repeating steps (b) through (d) until clonotypes have been formed from all non-singleton lesser frequency candidate clonotypes, thereby determining clonotypes from said sample.
Dependent Claims: 17, 18, 19, 20, 21, 36, 37, 38, 39, 40
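Claim 16 separates the distance test from the frequency test by first forming a coalescence subset. The following Python sketch illustrates that two-stage structure under stated assumptions and is not the patented implementation. The names `coalesce_by_subset`, `seq_diff`, `max_diff` and `max_count` are hypothetical, the sequence difference is assumed to be a Hamming distance, the frequency value is assumed to be an absolute read count, and a dictionary of counts stands in for the leaves of the sequence tree.

```python
def seq_diff(a: str, b: str) -> int:
    """Assumed difference measure: Hamming distance for equal-length sequences."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def coalesce_by_subset(counts, max_diff=2, max_count=3):
    """Two-stage coalescence: distance filter first, then frequency filter.

    counts    -- mapping of candidate clonotype sequence -> read count
    max_diff  -- assumed difference threshold for the coalescence subset
    max_count -- assumed read-count threshold for merging a lesser candidate
    Returns (clonotypes, leftover singleton candidates).
    """
    remaining = dict(counts)
    clonotypes = {}
    # Repeat until only singleton candidates are left.
    while any(c > 1 for c in remaining.values()):
        # Step (b): select the highest frequency candidate ...
        top = max(remaining, key=lambda s: (remaining[s], s))
        top_count = remaining.pop(top)
        # ... and collect every lesser candidate close enough in sequence.
        subset = [s for s in remaining if seq_diff(top, s) < max_diff]
        # Step (c): merge only the subset members that are also rare enough.
        merged = [s for s in subset if remaining[s] < max_count]
        # Step (d): merged candidates are removed as their reads are summed.
        clonotypes[top] = top_count + sum(remaining.pop(s) for s in merged)
    return clonotypes, remaining


counts = {"ACGT": 100, "ACGA": 2, "ACGC": 40, "TTTT": 1}
print(coalesce_by_subset(counts))
```

Here `ACGC` falls into the coalescence subset of `ACGT` but is too frequent to merge, so it later forms its own clonotype, and the singleton `TTTT` is left unassigned.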
22. A method of generating a clonotype profile from an individual, said method comprising the steps of:
(a) spatially isolating on a solid surface individual molecules of recombined nucleic acids from a sample containing T-cells and/or B-cells of said individual, wherein said recombined nucleic acids encode T cell receptor molecules or immunoglobulin molecules;
(b) sequencing by synthesis using reversibly terminated labeled nucleotides said spatially isolated individual molecules to produce a plurality of sequence reads each having portions of a V region, an NDN region and a J region and forming from sequence reads encompassing an NDN region a sequence tree having leaves representing candidate clonotypes, each leaf and its corresponding candidate clonotype having a frequency;
(c) coalescing with a highest frequency candidate clonotype any lesser frequency candidate clonotype whenever a lesser frequency of said lesser frequency candidate clonotype is below a predetermined frequency value and a sequence difference therebetween is below a predetermined difference value to form a clonotype having a sequence of said highest frequency candidate clonotype and having associated sequence reads summed from said highest frequency candidate clonotype and said lesser frequency candidate clonotype and wherein such coalesced lesser frequency candidate clonotype is thereafter disregarded; and
(d) repeating step (c) until clonotypes have been formed from all non-singleton lesser frequency candidate clonotypes, thereby generating said clonotype profile.
Dependent Claims: 23, 24, 25, 26, 27, 28, 29, 41, 42, 43, 44, 45
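The sequence tree recited in step (b) can be pictured as a prefix tree whose leaves carry read counts. The following Python sketch is one possible illustration, not the structure disclosed in the specification. The names `Node`, `build_tree` and `leaves_within` are hypothetical, reads are assumed to be equal-length strings, and the sequence difference is assumed to be a count of mismatched positions.

```python
class Node:
    """One position in the prefix tree; `count` is non-zero only at leaves."""

    def __init__(self):
        self.children = {}  # base -> Node
        self.count = 0      # number of reads ending at this node


def build_tree(reads):
    """Insert every read so that each distinct sequence ends at one leaf."""
    root = Node()
    for read in reads:
        node = root
        for base in read:
            node = node.children.setdefault(base, Node())
        node.count += 1  # leaf frequency of this candidate clonotype
    return root


def leaves_within(root, query, max_mismatches):
    """Return sorted (sequence, count) leaves within max_mismatches of query.

    Branches are pruned as soon as the mismatch budget is exceeded, which is
    what makes a tree cheaper than comparing the query against every candidate.
    """
    found = []
    stack = [(root, 0, 0, "")]  # node, depth, mismatches so far, prefix
    while stack:
        node, depth, mismatches, prefix = stack.pop()
        if depth == len(query):
            if node.count:
                found.append((prefix, node.count))
            continue
        for base, child in node.children.items():
            m = mismatches + (base != query[depth])
            if m <= max_mismatches:
                stack.append((child, depth + 1, m, prefix + base))
    return sorted(found)


tree = build_tree(["ACGT"] * 3 + ["ACGA", "TTTT"])
print(leaves_within(tree, "ACGT", 1))
```

A lookup like this could supply the near neighbors of the highest frequency candidate before the frequency test in step (c) is applied.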
Specification