Databases of regulatory sequences; methods of making and using same
Abstract
Methods and compositions for the identification, isolation and characterization of regulatory DNA sequences in a cell of interest are provided. Also provided are libraries of regulatory sequences obtained according to the methods, and databases comprising collections of regulatory sequences for a particular cell of interest. In addition, various uses for the regulatory sequences so obtained, and uses for the databases of regulatory sequences, are provided. Also disclosed are computer systems and computer program products for utilizing the databases to conduct various genetic analyses, and uses of accessible regulatory sequences in the design of vectors bearing transgenes.
32 Citations
122 Claims
1. A method for isolating a collection of polynucleotide sequences corresponding to accessible regions of cellular chromatin, the method comprising:
(a) treating cellular chromatin with a probe, wherein the probe reacts with accessible regions of cellular chromatin;
(b) fragmenting the treated chromatin to produce unmarked polynucleotides and marked polynucleotides, wherein each marked polynucleotide comprises one or more sites that have reacted with the probe; and
(c) isolating the marked polynucleotides.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97
16. A method for isolating a collection of polynucleotides corresponding to accessible regions of cellular chromatin, the method comprising:
(a) treating cellular chromatin with a methylase to generate methylated chromatin;
(b) deproteinizing the methylated chromatin to form deproteinized chromatin;
(c) digesting the deproteinized chromatin with a methylation-dependent restriction enzyme to produce a collection of restriction fragments, wherein the collection comprises methylated polynucleotides and non-methylated polynucleotides; and
(d) isolating a collection of non-methylated polynucleotides;
whereby the termini of the non-methylated polynucleotides correspond to accessible regions of cellular chromatin.
21. A method for isolating a collection of polynucleotides corresponding to accessible regions of cellular chromatin, the method comprising:
(a) treating cellular chromatin with a methylase to generate methylated chromatin;
(b) deproteinizing the methylated chromatin to form deproteinized chromatin;
(c) digesting the deproteinized chromatin with a methylation-dependent restriction enzyme to produce a collection of restriction fragments, wherein the collection comprises methylated polynucleotides and non-methylated polynucleotides; and
(d) isolating a collection of methylated polynucleotides, whereby the methylated polynucleotides correspond to accessible regions of cellular chromatin.
Dependent claims: 22, 23, 24, 25, 26, 28, 29, 30
27. A method for isolating a collection of polynucleotides corresponding to accessible regions of cellular chromatin, the method comprising:
(a) treating cellular chromatin with a nuclease; and
(b) isolating a collection of polynucleotide fragments released by nuclease treatment;
wherein the released polynucleotide fragments are derived from accessible regions of cellular chromatin.
31. A method for isolating a collection of polynucleotides corresponding to regulatory regions of cellular chromatin, the method comprising:
(a) treating cellular chromatin with a methylation-sensitive enzyme that cleaves at unmethylated CpG sequences; and
(b) isolating a collection of short polynucleotide fragments released by enzyme treatment, wherein the polynucleotide fragments are derived from regulatory regions of cellular chromatin.
Dependent claims: 32, 33, 34, 35, 36, 37, 39, 40, 41
38. A method for isolating a collection of polynucleotides corresponding to regulatory regions of cellular chromatin, the method comprising:
(a) selectively cleaving AT-rich sequences of cellular DNA; and
(b) isolating a collection of large polynucleotide fragments released by the treatment;
wherein the large polynucleotide fragments comprise regulatory regions.
42. A method for isolating a collection of polynucleotides corresponding to regulatory regions of cellular chromatin, the method comprising:
(a) selectively cleaving AT-rich sequences in cellular DNA to form a mixture of methylated and unmethylated fragments enriched in CpG islands; and
(b) isolating the unmethylated fragments from the methylated fragments to obtain a collection of unmethylated fragments enriched in CpG islands, wherein the unmethylated fragments are derived from regulatory regions of cellular chromatin.
Dependent claims: 43, 44, 45, 46, 48, 49, 50, 51, 52, 53
47. A method for isolating a collection of polynucleotides corresponding to regulatory regions of cellular chromatin, the method comprising:
(a) fragmenting chromatin;
(b) contacting the fragments with an antibody that specifically binds to acetylated histones, thereby forming an immunoprecipitate enriched in polynucleotides corresponding to regulatory regions; and
(c) collecting the polynucleotides from the immunoprecipitate.
54. A method for mapping accessible regions of cellular chromatin relative to a gene of interest, the method comprising:
(a) reacting cellular chromatin with a probe to generate chromatin-associated DNA fragments, wherein the DNA fragments comprise, at their termini, sites of probe reaction which identify accessible regions of cellular chromatin;
(b) attaching an adapter polynucleotide to the termini generated by the probe to generate adapter-ligated fragments; and
(c) amplifying the adapter-ligated fragments in the presence of a first primer that is complementary to the adapter and a second primer that is complementary to a segment of the gene of interest to form one or more amplified products, wherein the size of an amplified product is a measure of the distance between the segment of the gene to which the second primer binds and a terminus generated by the probe;
thereby mapping accessible regions of cellular chromatin relative to the gene of interest.
Dependent claims: 55, 56, 57, 58, 60, 61, 62, 63, 64, 65
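By way of illustration only, the size relationship recited in step (c) of claim 54 can be expressed as simple arithmetic. The sketch below assumes that each amplified product consists of genomic sequence running from the 5' end of the gene-specific primer to the probe-generated terminus, followed by the full adapter. The function name `distance_to_terminus`, the 25 bp adapter length, and the product sizes are hypothetical and are not taken from the claims.

```python
def distance_to_terminus(product_size_bp: int, adapter_len_bp: int) -> int:
    """Estimate the distance (bp) from the 5' end of the gene-specific
    primer to the terminus generated by the probe.

    Assumption: the product is genomic sequence from the primer's 5' end
    to the probe-generated terminus, followed by the adapter.
    """
    if adapter_len_bp < 0:
        raise ValueError("adapter length cannot be negative")
    if product_size_bp <= adapter_len_bp:
        raise ValueError("product must be longer than the adapter")
    return product_size_bp - adapter_len_bp


# Hypothetical product sizes (e.g. read from a gel) and a hypothetical 25 bp adapter.
for size in (325, 850, 1400):
    print(size, "bp product ->", distance_to_terminus(size, 25), "bp from primer")
```

Under these assumptions, each distinct product size corresponds to one probe-generated terminus, so a ladder of products gives a set of candidate accessible sites on one side of the primer. The resolution of the estimate is limited by how precisely the product size can be measured.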
59. A method for generating a library of polynucleotides whose sequences correspond to accessible regions of cellular chromatin, the method comprising:
(a) reacting cellular chromatin with a probe to generate chromatin-associated DNA fragments, wherein the DNA fragments comprise, at their termini, sites of probe reaction which identify accessible regions of cellular chromatin;
(b) attaching a first adapter polynucleotide to the termini generated by the probe to generate adapter-ligated fragments;
(c) digesting the adapter-ligated fragments with a restriction enzyme to generate a population of digested fragments, wherein the population comprises digested fragments having a first end that comprises the first adapter and a second end formed via the activity of the restriction enzyme;
(d) contacting the digested fragments with a primer complementary to the first adapter under conditions wherein the primer is extended to generate a plurality of extension products, each comprising a first end that comprises the first adapter and a second end that can be attached to a second adapter polynucleotide;
(e) joining the second adapter to the second end of each of the plurality of extension products to form a plurality of modified fragments, each of which comprises the first and second adapters at its first and second end, respectively;
(f) amplifying the plurality of modified fragments in the presence of primers complementary to the sequences of the first and second adapters to generate a population of amplified products comprising sequences corresponding to accessible regions of cellular chromatin; and
(g) inserting the population of amplified products into a selected vector;
thereby generating a library of polynucleotides whose sequences correspond to accessible regions of cellular chromatin.
78. A method for analyzing polynucleotide sequences, comprising:
(a) providing a database comprising a plurality of polynucleotide sequences corresponding to accessible regions of cellular chromatin, wherein the polynucleotide sequences are organized in collections corresponding to accessible regions from different samples of cellular chromatin;
(b) selecting two or more of the collections for comparison;
(c) determining whether one or more of the collections being compared includes a common polynucleotide sequence;
(d) determining whether one or more of the collections being compared includes a unique polynucleotide sequence; and
(e) displaying the common or unique sequences.
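As a non-limiting illustration, steps (b) through (e) of claim 78 might be sketched as set operations over the selected collections. The sketch assumes that two sequences count as the same only when they match exactly after case normalization; it treats "common" as present in every selected collection and "unique" as present in exactly one. The function name `common_and_unique` and the sample data are hypothetical.

```python
from collections import Counter


def common_and_unique(collections):
    """Compare named collections of sequences.

    collections: dict mapping a collection name to an iterable of sequences.
    Returns (common, unique), where common is the set of sequences found in
    every collection and unique maps each name to the sequences found only
    in that collection. Matching is exact (assumption), ignoring case.
    """
    sets = {name: {s.upper() for s in seqs} for name, seqs in collections.items()}
    if not sets:
        return set(), {}
    common = set.intersection(*sets.values())
    # Count in how many collections each distinct sequence appears.
    counts = Counter(s for members in sets.values() for s in members)
    unique = {
        name: {s for s in members if counts[s] == 1}
        for name, members in sets.items()
    }
    return common, unique


# Hypothetical collections from two chromatin samples.
selected = {
    "liver": ["ACGTAC", "TTGACA", "GGCATC"],
    "kidney": ["ACGTAC", "CCATGG", "GGCATC"],
}
common, unique = common_and_unique(selected)
print("common:", sorted(common))
print("unique:", {name: sorted(seqs) for name, seqs in unique.items()})
```

Exact matching is the simplest possible criterion; a practical system would more likely compare sequences by alignment or by overlap of genomic coordinates, which this sketch does not attempt.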
98. A computer system for analyzing polynucleotide sequences, comprising:
(a) a memory;
(b) a system bus; and
(c) a processor programmed to:
(i) compare one or more polynucleotide sequences from each of a plurality of collections of polynucleotide sequences, wherein each collection comprises a plurality of polynucleotide sequences corresponding to accessible regions of cellular chromatin, different collections comprising polynucleotide sequences that correspond to accessible regions for different samples of cellular chromatin;
(ii) identify one or more polynucleotides unique or common to at least one of the plurality of collections; and
(iii) display the identified polynucleotide sequence(s).
99. A computer system, comprising:
(a) a database comprising sequence records that include an identifier that identifies one or more projects to which each of the sequence records belong, each of the projects comprising (i) comparing a plurality of polynucleotide sequences from each of a plurality of collections of polynucleotide sequences, wherein each collection comprises a plurality of polynucleotide sequences corresponding to accessible regions of cellular chromatin, different collections comprising polynucleotide sequences that correspond to accessible regions for different samples of cellular chromatin, and (ii) identifying one or more polynucleotide sequences that are unique or common to at least one of the plurality of collections; and
(b) a user interface that permits a user to selectively view information concerning the one or more projects.
Dependent claims: 100, 101, 102, 103, 104
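One possible arrangement of the sequence records and project identifiers of claim 99 is sketched below using an in-memory SQLite database. The table names, column names, project identifiers, and the helper `records_for_project` are hypothetical; the claim does not prescribe any particular schema or storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequence_record (
        id         INTEGER PRIMARY KEY,
        collection TEXT NOT NULL,   -- chromatin sample the sequence came from
        sequence   TEXT NOT NULL
    );
    -- A record may belong to one or more projects, so use a link table.
    CREATE TABLE record_project (
        record_id  INTEGER NOT NULL REFERENCES sequence_record(id),
        project_id TEXT NOT NULL,
        PRIMARY KEY (record_id, project_id)
    );
    """
)
conn.executemany(
    "INSERT INTO sequence_record VALUES (?, ?, ?)",
    [(1, "liver", "ACGTAC"), (2, "kidney", "ACGTAC"), (3, "kidney", "CCATGG")],
)
conn.executemany(
    "INSERT INTO record_project VALUES (?, ?)",
    [(1, "P1"), (2, "P1"), (2, "P2"), (3, "P2")],
)


def records_for_project(conn, project_id):
    """Return (id, collection, sequence) rows assigned to one project."""
    rows = conn.execute(
        """
        SELECT r.id, r.collection, r.sequence
        FROM sequence_record AS r
        JOIN record_project AS p ON p.record_id = r.id
        WHERE p.project_id = ?
        ORDER BY r.id
        """,
        (project_id,),
    )
    return rows.fetchall()


print(records_for_project(conn, "P1"))
```

The link table lets a single sequence record carry several project identifiers, which is one way to read "one or more projects to which each of the sequence records belong". A user interface would call something like `records_for_project` to show only the projects a user selects.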
105. A computer system, comprising:
(a) a database comprising sequence records that include an identifier that identifies one or more projects to which each of the sequence records belong, each of the projects comprising (i) comparing a plurality of polynucleotide sequences from each of a plurality of collections of polynucleotide sequences, wherein each collection comprises a plurality of polynucleotide sequences corresponding to accessible regions of cellular chromatin, different collections comprising polynucleotide sequences that correspond to accessible regions for different samples of cellular chromatin, and (ii) identifying one or more polynucleotide sequences unique or common to at least some of the plurality of collections; and
(b) a user interface that (i) permits a user to input identifying information specifying which of the polynucleotide sequences of the plurality of collections are to be compared; and
(ii) displays the identified polynucleotide(s).
106. A computer system for analyzing polynucleotide sequences, comprising:
(a) a memory;
(b) a system bus; and
(c) a processor operatively disposed to (i) compare a collection of polynucleotide sequences corresponding to accessible regions of cellular chromatin in a sample with one or more known sequences to assess sequence similarity between one or more of the polynucleotide sequences within the collection and the one or more known sequences; and
(ii) display information concerning the sequence similarity between the one or more of the polynucleotide sequences within the collection and the one or more known sequences.
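The comparison recited in element (c) of claim 106 could, purely as an example, be sketched with the Python standard library. The claim does not name a similarity measure; the sketch substitutes the ratio computed by `difflib.SequenceMatcher`, a general-purpose string comparison, which is an assumption made for brevity and is not a sequence alignment method. The function name `similarity_report`, the 0.8 threshold, and the sample sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_report(collection, known, threshold=0.8):
    """Score every collection sequence against every known sequence.

    collection, known: dicts mapping a name to a sequence string.
    Returns (collection_name, known_name, score) tuples with a score at or
    above the threshold, best matches first. The score is the difflib
    ratio in [0, 1], used here as a stand-in similarity measure.
    """
    report = []
    for name, seq in collection.items():
        for known_name, known_seq in known.items():
            matcher = SequenceMatcher(None, seq.upper(), known_seq.upper(), autojunk=False)
            score = matcher.ratio()
            if score >= threshold:
                report.append((name, known_name, round(score, 3)))
    return sorted(report, key=lambda row: row[2], reverse=True)


# Hypothetical accessible-region sequences and hypothetical known sequences.
accessible = {"region_1": "ACGTACGT"}
known_sequences = {"known_a": "ACGTACGT", "known_b": "TTTTTTTT"}
for row in similarity_report(accessible, known_sequences):
    print(row)
```

Displaying the returned tuples corresponds to element (ii) of the claim. For real sequence data, a dedicated alignment tool would be the more usual choice than a generic string ratio.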
107. A computer system, comprising:
(a) a database comprising sequence records that include an identifier that identifies one or more projects to which each of the sequence records belong, each of the projects comprising comparing a collection of polynucleotide sequences corresponding to accessible regions of cellular chromatin in a sample with one or more known sequences to assess sequence similarity between one or more polynucleotide sequences within the collection and the one or more known sequences; and
(b) a user interface that permits a user to selectively view information concerning the one or more projects.
Dependent claims: 108, 109
110. A computer system, comprising:
(a) a database comprising sequence records that include an identifier that identifies one or more projects to which each of the sequence records belong, each of the projects comprising comparing a collection of polynucleotide sequences corresponding to accessible regions of cellular chromatin in a sample with one or more known sequences to assess sequence similarity between one or more polynucleotide sequences within the collection and the one or more known sequences; and
(b) a user interface that (i) permits a user to input identifying information specifying which of the polynucleotide sequences within the collections are to be compared; and
(ii) displays information regarding sequence similarity between the one or more polynucleotide sequences and the one or more known sequences.
111. A computer-readable medium comprising program instructions for analyzing polynucleotide sequences by performing the following:
(a) providing or receiving a plurality of collections of polynucleotide sequences, each collection comprising a plurality of polynucleotide sequences corresponding to accessible regions of cellular chromatin, different collections comprising accessible regions for different samples of cellular chromatin;
(b) identifying one or more polynucleotide sequences that are unique to at least one of the plurality of collections;
(c) identifying one or more polynucleotide sequences that are common to two or more of the plurality of collections; and
(d) displaying information concerning any identified polynucleotide sequence(s).
Dependent claims: 112, 113, 114, 115, 116, 117, 122
118. A computer-readable medium comprising program instructions for:
(a) determining sequence similarity between a database of polynucleotide sequences that correspond to accessible regions of cellular chromatin in a sample and one or more known sequences; and
(b) displaying information concerning the sequence similarity as determined in step (a).
119. A computer program product comprising a computer-useable medium and computer-readable program code encoded within the computer-useable medium, wherein the computer-readable program code
(a) comprises a database having a plurality of sequence records, wherein one or more of the sequence records include an identifier assigning that sequence record to one or more projects, wherein each project is based on determining whether the sequence record includes common and unique polynucleotide sequences corresponding to accessible regions of cellular chromatin; and
(b) effects the following steps with a computer system (i) providing an interface that permits a user to query one or more projects;
(ii) locating sequence data corresponding to the query; and
(iii) displaying the sequence data corresponding to the query.
120. A computer program product comprising a computer-useable medium and computer-readable program code encoded within the computer-useable medium, wherein the computer-readable program code
(a) comprises a database comprising a plurality of collections of polynucleotide sequences corresponding to accessible regions of cellular chromatin, different collections comprising accessible regions for different samples of cellular chromatin; and
(b) effects the following steps with a computer system (i) identifying sequences that are unique between collections selected by a user;
(ii) identifying sequences that are common between collections selected by a user; and
(iii) displaying common and unique sequences.
121. A computer program product comprising a computer-useable medium and computer-readable program code encoded within the computer-useable medium, wherein the computer-readable program code
(a) comprises a database comprising a collection of polynucleotide sequences corresponding to accessible regions of cellular chromatin in a chromatin sample; and
(b) effects the following steps with a computer system (i) determining sequence similarity between two or more polynucleotide sequences selected by a user as compared to one or more known sequences; and
(ii) displaying the sequence similarity between the selected polynucleotides and known sequences.
Specification