Comparative gene transcript analysis

US 6,114,114 A
Filed: 07/29/1994
Issued: 09/05/2000
Est. Priority Date: 07/17/1992
Status: Expired due to Term

Abstract

A method and system for quantifying the relative abundance of gene transcripts in a biological sample. One embodiment of the method generates high-throughput sequence-specific analysis of multiple RNAs or their corresponding cDNAs (gene transcript imaging analysis). Another embodiment of the method produces a gene transcript imaging analysis by the use of high-throughput cDNA sequence analysis. In addition, the gene transcript imaging can be used to detect or diagnose a particular biological state, disease, or condition which is correlated to the relative abundance of gene transcripts in a given cell or population of cells. The invention provides a method for comparing the gene transcript image analysis from two or more different biological samples in order to distinguish between the two samples and identify one or more genes which are differentially expressed between the two samples.

59 Citations

17 Claims

1. A method of quantifying relative abundance of mRNA in a biological sample, said method comprising the steps of:
- (a) isolating an mRNA population from the biological sample;
  
  (b) identifying gene transcripts by a sequence-specific method, which method comprises (i) making cDNA copies of the mRNA; and
  
  (ii) isolating a population of the cDNA copies and producing therefrom a first cDNA library, wherein a selected set of random primers was used in the generation of the first cDNA library;
  
  (c) determining a number of gene transcripts in the mRNA population that encode the same gene product;
  
  (d) processing in a programmed computer the number of gene transcripts that encode the same gene product to calculate a relative abundance of the transcripts within the population of gene transcripts, wherein said relative abundance is calculated by tabulating the number of gene transcripts that encode the same gene product to generate an abundance number and dividing the abundance number by the total number of gene transcripts in the mRNA population to obtain a calculated relative abundance number for each identified gene transcript; and
  
  (e) processing the calculated relative abundance of each gene transcript to generate a gene transcript image of the biological sample;
  
  wherein the gene transcript image provides a calculated relative abundance that is quantified for each gene transcript.
- - 2. The method of claim 1, further comprising:
    - (f) repeating steps (a) through (e) on a sample from a normal human tissue and on a sample from a diseased human tissue to produce a first set of reference gene transcript images from the normal human tissue and a second set of reference gene transcript images from the diseased human tissue;
      
      (g) storing the first and second sets of reference gene transcript images in the programmed computer; and
      
      (h) comparing the gene transcript image produced in step (e) of claim 1 with the first and second sets of reference gene transcript images to identify at least one of the reference gene transcript images which closely approximates that of the gene transcript image of the biological sample.
  - 3. The method of claim 2, wherein the biological sample is a biopsy, sputum, blood or urine sample.
  - 4. The method of claim 1, wherein the isolated mRNA population comprises at least 5,000 mRNA molecules.
  - 5. The method of claim 1, wherein the isolated mRNA population comprises at least 100,000 mRNA molecules.
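
The calculation recited in steps (d) and (e) of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each sequenced transcript has already been reduced to a gene identifier, and the function name `transcript_image` and the identifiers `gene_A`, `gene_B`, `gene_C` are hypothetical.

```python
from collections import Counter


def transcript_image(transcript_ids):
    """Tabulate transcripts per gene and divide by the total count.

    Returns {gene_id: (abundance_number, relative_abundance)}.
    Assumes every entry in transcript_ids is an already identified transcript.
    """
    counts = Counter(transcript_ids)      # abundance number per gene
    total = sum(counts.values())          # total transcripts in the population
    if total == 0:
        return {}
    return {gene: (n, n / total) for gene, n in counts.items()}


# Hypothetical identifiers standing in for identified transcripts.
sample = ["gene_A", "gene_A", "gene_B", "gene_A", "gene_C", "gene_B"]
image = transcript_image(sample)
for gene, (n, rel) in sorted(image.items()):
    print(f"{gene}\t{n}\t{rel:.3f}")
```

Because every abundance number is divided by the same total, the relative abundances in one image sum to 1, which is what makes images from libraries of different sizes comparable.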

6. A method of producing a gene transcript image analysis, said method comprising the steps of:
- (a) obtaining a mixture of mRNA;
  
  (b) making cDNA copies of the mRNA and isolating a representative population of the cDNA copies, wherein a selected set of random primers is used in the generation of the representative population;
  
  (c) inserting the representative population of cDNA copies into cells thereby producing clones;
  
  (d) isolating a population of clones, wherein the cDNA in the clones in the population is representative of mRNA sequences expressed in a sample;
  
  (e) identifying each clone in the population by a sequence-specific method;
  
  (f) determining the number of times each cDNA is represented within the population of clones;
  
  (g) processing in a programmed computer the number of times each cDNA is represented to calculate a relative abundance of expression of each mRNA; and
  
  (h) processing the relative abundance of expression of each mRNA to produce a gene transcript image for the population of clones, wherein said relative abundance is calculated by tabulating the number of mRNA transcripts that encode the same gene product to generate a set of abundance numbers and dividing each abundance number by a total number of mRNA transcripts in the mRNA population to obtain a calculated relative abundance number for each identified gene transcript;
  
  wherein the gene transcript image provides a calculated relative abundance that is quantified for each gene transcript.
- - 7. The method of claim 6, also including the step of diagnosing disease by:
    - repeating steps (a) through (h) on a normal sample from a normal human tissue and on a diseased sample from a diseased human tissue to produce a normal reference gene transcript image analysis from the normal human tissue and a diseased reference gene transcript image analysis from the diseased human tissue;
      
      storing said normal reference gene transcript image analysis and diseased reference gene transcript image analysis in a programmed computer;
      
      obtaining a patient sample from a human patient, and producing a gene transcript image analysis by performing steps (a) through (h) from the patient sample; and
      
      processing the transcript image analysis of the patient sample in the programmed computer to identify at least one of reference transcript image analysis which closely approximates the patient sample.
  - 8. The method of claim 6, wherein at least 5,000 cDNA clones are processed to calculate a relative abundance of expression of each gene.
  - 9. The method of claim 6, wherein at least 100,000 cDNA clones are processed to calculate a relative abundance of expression of each gene.
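
Claim 7 (like claim 2) calls for identifying the reference image that "closely approximates" the patient sample but does not name a similarity measure. The sketch below assumes, purely for illustration, the sum of absolute differences in relative abundance over all genes seen in either image. The function names, gene identifiers, and abundance values are hypothetical.

```python
def image_distance(image_a, image_b):
    """Sum of absolute differences in relative abundance (an assumed metric).

    Each image maps gene_id -> relative abundance; a gene missing from an
    image is treated as having abundance 0.
    """
    genes = set(image_a) | set(image_b)
    return sum(abs(image_a.get(g, 0.0) - image_b.get(g, 0.0)) for g in genes)


def closest_reference(patient_image, references):
    """Return the name of the stored reference image nearest to the patient image.

    Raises ValueError if no reference images are stored.
    """
    return min(references, key=lambda name: image_distance(patient_image, references[name]))


# Hypothetical stored reference images and a hypothetical patient image.
references = {
    "normal": {"gene_A": 0.50, "gene_B": 0.25, "gene_C": 0.25},
    "diseased": {"gene_A": 0.25, "gene_B": 0.50, "gene_D": 0.25},
}
patient = {"gene_A": 0.30, "gene_B": 0.50, "gene_D": 0.20}
best = closest_reference(patient, references)
print(best)
```

Any other distance or correlation measure could be substituted in `image_distance` without changing the surrounding logic.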

10. A computer system for quantifying the relative abundance of identified sequences in a library of nucleic acid or amino acid biological sequences, said system comprising:
- means for receiving and storing a set of said biological sequences, where each of the biological sequences is indicative of a different one of the biological sequences of a library of biological sequences prepared from a biological sample;
  
  processing means for calculating an identified sequence value for each biological sequence in the set of biological sequences, where each said identified sequence value is indicative of a degree of match between a biological sequence of the library and at least one biological sequence of a reference library of biological sequences;
  
  means for processing each said identified sequence value to calculate final data values indicative of a number of matches between the corresponding biological sequence and at least one biological sequence of the reference library;
  
  processing means for calculating a relative abundance of identified sequence values corresponding to the set of biological sequences, wherein said relative abundance is calculated by tabulating the number of identified sequence values corresponding to a selected set of identified sequences to generate a set of abundance numbers and dividing each abundance number in the set by a total number of biological sequences in the set of biological sequences to obtain a calculated relative abundance number for each identified sequence value;
  
  processing means for generating a gene transcript image of the biological sample by calculating the relative abundance of each identified sequence value; and
  
  means for displaying an abundance sort representing the biological sequences present in the library.
- - 11. The system of claim 10, wherein the biological sequences are cDNA, RNA or amino acid sequences.
  - 12. The computer system of claim 10, wherein the library of biological sequences received and stored by the system comprises at least 5,000 biological sequences.
  - 13. The computer system of claim 10, wherein the library of biological sequences received and stored by the system comprises at least 100,000 biological sequences.
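
The processing chain of claim 10 (degree of match, number of matches, relative abundance, abundance sort) can be sketched as below. The claim does not specify a matching algorithm, so this example substitutes Python's `difflib.SequenceMatcher` similarity ratio for a real sequence alignment tool. The threshold of 0.9, the function names, and the toy sequences are all assumptions.

```python
from collections import Counter
from difflib import SequenceMatcher


def best_match(seq, reference, threshold=0.9):
    """Return the best-matching reference name, or None if below the threshold.

    The similarity ratio is a stand-in for a real alignment score.
    """
    best_name, best_score = None, 0.0
    for name, ref_seq in reference.items():
        score = SequenceMatcher(None, seq, ref_seq).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def abundance_sort(library, reference, threshold=0.9):
    """Return (name, match_count, relative_abundance) rows, most abundant first."""
    total = len(library)          # denominator: all sequences in the set
    if total == 0:
        return []
    matches = Counter()
    for seq in library:
        name = best_match(seq, reference, threshold)
        if name is not None:
            matches[name] += 1
    rows = [(name, n, n / total) for name, n in matches.items()]
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows


# Toy sequences, invented for illustration only.
reference = {"ref_1": "ATGGCCAAGT", "ref_2": "TTGACCGGTA"}
library = ["ATGGCCAAGT", "ATGGCCAAGT", "TTGACCGGTA", "CCCCCCCCCC"]
rows = abundance_sort(library, reference)
for name, n, rel in rows:
    print(f"{name}\t{n}\t{rel:.2f}")
```

Unmatched sequences still count toward the denominator here, following the claim's wording that each abundance number is divided by the total number of biological sequences in the set.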

14. A computer system for performing analysis to determine the abundance of nucleic acid or amino acid biological sequences in a first library of biological sequences relative to a second library of biological sequences, said system comprising:
- means for receiving and storing a first set of biological sequences, where each of the biological sequences is indicative of a different one of the biological sequences of a first library of biological sequences;
  
  means for receiving and storing a second set of biological sequences, where each of the biological sequences is indicative of a different one of the biological sequences of a second library of biological sequences;
  
  processing means for calculating a first set of identified sequence values corresponding to the first set of biological sequences and a second set of identified sequence values corresponding to the second set of biological sequences, wherein each identified sequence value is indicative of a degree of match between a biological sequence of the corresponding first or second sets of biological sequences and at least one biological sequence of a reference library of biological sequences;
  
  means for processing each identified sequence value of said first and second sets of identified sequence values to calculate a first set of final data values and a second set of final data values, wherein each final data value is indicative of a number of matches between biological sequences of the corresponding first or second sets of biological sequences and at least one biological sequence of the reference library;
  
  processing means for calculating a first set of relative abundance numbers, wherein said first set of relative abundance is calculated by tabulating the number of identified sequences of a selected set of identified sequences corresponding to identified sequence values within the first set of identified sequence values to generate a first set of abundance numbers, and dividing each abundance number of the first set of abundance numbers by a total number of biological sequences in the first set of biological sequences to obtain a first set of calculated relative abundance numbers for each identified sequence value of the first set of identified sequence values;
  
  processing means for calculating a second set of relative abundance numbers, wherein said second set of relative abundance is calculated by tabulating the number of identified sequences of a selected set of identified sequences corresponding to identified sequence values within the second set of identified sequence values to generate a second set of abundance numbers, and dividing each abundance number of the second set of abundance numbers by a total number of biological sequences in the second set of biological sequences to obtain a second set of calculated relative abundance numbers for each identified sequence value of the second set of identified sequence values;
  
  processing means for identifying pairs of corresponding relative abundance numbers in the first and second sets of relative abundance numbers;
  
  processing means for generating a ratio value for each identified pair of corresponding relative abundance numbers, wherein the ratio value is calculated by dividing the first relative abundance number of the identified pair by the second relative abundance number of the identified pair; and
  
  means for sorting and displaying a list of ratio values;
  
  wherein the list of ratio values represents the abundance of biological sequences in the first set of biological sequences relative to the second set of biological sequences.
- - 15. The system of claim 14, wherein the biological sequences are cDNA, RNA or amino acid sequences.
  - 16. The computer system of claim 14, wherein each of the first and second libraries of biological sequences comprises at least 5,000 biological sequences.
  - 17. The computer system of claim 14, wherein each of the first and second libraries of biological sequences comprises at least 100,000 biological sequences.
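
The pairing, ratio, and sorting steps of claim 14 can be sketched as follows, assuming the two sets of relative abundance numbers have already been calculated. The function name and example values are hypothetical. Skipping sequences found in only one library is an assumption made here, since the claim speaks only of identified pairs.

```python
def abundance_ratios(first, second):
    """Return (name, ratio) pairs sorted from highest to lowest ratio.

    first and second map an identified sequence name to its relative
    abundance in the first and second library. Only names present in both
    libraries with a nonzero second abundance form a pair.
    """
    shared = set(first) & set(second)
    ratios = [
        (name, first[name] / second[name])   # first divided by second
        for name in shared
        if second[name] > 0
    ]
    ratios.sort(key=lambda r: (-r[1], r[0]))
    return ratios


# Hypothetical relative abundance numbers for two libraries.
first = {"gene_A": 0.5, "gene_B": 0.25, "gene_C": 0.25}
second = {"gene_A": 0.25, "gene_B": 0.5, "gene_D": 0.25}
ratios = abundance_ratios(first, second)
for name, ratio in ratios:
    print(f"{name}\t{ratio:.2f}")
```

A ratio above 1 indicates a sequence that is relatively more abundant in the first library, and a ratio below 1 one that is relatively more abundant in the second.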

Current Assignee: Incyte Pharmaceuticals, Inc.
Original Assignee: Incyte Pharmaceuticals, Inc.
Inventors: Seilhamer, Jeffrey J.; Scott, Randal W.
Primary Examiner(s): Martinell, James
Application Number: US08/282,955
Time in Patent Office: 2,230 Days
Field of Search: 435/6, 364/413.02
US Class Current: 435/6.14
CPC Class Codes

C12Q 1/68   involving nucleic acids

C12Q 1/6809   Methods for determination o...

G16B 20/00   ICT specially adapted for f...

G16B 30/00   ICT specially adapted for s...

G16B 30/10   Sequence alignment; Homolog...

G16B 30/20   Sequence assembly

G16B 35/00   ICT specially adapted for i...

G16B 35/10   Design of libraries

G16B 35/20   Screening of libraries

G16B 50/00   ICT programming tools or da...

G16C 20/60   In silico combinatorial che...
