Techniques for facilitating identification of candidate genes

  • US 6,470,277 B1
  • Filed: 07/28/2000
  • Issued: 10/22/2002
  • Est. Priority Date: 07/30/1999
  • Status: Expired due to Fees
First Claim

1. In a computer system, a method of identifying candidate genes from a plurality of DNA sequences, the method comprising:

    obtaining results of a homology search for a first plurality of DNA sequences, the homology search results comprising information about homologs of the first plurality of DNA sequences;

    obtaining annotative information for the first plurality of DNA sequences, the annotative information comprising information about biochemical functions and physiological roles of the first plurality of DNA sequences, wherein obtaining the annotative information comprises:

    identifying one or more known genes from the first plurality of DNA sequences based on the homology search results, wherein a DNA sequence from the first plurality of DNA sequences is identified as a known gene if a sequence identity of the DNA sequence to a sequence stored in a first database of sequences used for the homology search is at least equal to a first threshold value;

    accessing one or more information sources storing annotative information for DNA sequences;

    extracting annotative information from the one or more information sources for the known genes, the extracted annotative information comprising information about one or more biochemical functions and physiological roles of each known gene; and

    assigning a reference score to the extracted annotative information for each known gene based on the level of acceptance of the roles or functions of the known gene as described by the annotative information such that annotative information with a high level of acceptance is assigned a higher reference score than annotative information with a low level of acceptance;

    obtaining gene expression profile data for the first plurality of DNA sequences, the gene expression profile data describing behavioral patterns of the first plurality of DNA sequences;

    clustering the first plurality of DNA sequences based on the behavioral patterns of the first plurality of DNA sequences as described by the gene expression profile data;

    storing the results of the homology search, the annotative information, the reference score assigned to the extracted annotative information for each known gene, the gene expression profile data, and results from clustering the first plurality of DNA sequences in a database;

    receiving a query identifying criteria for the candidate genes; and

    searching the database, in response to the query, to identify a set of DNA sequences from the first plurality of DNA sequences which satisfy the query criteria.
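
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not name a threshold value, a scoring scale, a clustering algorithm, a database schema, or a query form, so everything here is an assumption made for illustration: the 0.90 identity threshold, the `ACCEPTANCE_SCORES` scale, the greedy correlation-based clustering, the single `seqs` table, the "unannotated sequences that cluster with an annotated gene" query in `find_candidates`, and all sample identifiers and values.

```python
import json
import sqlite3
from dataclasses import dataclass

# Every name, threshold, and record below is hypothetical.
IDENTITY_THRESHOLD = 0.90  # stands in for the claim's "first threshold value"
ACCEPTANCE_SCORES = {"established": 3, "reported": 2, "predicted": 1}


@dataclass
class Hit:
    seq_id: str
    homolog: str
    identity: float  # sequence identity as a fraction in [0, 1]


def best_hits(hits):
    """Keep the highest-identity homolog for each query sequence."""
    best = {}
    for h in hits:
        if h.seq_id not in best or h.identity > best[h.seq_id].identity:
            best[h.seq_id] = h
    return best


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va and vb else 0.0


def cluster_by_profile(profiles, min_corr=0.9):
    """Greedy clustering: join the first cluster whose seed profile correlates."""
    seeds, assignment = [], {}
    for seq_id, profile in profiles.items():
        for cid, seed in enumerate(seeds):
            if pearson(profile, seed) >= min_corr:
                assignment[seq_id] = cid
                break
        else:  # no existing cluster matched, so this profile seeds a new one
            assignment[seq_id] = len(seeds)
            seeds.append(profile)
    return assignment


def build_db(hits, annotations, profiles):
    """Store homology results, annotation, scores, profiles, and clusters."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE seqs (seq_id TEXT PRIMARY KEY, homolog TEXT, identity REAL,"
        " function TEXT, ref_score INTEGER, profile TEXT, cluster INTEGER)"
    )
    best = best_hits(hits)
    clusters = cluster_by_profile(profiles)
    for seq_id, profile in profiles.items():
        hit = best.get(seq_id)
        function = score = None
        # A sequence counts as a known gene only at or above the threshold.
        known = hit is not None and hit.identity >= IDENTITY_THRESHOLD
        if known and hit.homolog in annotations:
            function, acceptance = annotations[hit.homolog]
            score = ACCEPTANCE_SCORES[acceptance]
        conn.execute(
            "INSERT INTO seqs VALUES (?, ?, ?, ?, ?, ?, ?)",
            (seq_id, hit.homolog if hit else None, hit.identity if hit else None,
             function, score, json.dumps(profile), clusters[seq_id]),
        )
    return conn


def find_candidates(conn, function, min_score):
    """Unannotated sequences sharing a cluster with a gene of the given function."""
    rows = conn.execute(
        "SELECT DISTINCT u.seq_id FROM seqs u JOIN seqs k ON u.cluster = k.cluster"
        " WHERE u.ref_score IS NULL AND k.function = ? AND k.ref_score >= ?"
        " ORDER BY u.seq_id",
        (function, min_score),
    )
    return [r[0] for r in rows]


# Made-up sample data: homology hits, annotation sources, expression profiles.
hits = [Hit("s1", "geneA", 0.97), Hit("s2", "geneB", 0.60), Hit("s3", "geneC", 0.95)]
annotations = {"geneA": ("kinase", "established"), "geneC": ("transporter", "predicted")}
profiles = {"s1": [1, 2, 3, 4], "s2": [2, 4, 6, 8.5],
            "s3": [4, 3, 2, 1], "s4": [1.0, 2.1, 2.9, 4.2]}

conn = build_db(hits, annotations, profiles)
print(find_candidates(conn, "kinase", 3))  # ['s2', 's4']
```

In this sketch a sequence with no reference score is treated as uncharacterized, so the query returns sequences whose expression pattern tracks that of a well-accepted known gene. A production system would more plausibly use an established clustering method and a normalized schema; the single table is only meant to keep the flow from the claim visible in one place.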
