
System and method for determining matching patterns within gene expression data

  • US 7,428,554 B1
  • Filed: 05/20/2004
  • Issued: 09/23/2008
  • Est. Priority Date: 05/23/2000
  • Status: Expired due to Fees
First Claim

1. In a computer network, a method for determining patterns within gene expression data stored in a database containing biological data, the method comprising:

    (a) defining a plurality of sample nodes within the database, each sample node comprising a curated data set comprising a set of pre-formatted and pre-computed biological data obtained from at least one biological sample, wherein the plurality of sample nodes are organized in a hierarchical arrangement according to clinical relevance;

    (b) assigning a set of clinical attributes to each sample node, the set of clinical attributes including at least one taxonomy designation selected from the group consisting of tissues, diseases, medications and sample parameters;

    (c) providing a user interface for entry of a search query into the computer processor and displaying search results at a user interface;

    (d) prompting entry of the search query by requesting user selection of a search category from the group consisting of biological materials, biological material family, biological pathways, and sample set taxonomy, and wherein each sample node of the plurality of sample nodes is associated with a plurality of search categories;

    (e) searching the plurality of sample nodes for data responsive to the search query;

    (f) selecting one or more sample nodes containing the data responsive to the search query;

    (g) saving search results comprising the set of pre-formatted and pre-computed biological data responsive to the one or more selected sample nodes;

    (h) receiving a user interface selection of an algorithm for performing gene expression pattern matching for identifying genes or gene fragments within the one or more selected sample nodes that have similar gene expression patterns to a gene of interest, the algorithm comprising:

    (i) computing a plurality of pairwise comparisons between the gene of interest and the genes or gene fragments within the one or more sample nodes, wherein each comparison is encoded using a qualitative three-state encoding scheme, wherein up-regulation of gene expression in the gene of interest relative to the genes or gene fragments within the one or more sample nodes is assigned a first symbol, down-regulation of gene expression in the gene of interest relative to the genes or gene fragments within the one or more sample nodes is assigned a second symbol different from the first symbol and no change in gene expression in the gene of interest relative to the genes or gene fragments within the one or more sample nodes is assigned a third symbol different from the first and second symbols wherein the three-state encoding scheme comprises a non-quantitative indication of gene behavior;

    (ii) generating a three-by-three contingency matrix for each pairwise comparison using the three-state encoding scheme;

    (iii) determining a distance score for each pairwise comparison;

    (iv) generating a listing of lowest distance scores, wherein the lowest distance scores correspond to genes or gene fragments having the highest similarity to the gene of interest; and

    (i) generating an output display comprising the listing of genes or gene fragments having the lowest distance scores.
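
The pattern-matching sub-steps (i) through (iv) of step (h) can be illustrated with a minimal sketch. The claim does not specify how the three states are derived, which symbols are used, or how the distance score is computed from the contingency matrix, so everything concrete below is an assumption made for illustration: the symbols "+", "-" and "0", the log-ratio threshold, the reading in which each gene's profile across samples is encoded and then cross-tabulated against the gene of interest, the distance defined as one minus the fraction of samples on the matrix diagonal, and all function names (`encode`, `contingency`, `distance`, `rank_similar`). It is not the patented implementation.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical symbols for the three qualitative states.
UP, DOWN, SAME = "+", "-", "0"
STATES = (UP, DOWN, SAME)


def encode(log_ratios: Sequence[float], threshold: float = 1.0) -> List[str]:
    """Map each expression log-ratio to a qualitative state.

    The threshold is an assumed cutoff; the claim only requires that the
    encoding be non-quantitative.
    """
    return [
        UP if x >= threshold else DOWN if x <= -threshold else SAME
        for x in log_ratios
    ]


def contingency(a: Sequence[str], b: Sequence[str]) -> List[List[int]]:
    """Build a 3x3 table counting how often state a[i] co-occurs with b[i]."""
    index = {state: i for i, state in enumerate(STATES)}
    matrix = [[0] * 3 for _ in range(3)]
    for x, y in zip(a, b):
        matrix[index[x]][index[y]] += 1
    return matrix


def distance(matrix: List[List[int]]) -> float:
    """Assumed score: 1 minus the fraction of samples where both genes agree."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 1.0
    agree = sum(matrix[i][i] for i in range(3))
    return 1.0 - agree / total


def rank_similar(
    gene_of_interest: str,
    profiles: Dict[str, Sequence[float]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_n genes with the lowest distance to the gene of interest."""
    target = encode(profiles[gene_of_interest])
    scores = [
        (name, distance(contingency(target, encode(values))))
        for name, values in profiles.items()
        if name != gene_of_interest
    ]
    scores.sort(key=lambda pair: pair[1])  # lowest distance = most similar
    return scores[:top_n]
```

Calling `rank_similar("G0", profiles)` on a dictionary of per-gene log-ratio lists would yield the listing described in sub-step (iv), which the final step of the claim then displays. Other distance definitions over the same matrix (for example, ones that penalize opposite regulation more than a mismatch with "no change") would fit the claim language equally well.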
