Sequence-centric scientific information management

  • US 9,183,349 B2
  • Filed: 06/08/2010
  • Issued: 11/10/2015
  • Est. Priority Date: 12/16/2005
  • Status: Active Grant
First Claim

1. A computer-implemented method of integrating a sequence-centric feature set into a knowledge base on a storage device comprising sequence-centric feature sets and gene-centric feature sets, the method comprising:

    receiving by one or more processors of a computer system a sequence-centric feature set provided by a user, wherein the sequence-centric feature set comprises a plurality of sequence regions and associated statistics, wherein the plurality of sequence regions comprise one or more SNPs, methylated regions, or genomic variations;

    mapping by the one or more processors of the computer system the plurality of sequence regions to genes within the knowledge base to provide a set of mapped genes for the received sequence-centric feature set, wherein the plurality of sequence regions and the genes within the knowledge base are related by genomic coordinate, physical proximity, haplotype, function, or phenotype;

    mapping by the one or more processors of the computer system the plurality of sequence regions to other sequence regions within the knowledge base to provide a set of mapped sequence regions for the received sequence-centric feature set, wherein the plurality of sequence regions and the genes within the knowledge base are related by genomic coordinate, physical proximity, haplotype, function, or phenotype;

    providing ranks of the set of mapped sequence regions in the received sequence-centric feature set and in other sequence-centric feature sets in the knowledge base, wherein the other sequence-centric feature sets comprise a plurality of sequence regions and associated statistics;

    performing by the one or more processors of the computer system iterative rank based processes to calculate sequence-sequence scores indicating correlations between the received sequence-centric feature set and other sequence-centric feature sets in the knowledge base using the ranks of the set of mapped sequence regions;

    providing ranks of the set of mapped genes in the received sequence-centric feature set and in the gene-centric feature sets in the knowledge base, wherein the gene-centric feature sets comprise one or more of genes ranked by activity and microarray-based gene expression data;

    performing by one or more processors of the computer system iterative rank based processes to calculate sequence-gene scores indicating the correlations between the received sequence-centric feature set and the gene-centric feature sets using the ranks of the set of mapped genes;

    storing the received sequence-centric feature set, the sequence-sequence scores, and the sequence-gene scores on the storage device;

    receiving a query sequence region or a query gene as a query input; and

    displaying information based on one or more sequence-sequence scores or one or more sequence-gene scores that correspond to the query sequence region or the query gene.
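
The mapping and rank-based scoring steps recited above can be illustrated with a minimal sketch. It is not the patented method: the claim does not specify the "iterative rank based processes", so a plain Spearman rank correlation is swapped in as a stand-in, and only the "genomic coordinate" and "physical proximity" relations are modeled. The names `Interval`, `map_regions_to_genes`, `rank_by_statistic`, `rank_correlation`, and the `window` parameter are all hypothetical, and coordinates are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval with inclusive coordinates (hypothetical type)."""
    name: str
    chrom: str
    start: int
    end: int


def map_regions_to_genes(regions, genes, window=0):
    """Map each sequence region to the genes it overlaps or lies near.

    A gene is a hit when it is on the same chromosome and its span,
    padded by `window` base pairs on each side, overlaps the region.
    """
    mapping = {}
    for region in regions:
        mapping[region.name] = [
            gene.name
            for gene in genes
            if gene.chrom == region.chrom
            and region.start <= gene.end + window
            and region.end >= gene.start - window
        ]
    return mapping


def rank_by_statistic(stats):
    """Rank names by descending statistic (1 = strongest); ties break by name."""
    ordered = sorted(stats, key=lambda name: (-stats[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def _rerank(names, ranks):
    """Re-rank a subset of names from 1..n, preserving their relative order."""
    ordered = sorted(names, key=lambda name: (ranks[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def rank_correlation(ranks_a, ranks_b):
    """Spearman rank correlation over the items two feature sets share.

    Stand-in for the unspecified scoring process in the claim.
    Returns 0.0 when fewer than two items are shared.
    """
    shared = set(ranks_a) & set(ranks_b)
    n = len(shared)
    if n < 2:
        return 0.0
    local_a = _rerank(shared, ranks_a)
    local_b = _rerank(shared, ranks_b)
    squared_diffs = sum((local_a[k] - local_b[k]) ** 2 for k in shared)
    return 1.0 - 6.0 * squared_diffs / (n * (n * n - 1))
```

In this sketch a score near 1 means the two feature sets order their shared items similarly, and a score near -1 means they order them oppositely. A score between a received set and each stored set could then be saved and looked up by query region or query gene.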
