Bambam: parallel comparative analysis of high-throughput sequencing data

  • US 9,652,587 B2
  • Filed: 05/25/2011
  • Issued: 05/16/2017
  • Est. Priority Date: 05/25/2010
  • Status: Active Grant
First Claim

1. A method of deriving a differential genetic sequence object, the method comprising:

    accessing a genetic database storing a first set of genetic sequence strings and associated reads representing a first tissue and a second set of genetic sequence strings and associated reads representing a second tissue, wherein the first set and the second set include genomic location information, wherein the accessing is executed by a hardware processor;

    aligning the first set of genetic sequence strings and the second set of genetic sequence strings using the genomic location information in at least one of the first set or the second set, the first set of genetic sequence strings and the second set of genetic sequence strings being analyzed against each other, wherein the aligning is executed by the hardware processor, the analyzing comprising:

    determining base probabilities of possible locations of sequence reads in the first and second genetic sequence strings as a function of error rates of at least one sequencer;

    identifying a difference between the first set and the second set of genetic sequence strings by comparing genotypes from the first and the second sets that, overlapping at a particular genomic position, maximize a likelihood probability function identifying the genotypes as being different and that are located at the particular genomic position, where the likelihood probability function operates as a probability distribution of a likelihood that unmapped sequence reads of both the first set, representing the first tissue, and the second set, representing the second tissue, align to possible junction sequences, modeled over the base probabilities and associated sequence reads;

    generating a local differential string that represents a difference between synchronized sub-strings of corresponding first and second sets of sequence strings within local alignment, based on the identifying the difference between the first set and the second set of genetic sequence strings by comparing the genotypes;

    updating a differential genetic sequence object in a differential sequence database with information according to the local differential string; and

    generating a patient specific clinical instruction based on information of the differential genetic sequence object.
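
As an informal illustration only, and not the claimed method itself, the genotype-comparison and local-differential steps recited above can be sketched in Python. The sketch assumes a uniform per-base sequencer error rate, diploid genotypes, and reads already grouped by genomic position; it also picks the most likely genotype for each tissue independently, which is a simplification of the likelihood function the claim describes. All names (`base_likelihood`, `genotype_log_likelihood`, `best_genotype`, `local_differences`) and the input layout are hypothetical.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"
# The 10 unordered diploid genotypes: AA, AC, AG, ..., TT.
GENOTYPES = list(combinations_with_replacement(BASES, 2))


def base_likelihood(observed, allele, error_rate):
    """P(observed base | true allele), assuming errors are uniform over the other 3 bases."""
    return 1.0 - error_rate if observed == allele else error_rate / 3.0


def genotype_log_likelihood(pileup, genotype, error_rate):
    """Log-likelihood of the bases observed at one position given a diploid genotype."""
    first_allele, second_allele = genotype
    total = 0.0
    for base in pileup:
        # Each read is assumed to come from either allele with probability 1/2.
        p = 0.5 * base_likelihood(base, first_allele, error_rate) \
            + 0.5 * base_likelihood(base, second_allele, error_rate)
        total += math.log(p)
    return total


def best_genotype(pileup, error_rate):
    """Genotype that maximizes the likelihood of the observed bases."""
    return max(GENOTYPES, key=lambda g: genotype_log_likelihood(pileup, g, error_rate))


def local_differences(first, second, error_rate=0.01):
    """Compare two tissues position by position.

    `first` and `second` map a genomic position to the string of bases
    observed there (a hypothetical, simplified pileup). Returns a list of
    (position, first_genotype, second_genotype) for positions covered by
    both tissues whose most likely genotypes differ.
    """
    diffs = []
    for pos in sorted(first.keys() & second.keys()):
        g1 = best_genotype(first[pos], error_rate)
        g2 = best_genotype(second[pos], error_rate)
        if g1 != g2:
            diffs.append((pos, "".join(g1), "".join(g2)))
    return diffs


# Toy data: position 101 is homozygous C in the first tissue and
# shows a mix of C and T in the second.
normal = {100: "AAAAAAAA", 101: "CCCCCCCC"}
tumor = {100: "AAAAAAAA", 101: "CCCCTTTT"}
print(local_differences(normal, tumor))
```

Only positions present in both inputs are compared, mirroring the claim's use of shared genomic location information to keep the two sets synchronized; the returned list plays the role of a very small "local differential string".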
