Methods and systems for detecting sequence variants

US 9,116,866 B2
Filed: 09/30/2013
Issued: 08/25/2015
Est. Priority Date: 08/21/2013
Status: Active Grant

12 Assignments

0 Petitions

Abstract

The invention provides methods for identifying rare variants near a structural variation in a genetic sequence, for example, in a nucleic acid sample taken from a subject. The invention additionally includes methods for aligning reads (e.g., nucleic acid reads) to a reference sequence construct accounting for the structural variation, methods for building a reference sequence construct accounting for the structural variation or the structural variation and the rare variant, and systems that use the alignment methods to identify rare variants. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long, or longer.
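One way to picture "building a reference sequence construct accounting for the structural variation" is to split a linear reference into a graph with one two-way branch per variant. The sketch below is only an illustration under stated assumptions, not the patented implementation: the function names `build_construct` and `spell_paths`, the `(start, end, alt)` variant tuples, and the toy sequences are all hypothetical.

```python
def build_construct(reference, variants):
    """Split a linear reference into a DAG with one branch point per variant.

    variants: (start, end, alt) tuples, 0-based half-open, sorted and
    non-overlapping (an assumed input format). On the alternative branch,
    reference[start:end] is replaced by alt; alt == "" models a deletion.
    Returns (nodes, edges): node id -> sequence, node id -> successor ids.
    """
    nodes, edges = {}, {}

    def add(seq):
        node_id = len(nodes)
        nodes[node_id] = seq
        edges[node_id] = []
        return node_id

    previous = [add("")]          # empty source node, id 0
    cursor = 0
    for start, end, alt in variants:
        flank = add(reference[cursor:start])
        for p in previous:
            edges[p].append(flank)
        ref_branch = add(reference[start:end])   # reference allele
        alt_branch = add(alt)                    # alternative allele
        edges[flank] += [ref_branch, alt_branch]
        previous = [ref_branch, alt_branch]
        cursor = end
    tail = add(reference[cursor:])
    for p in previous:
        edges[p].append(tail)
    return nodes, edges


def spell_paths(nodes, edges, node_id=0):
    """Yield the sequence spelled by every path from node_id to the sink."""
    if not edges[node_id]:
        yield nodes[node_id]
        return
    for successor in edges[node_id]:
        for rest in spell_paths(nodes, edges, successor):
            yield nodes[node_id] + rest


# Toy example: a 6-base deletion stands in for a structural variation.
nodes, edges = build_construct("AAACCCGGGTTT", [(3, 9, "")])
for sequence in spell_paths(nodes, edges):
    print(sequence)
```

Every path through the graph spells one possible version of the region, so a read from a sample carrying the deletion can match the deletion path directly instead of being forced onto the linear reference.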

84 Citations

15 Claims

1. A method for identifying a mutation in proximity to a structural variation in a sequence, the method comprising the steps of:
- obtaining a plurality of nucleic acid sequence reads, wherein at least one nucleic acid read comprises a mutation;
  
  comparing said reads to a reference sequence construct, wherein said reference sequence construct is stored in computer memory as a directed acyclic graph comprising at least two alternative sequences at a position in the reference sequence construct, one of which is a structural variation, scoring sequence overlaps for each nucleic acid read against the reference sequence construct;
  
  aligning each read to a location on the construct such that the score for each read is maximized; and
  
  identifying the mutation as being aligned within 100 bp or fewer of the structural variation.
- Dependent claims:
  - 2. The method of claim 1, further comprising assembling the nucleic acid reads to each other based upon the alignment of the nucleic acid reads with respect to the reference sequence construct.
  - 3. The method of claim 1, wherein the structural variation is at least 100 bp long.
  - 4. The method of claim 1, wherein the reference sequence construct further comprises at least two additional alternative sequences at a second position in the reference construct, and one of the additional alternative sequences comprises a sequence matching the mutation.
  - 5. The method of claim 4, wherein the first and second positions are separated by 100 bp or fewer.
  - 6. The method of claim 1, wherein the reference sequence construct further comprises at least two additional alternative sequences at a second position in the reference construct, and neither of the additional alternative sequences comprises a sequence matching the mutation.
  - 7. The method of claim 6, wherein the first and second positions are separated by 100 bp or fewer.
  - 8. The method of claim 1, wherein the structural variation is 1 kilobase to 3 megabases in length.
  - 9. The method of claim 1, wherein the mutation was not previously identified in a variant call format (VCF) file.
  - 10. The method of claim 1, wherein the mutation was previously identified in a variant call format (VCF) file.
  - 11. The method of claim 1, wherein the structural variation is selected from the group consisting of deletions, duplications, copy-number variations, insertions, inversions, and translocations.
  - 12. The method of claim 1, wherein the mutation is selected from the group consisting of a deletion, a duplication, an inversion, an insertion, and a single nucleotide polymorphism.
  - 13. The method of claim 1, wherein the mutation does not comprise a sequence matching the reference construct.
  - 14. The method of claim 1, wherein the reference sequence construct comprises a genome of an organism.
  - 15. The method of claim 1, wherein the reference sequence comprises a chromosome of an organism.
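The steps of claim 1 (compare reads to a graph-shaped reference, score overlaps, keep the highest-scoring placement, then check how close a mismatch lies to the structural variation) can be illustrated with a small brute-force sketch. This is not the alignment algorithm of the patent: it enumerates every path and uses a simple ungapped match/mismatch score as a stand-in, which is only practical for toy graphs. The `Node` class, the function names, the scoring values, and the sequences are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    seq: str
    is_sv: bool = False                        # marks the structural-variation branch
    next: list = field(default_factory=list)   # ids of successor nodes


def paths(graph, node_id, end_id):
    """Yield every path (list of node ids) from node_id to end_id."""
    if node_id == end_id:
        yield [node_id]
        return
    for successor in graph[node_id].next:
        for rest in paths(graph, successor, end_id):
            yield [node_id] + rest


def best_alignment(graph, start_id, end_id, read, match=1, mismatch=-1):
    """Return (score, path, offset) with the highest ungapped score."""
    best = None
    for path in paths(graph, start_id, end_id):
        seq = "".join(graph[n].seq for n in path)
        for offset in range(len(seq) - len(read) + 1):
            window = seq[offset:offset + len(read)]
            score = sum(match if a == b else mismatch
                        for a, b in zip(read, window))
            if best is None or score > best[0]:
                best = (score, path, offset)
    return best


def mutations_near_sv(graph, path, offset, read, max_dist=100):
    """List (path_position, path_base, read_base) mismatches within max_dist of an SV node."""
    seq = "".join(graph[n].seq for n in path)
    sv_spans, position = [], 0
    for n in path:
        if graph[n].is_sv:
            sv_spans.append((position, position + len(graph[n].seq)))
        position += len(graph[n].seq)
    hits = []
    for i, base in enumerate(read):
        p = offset + i
        if seq[p] == base:
            continue
        for start, end in sv_spans:
            dist = 0 if start <= p < end else min(abs(p - start), abs(p - (end - 1)))
            if dist <= max_dist:
                hits.append((p, seq[p], base))
                break
    return hits


# Toy construct: two alternatives at one position; node 2 plays the role of an
# inserted segment (far shorter than a real structural variation).
graph = {
    0: Node("ACGTACGTAC", next=[1, 2]),
    1: Node("GG", next=[3]),
    2: Node("TTTTCCCCAAAA", is_sv=True, next=[3]),
    3: Node("GATTACAGAT"),
}
read = "CCAAAAGATCACA"   # spans the insertion boundary and carries one substitution

score, path, offset = best_alignment(graph, 0, 3, read)
print(score, path, offset)
print(mutations_near_sv(graph, path, offset, read))
```

Positions here are measured along the chosen path rather than in linear reference coordinates, which is a simplification of this sketch. The 100 bp default for `max_dist` mirrors the distance recited in claim 1.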

Current Assignee: Seven Bridges Genomics, Inc.
Original Assignee: Seven Bridges Genomics, Inc.
Inventors: Kural, Deniz
Primary Examiner(s): Martinell, James
Application Number: US14/041,850
Publication Number: US 20150056613A1
Time in Patent Office: 694 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G16B 30/00   ICT specially adapted for s...
G16B 30/10   Sequence alignment; Homolog...
G16B 30/20   Sequence assembly
G16B 50/00   ICT programming tools or da...
