Method and Device for Detection of Splice Form and Alternative Splice Forms in DNA or RNA Sequences
Abstract
The invention relates to a method and a device for detection of splice sites in DNA or RNA sequences comprising three steps: a) examining a training set of sequences comprising DNA or RNA sequences with known splice sites by an automated, discriminative training device for detecting splicing patterns, especially in a predetermined window around the known splice sites; b) scanning a sequence comprising DNA or RNA sequences containing unknown splice sites for the occurrence of the splicing patterns detected in step a); and c) calculation of a cumulative splice score in dependence of a maximization of the margin between the true splice forms and all wrong splice forms in the sequence. The invention also relates to a method and a device for detection of splice forms and alternative splice forms in DNA or RNA sequences.
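The three steps in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes a linear model over position-specific nucleotide features trained with hinge-loss updates as a stand-in for the "discriminative training device", and it scores single candidate sites rather than whole splice forms. The names `window_features`, `train`, and `scan`, the window half-width, the `GT` motif, and the toy sequences are all hypothetical choices made for illustration.

```python
from collections import defaultdict

WINDOW = 3  # hypothetical half-width of the window around a candidate site


def window_features(seq, pos, half=WINDOW):
    """Position-specific nucleotide features in a window centred on pos."""
    feats = []
    for offset in range(-half, half + 1):
        i = pos + offset
        base = seq[i] if 0 <= i < len(seq) else "N"  # pad past the ends
        feats.append((offset, base))
    return feats


def score(weights, feats):
    """Linear score: sum of the weights of the active features."""
    return sum(weights[f] for f in feats)


def train(examples, epochs=20, margin=1.0, lr=0.1):
    """Step a): hinge-loss updates that push true sites above +margin
    and decoy sites below -margin (an assumed large-margin learner)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for seq, pos, label in examples:  # label: +1 true site, -1 decoy
            feats = window_features(seq, pos)
            if label * score(weights, feats) < margin:
                for f in feats:
                    weights[f] += lr * label
    return weights


def scan(weights, seq, motif="GT"):
    """Steps b) and c): score every occurrence of the motif in a new sequence."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append((start, score(weights, window_features(seq, start))))
        start = seq.find(motif, start + 1)
    return hits


# Toy training set: one known site and one decoy, both at index 3.
examples = [("CAGGTAAGT", 3, +1), ("TTTGTCCCA", 3, -1)]
weights = train(examples)
hits = scan(weights, "CAGGTAAGTTTTGTCCCA")
```

In this toy run, `hits` holds one `(position, score)` pair per `GT` occurrence; the candidate whose window matches the known site receives a positive score and the one matching the decoy a negative score.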
66 Claims
1-33. (canceled)
34. A method for the detection of a splice form in a DNA or RNA sequence, comprising:
a) examining a training set of sequences comprising DNA or RNA sequences with known splice sites by an automated, discriminative training device for detecting splicing patterns in a predetermined window around the known splice sites;
b) scanning a sequence comprising DNA or RNA sequences containing unknown splice sites for the occurrence of the splicing patterns detected in step a); and
c) calculating automatically a splice score in dependence of a maximization of the margin between the scores of true splice forms and all wrong splice forms in the sequence, wherein true splice forms refer to known splice forms and wrong splice forms refer to variations of known splice forms.
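The margin named in step c) of claim 34 can be made concrete with a short sketch. The claim does not say how a splice form's score is composed or how the "variations of known splice forms" are generated, so two assumptions are made here: a form's score is the sum of its per-site scores, and a wrong form is the known form with exactly one site swapped for another candidate. The function names and the toy numbers are hypothetical.

```python
def form_score(site_scores, form):
    """Score a splice form as the sum of its site scores (an assumption)."""
    return sum(site_scores.get(site, 0.0) for site in form)


def wrong_variants(true_form, candidate_sites):
    """Variations of the known form: swap exactly one site for another candidate."""
    variants = []
    for i in range(len(true_form)):
        for alt in candidate_sites:
            if alt not in true_form:
                variant = list(true_form)
                variant[i] = alt
                variants.append(tuple(sorted(variant)))
    return variants


def margin(site_scores, true_form, candidate_sites):
    """Gap between the true form and the best-scoring wrong variant."""
    best_wrong = max(
        form_score(site_scores, v)
        for v in wrong_variants(true_form, candidate_sites)
    )
    return form_score(site_scores, true_form) - best_wrong


# Toy data: site positions mapped to per-site scores.
site_scores = {10: 2.0, 25: 1.5, 14: 0.5, 30: -1.0}
true_form = (10, 25)
gap = margin(site_scores, true_form, candidate_sites=[10, 14, 25, 30])
```

A training procedure in this spirit would adjust the site scores so that `gap` grows for every training sequence; the sketch only measures the gap and does not perform that optimization.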
35. A method for the identification of one splice form and/or several alternative splice forms each comprising predictions of exon locations in DNA or RNA sequences, comprising:
a) examining a training set of DNA or RNA sequences with putative splice sites by an automated, discriminative training device for detecting splicing patterns using predetermined windows around the putative splice sites, wherein the splicing patterns can include information of alternative splice events, such as exon skipping or intron retention, alternative exon start or end usage or existence of regulative elements;
b) examining a second training set of DNA or RNA sequences with putative splice forms by an automated, discriminative training device using splice patterns detected in step a), leading to a calculation device to automatically assign scores to a splice form and/or a group of alternative splice forms in dependence of the maximization of the margin between the putative splice forms or groups of them and putatively wrong splice forms of sequences or groups of them in the training set, wherein a Large Margin based Learning algorithm is applied;
c) scanning a sequence comprising RNA or DNA with unknown and/or putative splice sites for the occurrence of the splicing patterns detected in step a); and
d) predicting a splice form or group of alternative splice forms, using the device that assigns scores in dependence of the result of step c), in dependence of the said scores by maximizing or minimizing a function of the scores, comprising a set of splice forms associated with a RNA or DNA sequence when used to identify several alternative or only one mRNAs and/or proteins associated with a RNA or DNA sequence.
Dependent claims: 36, 37, 38, 39.
40. A method for the detection of at least one splice form and/or at least one alternative splice form in RNA and DNA sequences, each comprising predictions of exon locations in DNA or RNA sequences, comprising:
a) examining a first training set of DNA or RNA sequences with putative splice sites by an automated training device for detecting splicing patterns;
b) examining a second training set of DNA or RNA sequences with putative splice forms by an automated, discriminative training device using splice patterns detected in step a), leading to an automatic assignment of scores to at least one splice form and/or a group of alternative splice forms by a calculation device;
c) scanning a sequence comprising RNA or DNA with unknown and/or putative splice sites for the occurrence of the splicing pattern(s) detected in step a); and
d) calculating at least one splice form and/or at least one alternative splice form in dependence of the step b) assigned scores by using the calculation device and in dependence of the results obtained in step c), wherein at least one set of splice forms associated with a RNA or DNA sequence is provided.
Dependent claims: 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52.
53. A device for the detection of at least one splice site in a DNA or RNA, comprising:
a) an automated, discriminative training device for detecting splicing patterns in a predetermined window around the known splice sites, in a training set of sequences comprising EST, RNA sequence and/or DNA with known splice sites;
b) a scanning device for scanning another sequence comprising DNA or RNA sequences containing unknown splice sites for the occurrence of the splicing patterns detected in step a); and
c) a calculation device for automatically calculating a splice score in dependence of a maximization of the margin between the true splice forms and all wrong splice forms.
54. A device for the detection of at least one splice form in a DNA or RNA sequence, comprising:
a) an automated, discriminative training device for detecting splicing patterns in a predetermined window around putative splice sites in a training set comprising RNA or DNA sequences with putative splice sites, wherein splicing patterns can include information about alternative splice events such as exon skipping or intron retention, alternative exon start or end usage;
b) a discriminative training device leading to a calculation device that automatically assigns scores to a splice form and/or a group of splice forms in dependence of the maximization of the margin between putative splice forms or groups of them and putatively wrong splice forms associated with sequences in a second training set of DNA or RNA sequences with putative splice forms;
c) a scanning device for scanning a RNA and/or DNA sequence containing unknown and/or putative splice sites for the occurrence of the splicing patterns detected by the device in step a); and
d) a calculation device for automatically calculating a score generated by the device in step b) to splice forms and/or groups of splice forms in a RNA and/or DNA sequence in dependence of the device in step c), wherein it is used to identify a set of splice forms such as mRNAs and/or proteins associated to a RNA or DNA sequence.
55. A device for the detection of at least one splice form in a DNA or RNA sequence, comprising:
a) an automated training device for detecting splicing patterns in a training set comprising RNA or DNA sequences with putative splice sites;
b) a discriminative training device leading to a calculation device automatically assigning scores to at least one splice form and/or a group of splice forms and putatively wrong splice forms associated with sequences in a second training set of RNA or DNA sequences with putative splice forms;
c) a scanning device for scanning a RNA and/or DNA sequence containing unknown and/or putative splice sites for the occurrence of the splicing pattern(s) detected in step a); and
d) a calculation device for automatically calculating a score generated by the device in step b) of at least one splice form and/or groups of splice forms in a RNA or DNA sequence in dependence on the device in c).
Dependent claims: 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66.
Specification