Method and Apparatus for Automatic Comparison of Data Sequences
Abstract
The invention concerns a method and an apparatus for automatic comparison of at least two data sequences, characterized by: an evaluation of a local relationship between any pair of subsequences in two or more sequences; and an evaluation of a global relationship by means of aggregation of the evaluations of said local relationships.
81 Citations
26 Claims
-
1-13. (canceled)
-
14. A method for automatic comparison of at least two data sequences comprising the steps of:
performing an evaluation of a local relationship between any pair of subsequences in two or more sequences; and
performing an evaluation of a global relationship by aggregation of a plurality of evaluations of said local relationships.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24.
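The two steps of claim 14 can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the claim: subsequences are restricted to contiguous k-grams, the local relationship is taken to be the absolute difference of a k-gram's occurrence counts in the two sequences, and the aggregation is a plain sum. The function names are hypothetical. Note that the claim speaks of any pair of subsequences, whereas this sketch only pairs identical k-grams.

```python
from collections import Counter


def kgram_counts(seq, k):
    """Count every contiguous subsequence of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def local_relationships(x, y, k):
    """Local step (assumed measure): per k-gram absolute count difference."""
    cx, cy = kgram_counts(x, k), kgram_counts(y, k)
    # A Counter returns 0 for k-grams it has never seen.
    return {w: abs(cx[w] - cy[w]) for w in cx.keys() | cy.keys()}


def global_relationship(x, y, k):
    """Global step (assumed aggregation): sum of all local evaluations."""
    return sum(local_relationships(x, y, k).values())


if __name__ == "__main__":
    print(local_relationships("abab", "abba", 2))
    print(global_relationship("abab", "abba", 2))
```

Under these assumptions, identical sequences yield a global value of 0, and the value grows as the k-gram profiles of the two sequences diverge.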
-
25. An apparatus for the comparison of data sequences comprising:
means for representing data sequences in a data structure selected from one of: a hash table or indexed table; a trie or compacted trie; a suffix tree or suffix array; and a generalized suffix tree or generalized suffix array;
means for performing an evaluation of a local relationship between any pair of subsequences in said data sequences;
means for performing an evaluation of a global relationship by aggregation of a plurality of evaluations of said local relationships; and
means for computation of a totality of the local and global relationship,
wherein at least one of the first and second data sequences comprise one or more of symbols, images, text, ASCII characters, genetic data, protein data, bytes, binary data, and tokens as objects for which the local relationship is evaluated.
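Of the data structures listed in claim 25, the suffix array is perhaps the simplest to sketch. The Python example below is an illustrative assumption, not the claimed apparatus: it builds the array by naive sorting of suffixes and counts occurrences of a subsequence by binary search. The function names are hypothetical, and the query rebuilds a list of truncated suffixes on every call, which is simple but not efficient.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(s):
    """Return the start positions of all suffixes of s in lexicographic order.

    Naive construction by sorting; fine for a short illustration only.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def count_occurrences(s, sa, w):
    """Count occurrences of subsequence w in s using the suffix array sa.

    Truncating sorted suffixes to len(w) keeps them in non-decreasing
    order, so all suffixes starting with w form one contiguous run.
    """
    prefixes = [s[i:i + len(w)] for i in sa]
    return bisect_right(prefixes, w) - bisect_left(prefixes, w)


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(count_occurrences(text, sa, "ana"))
```

Occurrence counts obtained this way could feed a local evaluation over subsequences of any length, without fixing the length in advance as a k-gram hash table does.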
-
26. A system for processing and analysis of data sequences comprising:
means for input of data sequences comprising a data structure selected from one of: a hash table or indexed table; a trie or compacted trie; a suffix tree or suffix array; and a generalized suffix tree or generalized suffix array;
means for comparison of data sequences comprising one of: Manhattan or taxicab distance; Euclidean distance; Minkowski distance; Canberra distance; Chi-Square distance; Chebyshev distance; Geodesic distance; Jensen or symmetric Kullback-Leibler divergence; Position-independent Hamming distance; 1st and 2nd Kulczynski similarity coefficient; Czekanowski or Sorensen-Dice similarity coefficient; Jaccard similarity coefficient; Simpson similarity coefficient; Sokal-Sneath or Anderberg similarity coefficient; Otsuka or Ochiai similarity coefficient; and Braun-Blanquet similarity coefficient;
means for analysis of data sequences including classification, regression, novelty detection, ranking, clustering, and structural inference; and
means for reporting of results of the analysis.
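A few of the comparison measures named in claim 26 can be sketched in Python over k-gram count profiles. The choice of k-gram counts as the representation, the function names, and the set-based form of the Jaccard coefficient (presence or absence of a k-gram rather than its count) are assumptions made for illustration; the claim itself does not fix them.

```python
import math
from collections import Counter


def kgram_counts(seq, k):
    """Count every contiguous subsequence of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(cx, cy):
    """Sum of absolute count differences over all k-grams."""
    return sum(abs(cx[w] - cy[w]) for w in cx.keys() | cy.keys())


def euclidean(cx, cy):
    """Square root of the sum of squared count differences."""
    return math.sqrt(sum((cx[w] - cy[w]) ** 2 for w in cx.keys() | cy.keys()))


def chebyshev(cx, cy):
    """Largest absolute count difference over all k-grams."""
    return max((abs(cx[w] - cy[w]) for w in cx.keys() | cy.keys()), default=0)


def jaccard(cx, cy):
    """Shared k-grams divided by all k-grams (set-based, assumed variant)."""
    a, b = set(cx), set(cy)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    cx, cy = kgram_counts("abab", 2), kgram_counts("abba", 2)
    for measure in (manhattan, euclidean, chebyshev, jaccard):
        print(measure.__name__, measure(cx, cy))
```

The three distances are 0 for identical profiles and grow with dissimilarity, while the Jaccard coefficient is a similarity that equals 1 for identical k-gram sets.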
-
Specification