System and method for indexing weighted-sequences in large databases

  • US 20050114298A1
  • Filed: 11/26/2003
  • Published: 05/26/2005
  • Est. Priority Date: 11/26/2003
  • Status: Active Grant
First Claim

1. A method of generating an index for a sequence that supports a non-contiguous subsequence match, comprising:

    receiving a sequence;

    receiving a window size;

    encoding the sequence into a weighted-sequence;

    encoding the weighted-sequence into one or more one-dimensional sequences, wherein the length of each of the one or more one-dimensional sequences is less than the window size;

    inserting each of the one or more one-dimensional sequences into a trie structure; and

    generating the index, comprising:

    generating a current sequential ID and a maximum sequential ID pair for generating each of the one or more trie nodes, wherein the current sequential ID of any descendant of a given trie node is between the current sequential ID of the given trie node and the maximum sequential ID;

    generating an iso-depth link for each unique symbol in each of the one or more one-dimensional sequences, wherein the iso-depth link comprises trie nodes under the symbol; and

    generating an offset list comprising an original position of each of the one or more subsequences in the weighted-sequence.
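
The index-generation steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the one-dimensional sequences have already been produced and arrive as (original position, symbols) pairs, and it omits the weighted-sequence encoding and the window-size check. The names TrieNode, build_index and match are hypothetical. Keying each iso-depth link by symbol alone, and storing it as a list sorted by sequential ID, is likewise an assumption made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class TrieNode:
    def __init__(self, symbol=None):
        self.symbol = symbol
        self.children = {}   # symbol -> TrieNode
        self.offsets = []    # offset list: original positions of sequences ending here
        self.cur_id = -1     # current sequential ID (pre-order number)
        self.max_id = -1     # largest sequential ID found in this node's subtree


def build_index(sequences):
    """Build the trie, the (cur_id, max_id) labels and the per-symbol links.

    `sequences` is an iterable of (original_position, symbols) pairs.
    """
    root = TrieNode()
    for position, symbols in sequences:
        node = root
        for sym in symbols:
            if sym not in node.children:
                node.children[sym] = TrieNode(sym)
            node = node.children[sym]
        node.offsets.append(position)

    link_ids = defaultdict(list)    # symbol -> sequential IDs, ascending
    link_nodes = defaultdict(list)  # symbol -> nodes, in the same order
    next_id = 0

    def label(node):
        # Pre-order numbering: every descendant's cur_id falls in
        # (node.cur_id, node.max_id].
        nonlocal next_id
        node.cur_id = next_id
        next_id += 1
        if node.symbol is not None:
            link_ids[node.symbol].append(node.cur_id)
            link_nodes[node.symbol].append(node)
        for sym in sorted(node.children):
            label(node.children[sym])
        node.max_id = next_id - 1

    label(root)
    links = {sym: (link_ids[sym], link_nodes[sym]) for sym in link_ids}
    return root, links


def match(root, links, pattern):
    """Return original positions of indexed sequences that contain
    `pattern` as a (possibly non-contiguous) subsequence."""
    frontier = {root.cur_id: root}
    for sym in pattern:
        ids, nodes = links.get(sym, ([], []))
        reached = {}
        for node in frontier.values():
            # Binary search the symbol's link for descendants of `node`.
            lo = bisect_right(ids, node.cur_id)
            hi = bisect_right(ids, node.max_id)
            for hit in nodes[lo:hi]:
                reached[hit.cur_id] = hit
        frontier = reached

    found = set()
    stack = list(frontier.values())
    while stack:  # every sequence ending at or below a hit contains the pattern
        node = stack.pop()
        found.update(node.offsets)
        stack.extend(node.children.values())
    return sorted(found)
```

Under these assumptions, indexing the sequences "abc", "bcd", "acd" and "ab" at positions 0 to 3 and querying for "ac" returns positions 0 and 2, since both "abc" and "acd" contain a followed later by c. The ID pair is what lets each step be a range lookup in a sorted link rather than a subtree traversal.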
