×

Method and system for comparing strings with entries of a lexicon

  • US 5,774,588 A
  • Filed: 06/07/1995
  • Issued: 06/30/1998
  • Est. Priority Date: 06/07/1995
  • Status: Expired due to Term
First Claim
Patent Images

1. A method of comparing strings with entries of a lexicon, comprising the steps of:

  • organizing entries of the lexicon by, for each entry;

    (a) forming an n-gram vector representing a lexicon entry;

    (b) folding said n-gram vector into a signature vector by combining multiple n-grams into bits;

    (c) creating a list of bits having the same n-grams as the bits of the signature vector, beginning with the bit occurring most frequently in the lexicon and continuing in descending order;

    (d) creating a partitioned vector whose element values are binary numbers whose digits represent the bits of the list of bits, partitioned into groups of digits forming the binary numbers, the digits being set or not set depending on whether the bit of the signature vector having the same n-grams is set;

    (e) establishing a plurality of buckets having addresses corresponding to all possible element values of the partitioned vector;

    (f) indexing said lexicon entry to the one or more of said buckets having an address corresponding to an element value of said lexicon entry'"'"'s partitioned vector;

    reducing the number of lexicon entries to be compared to a particular unverified string by;

    creating a partitioned vector for said unverified string according to steps (a)-(d);

    indexing said unverified string to those buckets whose addresses correspond to an element value of said unverified string'"'"'s partitioned vector; and

    comparing a representation of said unverified string with representations of only those lexicon entries that are indexed to the bucket addresses to which said unverified string is indexed.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×