×

System and method for searching strings of records

  • US 7,904,429 B2
  • Filed: 05/08/2005
  • Issued: 03/08/2011
  • Est. Priority Date: 05/07/2004
  • Status: Expired due to Fees
First Claim
Patent Images

1. A method of detecting the inclusion of records in an input string of words comprising:

  • a. pre-processing the records such that at least a portion of the records are represented by a string;

    said string comprising a plurality of sections wherein each of the said sections (chunks) comprises a pre-defined number of elementary words, in the manner that when a chunk is the first chunk in the chunk representation of a record a “

    Begin of Record”

    attribute is assigned to the first chunk and when a chunk is the last chunk in the chunk representation of a record, an “

    End of Record”

    attribute is assigned to the last chunk;

    b. searching said string in the manner that the input string is divided into a plurality of chunks and each of said chunks is searched in the records so that if a searched chunk is found to be present in at least one record, a Partial Match Flag is set according to a logic calculation; and

    ,c. calculating an Incremental Hash Function (IHF) for each input chunk;

    characterized by that whenever a chunk from the input string is found to be in at least one of the records, and the said chunk is associated with an End of Record and the Partial Match is set for this chunk (End of Record chunk), the difference values between the value of the IHF at that “

    End of Record”

    chunk and the value of the IHF at each of the previous chunks of the said string, to which a “

    Begin of Record”

    is associated (Begin of Record Chunk), is calculated and compared with all pre-recorded values Δ

    I, calculated during the pre-processing for each record having the said End of Record chunk as last chunk, and associated to the said End of Record chunk;

    wherein calculating the Incremental Hash Function (IHF)includes;

    i. starting with an arbitrary initial value;

    ii. processing sequentially each chunk of the record; and

    ,iii. calculating the IHF for each of said processed chunks as an arbitrarily first predefined function of the said chunk and the previously calculated IHF value where the said chunk has not been associated with a “

    Begin of Record”

    attribute, and as a reversible operation of a second predefined function of the said chunk and the previously calculated IHF value where the said chunk has been associated with a “

    Begin of Record”

    attribute;

    d. calculating a Δ

    I function for each record as the inverse of the said reversible operation of the IHF value at the last chunk of the record, and the said initial value of the IHF for that record;

    e. associating to each chunk having an “

    End of Record”

    attribute, the values of all Δ

    I function for all records for which that chunk is the last chunk;

    f. representing the input string as a string of chunks (input string of chunks);

    h. processing sequentially each chunk of the said input string of chunks, and for each said chunk setting a Partial Match Flag if the said chunk appears in the said list of chunks and if either the said Chunk has a “

    Begin of Record”

    attribute, or the Partial Match Flag was previously set;

    if said chunk has an “

    End of Record”

    attribute, calculating a set of Δ

    J function values, each one being the result of the inverse of the said reversible operation of the IHF value at the said chunk of the input string of chunks, and the IHF value obtained immediately before calculating the IHF for a chunk having a “

    Begin of Record”

    attribute; and

    g. asserting a “

    Probable Match”

    in case the said chunk has an “

    End of Record”

    attribute and at least one of the Δ

    J functions from the said set of Δ

    J functions equals one of the Δ

    I functions associated to the said chunk wherein a “

    Probable Match”

    indicates a high probability that at least one record is included in the input string.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×