
System and methods for data indexing and processing

  • US 20070013968A1
  • Filed: 07/14/2006
  • Published: 01/18/2007
  • Est. Priority Date: 07/15/2005
  • Status: Active Grant
First Claim

1. A method for indexing a document file comprising a plurality of characters arranged into an array of strings, the method comprising:

  • filtering the array of strings to obtain a set of strings;

    for each string in the set of strings, creating a first sequence list comprising a substring starting at a first character position in the string and a second sequence list comprising a substring starting at a second character position in the string;

    generating a comparison reference database by querying the first and second sequence lists against a reference database, wherein the reference database comprises a plurality of records and each record comprises a plurality of data fields;

    for each record in the comparison reference database, generating a first set of substrings based upon a first set of data fields from the plurality of data fields in the record;

    comparing the first set of substrings against the set of strings to identify a longest substring match, if any, for each of the first set of data fields from the record;

    filtering the comparison reference database to create a second comparison reference database by selecting each record that has a longest substring match for one or more data fields from the first set of data fields;

    assigning a point value for each match found in a record and summing the point value for the record; and

    responsive to a record having a total point value exceeding a threshold match value, associating the document file with that record.
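
The steps of claim 1 can be illustrated with a minimal Python sketch. This is one possible reading under stated assumptions, not the implementation described in the patent: the filter rule (minimum string length), the fixed substring length and start positions of the two sequence lists, the substring-containment query, the use of `difflib.SequenceMatcher` to find the longest substring match, the minimum match length, and the default of one point per matched field are all hypothetical choices, as are the names `filter_strings`, `sequence_lists`, `longest_common_substring`, and `index_document`.

```python
from difflib import SequenceMatcher


def filter_strings(strings, min_len=3):
    """Assumed filter rule: normalize case and drop very short strings."""
    return [s.strip().lower() for s in strings if len(s.strip()) >= min_len]


def sequence_lists(s, first_pos=0, second_pos=1, length=3):
    """Return the substrings starting at two character positions.

    Positions and length are illustrative assumptions.
    """
    return s[first_pos:first_pos + length], s[second_pos:second_pos + length]


def longest_common_substring(a, b):
    """Longest contiguous run of characters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]


def index_document(strings, reference_db, fields,
                   min_match=4, points=None, threshold=1):
    """Return the reference records the document would be associated with."""
    points = points or {}
    candidates = filter_strings(strings)

    # Build the first and second sequence lists for every filtered string.
    first_list, second_list = [], []
    for s in candidates:
        first, second = sequence_lists(s)
        first_list.append(first)
        second_list.append(second)
    keys = {k for k in first_list + second_list if k}

    # Comparison reference database: records in which any selected field
    # contains any sequence-list entry (assumed query semantics).
    comparison = [
        record for record in reference_db
        if any(k in str(record[f]).lower() for f in fields for k in keys)
    ]

    associated = []
    for record in comparison:
        total = 0
        for f in fields:
            value = str(record[f]).lower()
            # Longest substring match for this field across all strings.
            best = max(
                (longest_common_substring(value, s) for s in candidates),
                key=len,
                default="",
            )
            if len(best) >= min_match:
                total += points.get(f, 1)  # one point per field by default
        # Records with no match are dropped (second comparison database);
        # the rest are associated only if they exceed the threshold.
        if total > 0 and total > threshold:
            associated.append(record)
    return associated
```

In this sketch, a document whose strings include a record's name and city would score two points for that record and be associated with it when the threshold is 1. Per-field weights can be passed through `points` to make some fields count more than others.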
