Systems and methods for data indexing and processing
First Claim
1. A method for associating a document file with a record in a reference database, the method comprising:
receiving the document file, the document file comprising unstructured data related to a record in the reference database;
organizing data extracted from the unstructured data in the document file into an array of strings;
obtaining a first set of strings by filtering at least a portion of the array of strings using at least one of: string position, position of a portion of a string, string value, value of a portion of a string, string format, format of a portion of a string, a property of one or more characters within a string, and string length;
comparing the first set of strings from the array of strings against a comparison reference database comprising a plurality of records from the database, wherein a record comprises at least one data field element;
dynamically generating a match pattern by selecting, from results of comparing the first set of strings from the array of strings against the comparison reference database, a set of matches to one or more data field elements within a record from the plurality of records in the comparison reference database to form the match pattern;
determining a number of occurrences of the match pattern within records from the plurality of records in the comparison reference database; and
responsive to the number of occurrences of the match pattern within records from the plurality of records in the comparison reference database being below a threshold number, associating the document file with the record corresponding with the set of matches from which the match pattern was formed.
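The steps recited above can be sketched in a few lines of Python. This is only an illustrative reading of the claim, not the patented implementation: the `RECORDS` list, the field names (`first`, `last`, `dob`), the whitespace tokenizer, the length-based filter, and the rule of preferring the largest qualifying pattern are all assumptions introduced for the example.

```python
# Hypothetical comparison reference database: each record holds data field elements.
RECORDS = [
    {"id": 1, "first": "JOHN", "last": "SMITH", "dob": "1970-01-02"},
    {"id": 2, "first": "JOHN", "last": "SMITH", "dob": "1985-06-30"},
    {"id": 3, "first": "MARY", "last": "JONES", "dob": "1970-01-02"},
]


def to_strings(text):
    """Organize unstructured text into an array of normalized strings."""
    return [tok.strip(".,;:()").upper() for tok in text.split()]


def filter_strings(strings, min_length=3):
    """Obtain the first set of strings; here the filter uses string length only."""
    return [s for s in strings if len(s) >= min_length]


def match_pattern(strings, record, fields):
    """Form a match pattern: the (field, value) pairs of this record found in the strings."""
    wanted = set(strings)
    return frozenset((f, record[f]) for f in fields if record[f] in wanted)


def count_occurrences(pattern, records):
    """Count the records whose fields contain every (field, value) pair of the pattern."""
    return sum(all(r.get(f) == v for f, v in pattern) for r in records)


def associate(text, records, fields=("first", "last", "dob"), threshold=2):
    """Return the id of the record to associate with the document, or None."""
    strings = filter_strings(to_strings(text))
    best = None
    for record in records:
        pattern = match_pattern(strings, record, fields)
        if not pattern:
            continue  # nothing in this record matched the document
        # Associate only when the pattern is rare enough in the database.
        if count_occurrences(pattern, records) < threshold:
            # Assumption: prefer the qualifying pattern with the most matched fields.
            if best is None or len(pattern) > len(best[1]):
                best = (record["id"], pattern)
    return best[0] if best else None


if __name__ == "__main__":
    doc = "Patient: John Smith, DOB 1985-06-30. Seen for follow-up."
    print(associate(doc, RECORDS))
    print(associate("John Smith called", RECORDS))
```

In this sketch a name alone matches two records, so its pattern is not below the threshold of 2 and no association is made; adding the date of birth yields a pattern that occurs in only one record, which is then selected.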
Abstract
Systems and methods are disclosed that allow for indexing, processing, or both of information from physical media or electronic media, which may be received from a plurality of sources. In embodiments, a document file may be matched using pattern matching methods and may include comparisons with a comparison reference database to improve or accelerate the indexing process. In embodiments, information may be presented to a user as potential matches, thereby improving manual indexing processes. In embodiments, one or more additional actions may occur as part of the processing, including without limitation, associating additional data with a document file, making observations from the document file, notifying individuals, creating composite messages, and billing events. In an embodiment, data from a document file may be associated with a key word, key phrase, or word frequency value that enables adaptive learning so that unindexed data may be automatically indexed based on user interaction history.
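One way to picture the adaptive-learning idea in the last sentence is a keyword tally built from past user indexing decisions. The sketch below is an assumption-laden illustration, not the disclosed design: the `KeywordIndexer` class, its method names, the labels, and the simple vote-count rule are all hypothetical.

```python
from collections import Counter, defaultdict


class KeywordIndexer:
    """Learns keyword-to-label associations from user indexing history (hypothetical)."""

    def __init__(self):
        # keyword -> Counter of labels that users assigned to documents containing it
        self.history = defaultdict(Counter)

    def record_user_choice(self, text, label):
        """Store one manual indexing decision made by a user."""
        for word in set(text.lower().split()):
            self.history[word][label] += 1

    def suggest(self, text, min_votes=2):
        """Suggest a label for unindexed text, or None if the evidence is too weak."""
        votes = Counter()
        for word in set(text.lower().split()):
            votes.update(self.history.get(word, {}))
        if not votes:
            return None
        label, count = votes.most_common(1)[0]
        return label if count >= min_votes else None


if __name__ == "__main__":
    indexer = KeywordIndexer()
    indexer.record_user_choice("lab report hemoglobin", "Lab")
    indexer.record_user_choice("lab panel lipid", "Lab")
    indexer.record_user_choice("radiology chest xray", "Imaging")
    print(indexer.suggest("new lab hemoglobin result"))
```

The `min_votes` cutoff stands in for whatever confidence rule a real system would apply before indexing a document automatically instead of presenting it to a user.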
32 Claims
1. A method for associating a document file with a record in a reference database, the method comprising:
receiving the document file, the document file comprising unstructured data related to a record in the reference database;
organizing data extracted from the unstructured data in the document file into an array of strings;
obtaining a first set of strings by filtering at least a portion of the array of strings using at least one of: string position, position of a portion of a string, string value, value of a portion of a string, string format, format of a portion of a string, a property of one or more characters within a string, and string length;
comparing the first set of strings from the array of strings against a comparison reference database comprising a plurality of records from the database, wherein a record comprises at least one data field element;
dynamically generating a match pattern by selecting, from results of comparing the first set of strings from the array of strings against the comparison reference database, a set of matches to one or more data field elements within a record from the plurality of records in the comparison reference database to form the match pattern;
determining a number of occurrences of the match pattern within records from the plurality of records in the comparison reference database; and
responsive to the number of occurrences of the match pattern within records from the plurality of records in the comparison reference database being below a threshold number, associating the document file with the record corresponding with the set of matches from which the match pattern was formed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A system for associating a document file with a record in a reference database, the system comprising:
one or more processors communicatively coupled to at least one computer-readable medium storing one or more sequences of instructions, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to associate a document file by performing the steps comprising:
receiving the document file, the document file comprising unstructured data related to a record in the reference database;
organizing data extracted from the unstructured data in the document file into an array of strings;
obtaining a first set of strings by filtering at least a portion of the array of strings using at least one of: string position, position of a portion of a string, string value, value of a portion of a string, string format, format of a portion of a string, a property of one or more characters within a string, and string length;
comparing the first set of strings from the array of strings against a comparison reference database comprising a plurality of records wherein a record comprises at least one data field element;
dynamically generating a match pattern by selecting, from results of comparing the first set of strings from the array of strings against the comparison reference database, a set of matches to one or more data field elements within a record from the plurality of records in the comparison reference database to form the match pattern;
determining a number of occurrences of the match pattern within records from the plurality of records in the comparison reference database; and
responsive to the number of occurrences of the match pattern within records from the plurality of records in the comparison database being below a threshold number, associating the document file with the record corresponding with the set of matches from which the match pattern was formed.
Dependent claims: 15, 16, 17, 18, 19
20. A non-transitory computer-readable medium comprising one or more sets of instructions which, when executed by one or more processors, causes the one or more processors to perform a method for associating a document file with a record in a reference database, the method comprising:
receiving the document file, the document file comprising unstructured data related to a record in the reference database;
organizing data extracted from the unstructured data in the document file into an array of strings;
obtaining a first set of strings by filtering at least a portion of the array of strings using at least one of: string position, position of a portion of a string, string value, value of a portion of a string, string format, format of a portion of a string, a property of one or more characters within a string, and string length;
comparing the first set of strings from the array of strings against a comparison reference database comprising a plurality of records from the database, wherein a record comprises at least one data field element;
dynamically generating a match pattern by selecting, from results of comparing the first set of strings from the array of strings against the comparison reference database, a set of matches to one or more data field elements within a record from the plurality of records in the comparison reference database to form the match pattern;
determining a number of occurrences of the match pattern within records from the plurality of records in the comparison reference database; and
responsive to the number of occurrences of the match pattern within records from the plurality of records in the comparison reference database being below a threshold number, associating the document file with the record corresponding with the set of matches from which the match pattern was formed.
Dependent claims: 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
Specification