Document indexing
First Claim
1. A processor-implemented method for indexing a document file comprising:
- receiving a document file, wherein the document file comprises a plurality of unstructured characters;
- organizing the plurality of unstructured characters into an array of strings;
- receiving at least a portion of a reference database from a client, wherein the reference database comprises a plurality of records, wherein each record comprises at least one data field element;
- comparing a first set of strings from the array of strings against a comparison reference database obtained from the reference database; and
- responsive to at least a portion of the first set of strings exceeding a threshold match with at least a portion of a record in the comparison reference database, generating a structured message that associates the document file with the record.
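The steps recited in the claim can be sketched in a few lines of Python. This is only an illustrative reading, not the patented implementation: the function names (`tokenize`, `build_comparison_db`, `index_document`), the token-overlap score, the 0.6 threshold, the JSON message layout, and the sample records are all assumptions introduced here.

```python
import json
import re


def tokenize(raw_text):
    """Organize unstructured characters into an array of lowercase strings."""
    return re.findall(r"[a-z0-9]+", raw_text.lower())


def build_comparison_db(records, fields):
    """Derive a comparison database: one token set per record, built from
    the chosen data fields (hypothetical choice of derivation)."""
    comparison_db = []
    for record in records:
        text = " ".join(str(record.get(field, "")) for field in fields)
        comparison_db.append((record, set(tokenize(text))))
    return comparison_db


def index_document(doc_id, raw_text, comparison_db, threshold=0.6):
    """Return a structured JSON message linking the document to the best
    matching record, or None if no record exceeds the threshold."""
    strings = set(tokenize(raw_text))
    best_record, best_score = None, 0.0
    for record, wanted in comparison_db:
        if not wanted:
            continue
        # Assumed score: fraction of the record's tokens found in the document.
        score = len(wanted & strings) / len(wanted)
        if score > best_score:
            best_record, best_score = record, score
    if best_record is None or best_score <= threshold:
        return None
    return json.dumps(
        {"document": doc_id, "record": best_record, "score": best_score}
    )


# Hypothetical reference records received from a client.
records = [
    {"id": "P-1001", "first_name": "Ana", "last_name": "Rivera", "dob": "1980-02-14"},
    {"id": "P-1002", "first_name": "Li", "last_name": "Chen", "dob": "1975-11-30"},
]
comparison_db = build_comparison_db(records, ["first_name", "last_name", "dob"])
message = index_document(
    "scan-0001.pdf",
    "Patient: Ana Rivera  DOB 1980-02-14  Chest X-ray, no acute findings.",
    comparison_db,
)
```

Here the comparison database is derived from a subset of fields so that internal identifiers do not dilute the score. A production system would likely use fuzzier matching than exact token overlap.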
Abstract
Systems and methods are disclosed for indexing, processing, or both indexing and processing information from physical media or electronic media, which may be received from a plurality of sources. In embodiments, a document file may be matched using pattern matching methods, and the matching may include comparisons with a comparison reference database to improve or accelerate the indexing process. In embodiments, information may be presented to a user as potential matches, thereby improving manual indexing processes. In embodiments, one or more additional actions may occur as part of the processing, including, without limitation, associating additional data with a document file, making observations from the document file, notifying individuals, creating composite messages, and generating billing events. In an embodiment, data from a document file may be associated with a key word, key phrase, or word frequency value that enables adaptive learning, so that unindexed data may be automatically indexed based on user interaction history.
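The adaptive-learning idea in the last sentence of the abstract can be illustrated with a small word-frequency sketch. The abstract does not specify a mechanism, so everything below is an assumption: the `KeywordLearner` class, its method names, the additive scoring, and the sample labels are hypothetical and serve only to show how user interaction history could drive automatic indexing.

```python
import re
from collections import Counter, defaultdict


class KeywordLearner:
    """Learns word frequencies per index label from user-confirmed choices."""

    def __init__(self):
        # label -> Counter of words seen in documents the user filed under it
        self.word_counts = defaultdict(Counter)

    @staticmethod
    def _words(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def record_user_choice(self, text, label):
        """Store the interaction: the user indexed this text under `label`."""
        self.word_counts[label].update(self._words(text))

    def suggest(self, text, min_score=1):
        """Suggest a label for unindexed text, or None if nothing scores
        at least `min_score` (assumed cutoff)."""
        words = self._words(text)
        scores = {
            label: sum(counts[word] for word in words)
            for label, counts in self.word_counts.items()
        }
        if not scores:
            return None
        label, score = max(scores.items(), key=lambda item: item[1])
        return label if score >= min_score else None


learner = KeywordLearner()
learner.record_user_choice("chest x-ray radiology report", "radiology")
learner.record_user_choice("blood panel lab results", "laboratory")
suggestion = learner.suggest("follow-up radiology report")
```

With this history, `suggestion` is `"radiology"`, because two of the new document's words were previously filed under that label. Text sharing no words with the history yields no suggestion and would fall back to manual indexing.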
74 Citations
18 Claims
1. A processor-implemented method for indexing a document file comprising:
- receiving a document file, wherein the document file comprises a plurality of unstructured characters;
- organizing the plurality of unstructured characters into an array of strings;
- receiving at least a portion of a reference database from a client, wherein the reference database comprises a plurality of records, wherein each record comprises at least one data field element;
- comparing a first set of strings from the array of strings against a comparison reference database obtained from the reference database; and
- responsive to at least a portion of the first set of strings exceeding a threshold match with at least a portion of a record in the comparison reference database, generating a structured message that associates the document file with the record.
Dependent claims: 2, 3, 4, 5, 6
7. A system comprising:
- one or more processors; and
- a non-transitory computer-readable medium or media comprising one or more sequences of instructions which, when executed by the one or more processors, causes steps to be performed comprising:
  - receiving a document file, wherein the document file comprises a plurality of unstructured characters;
  - organizing the plurality of unstructured characters into an array of strings;
  - receiving at least a portion of a reference database from a client, wherein the reference database comprises a plurality of records, wherein each record comprises at least one data field element;
  - comparing a first set of strings from the array of strings against a comparison reference database obtained from the reference database; and
  - responsive to at least a portion of the first set of strings exceeding a threshold match with at least a portion of a record in the comparison reference database, generating a structured message that associates the document file with the record.
Dependent claims: 8, 9, 10, 11, 12
13. A non-transitory computer-readable medium or media comprising one or more sequences of instructions which, when executed by one or more processors, causes steps to be performed comprising:
- receiving a document file, wherein the document file comprises a plurality of unstructured characters;
- organizing the plurality of unstructured characters into an array of strings;
- receiving at least a portion of a reference database from a client, wherein the reference database comprises a plurality of records, wherein each record comprises at least one data field element;
- comparing a first set of strings from the array of strings against a comparison reference database obtained from the reference database; and
- responsive to at least a portion of the first set of strings exceeding a threshold match with at least a portion of a record in the comparison reference database, generating a structured message that associates the document file with the record.
Dependent claims: 14, 15, 16, 17, 18
Specification