Systems and Methods for Data Indexing and Processing
Abstract
Systems and methods are disclosed that allow for indexing, processing, or both of information from physical media or electronic media, which may be received from a plurality of sources. In embodiments, a document file may be matched using pattern matching methods and may include comparisons with a comparison reference database to improve or accelerate the indexing process. In embodiments, information may be presented to a user as potential matches thereby improving manual indexing processes. In embodiments, one or more additional actions may occur as part of the processing, including without limitation, associating additional data with a document file, making observations from the document file, notifying individuals, creating composite messages, and billing events. In an embodiment, data from a document file may be associated with a key word, key phrase, or word frequency value that enables adaptive learning so that unindexed data may be automatically indexed based on user interaction history.
20 Claims
1. A non-transitory computer-readable medium or media comprising one or more sequences of instructions which, when executed by one or more processors, causes steps to be performed comprising:
obtaining a first set of criteria for identifying a document characteristic in a document file comprising unstructured data, wherein each criterion in the first set of criteria comprises one or more conditions and is associated with a document characteristic, the first set of criteria being from a first source;
obtaining a second set of criteria for identifying a document characteristic in a document file comprising unstructured data, wherein each criterion in the second set of criteria comprises one or more conditions and is associated with a document characteristic, the second set of criteria being from a second source; and
comparing the first and second sets of criteria to generate a set of match criteria for use in identifying one or more document characteristics for a document file comprising unstructured data, wherein each criterion in the set of match criteria comprises one or more conditions and is associated with a document characteristic.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
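The comparing step of claim 1 can be illustrated with a minimal sketch. The claim does not state how the two sets of criteria are compared, so the merge policy below is purely an assumption: criteria from the two sources that name the same document characteristic are combined by keeping only the conditions they share. The names `Criterion` and `generate_match_criteria`, and the sample conditions, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One criterion: one or more conditions tied to a document characteristic."""
    conditions: frozenset  # e.g. key words expected in the document (assumed form)
    characteristic: str    # e.g. "type:invoice" (assumed label format)


def generate_match_criteria(first, second):
    """Compare two sets of criteria and return a set of match criteria.

    Assumed policy (not taken from the claim): when a criterion from each
    source names the same characteristic, keep the conditions common to both,
    and drop the pair if no condition is shared.
    """
    match_criteria = set()
    for a in first:
        for b in second:
            if a.characteristic != b.characteristic:
                continue
            shared = a.conditions & b.conditions
            if shared:
                match_criteria.add(Criterion(shared, a.characteristic))
    return match_criteria


# Hypothetical criteria from two sources.
first_source = {
    Criterion(frozenset({"invoice", "amount due"}), "type:invoice"),
}
second_source = {
    Criterion(frozenset({"invoice", "remit to"}), "type:invoice"),
    Criterion(frozenset({"radiology"}), "type:imaging_report"),
}
match_criteria = generate_match_criteria(first_source, second_source)
```

Other policies, such as taking the union of conditions or weighting criteria by source, would fit the claim language equally well; the intersection is used here only because it is the simplest to follow.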
12. A processor-implemented method for indexing a document file comprising:
receiving a document file, wherein the document file comprises a plurality of unstructured characters;
organizing the plurality of unstructured characters into an array of strings;
receiving at least a portion of a reference database from a client, wherein the reference database comprises a plurality of records wherein each record comprises at least one data field element;
comparing a first set of strings from the array of strings against a comparison reference database obtained from the reference database; and
responsive to at least a portion of the first set of strings exceeding a threshold match with at least a portion of a record in the comparison reference database, generating a structured message that associates the document file with the record.
Dependent claims: 13, 14, 15
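The steps of claim 12 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the tokenizer, the scoring rule (fraction of a record's fields whose tokens all appear in the document), the 0.6 threshold, and the message layout are all hypothetical, as are the names `to_strings`, `field_matches`, and `index_document`.

```python
import re


def to_strings(raw_text):
    """Organize unstructured characters into an array of lowercase strings."""
    return re.findall(r"[a-z0-9]+", raw_text.lower())


def field_matches(value, strings):
    """True if every token of a field value appears among the document strings."""
    tokens = to_strings(str(value))
    return bool(tokens) and all(token in strings for token in tokens)


def index_document(document_id, raw_text, comparison_db, threshold=0.6):
    """Return a structured message linking the document to the first record
    whose match score exceeds the threshold, or None if no record does.

    Assumed score: matched fields divided by total fields in the record.
    """
    strings = set(to_strings(raw_text))
    for record in comparison_db:
        fields = record["fields"]
        if not fields:
            continue  # nothing to compare against
        matched = [name for name, value in fields.items()
                   if field_matches(value, strings)]
        score = len(matched) / len(fields)
        if score > threshold:
            return {
                "document_id": document_id,
                "record_id": record["record_id"],
                "matched_fields": matched,
                "score": score,
            }
    return None


# Hypothetical comparison reference database and document text.
comparison_db = [
    {"record_id": "r2", "fields": {"first_name": "John", "last_name": "Doe", "mrn": "77120"}},
    {"record_id": "r1", "fields": {"first_name": "Jane", "last_name": "Doe", "mrn": "48213"}},
]
message = index_document("doc-001", "Patient: Jane DOE  MRN 48213 seen on 2014-03-02", comparison_db)
```

Exact token matching keeps the sketch short; a tolerant comparison (for example, edit distance) could be substituted for OCR-derived text without changing the overall flow.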
16. A processor-implemented method for identifying a document characteristic comprising:
receiving, from a plurality of sources, a plurality of features for use in identifying one or more document characteristics of document files comprising unstructured data, wherein each feature comprises one or more elements and each feature is associated with a document characteristic;
generating, from the plurality of features, a set of features and their associated document characteristics for use in identifying one or more characteristics of a document file comprising unstructured data;
receiving a document file comprising unstructured data;
comparing at least some of the features from the set of features with the document file comprising unstructured data; and
responsive to a feature exceeding a threshold match with data in the document file, attributing the document characteristic associated with the matching feature to the document file.
Dependent claims: 17, 18, 19, 20
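A minimal sketch of the flow in claim 16 follows, under assumptions that the claim itself does not make: features are modeled as lists of key phrases, the set of features is generated by removing duplicates across sources, and a feature's match score is the fraction of its elements found in the document text. The function names, sample features, and the 0.5 threshold are hypothetical.

```python
def build_feature_set(*sources):
    """Merge (elements, characteristic) pairs from several sources.

    Assumed policy: features with the same elements and the same
    characteristic are treated as duplicates and kept once.
    """
    merged = {}
    for source in sources:
        for elements, characteristic in source:
            key = (frozenset(e.lower() for e in elements), characteristic)
            merged[key] = None  # dict preserves first-seen order
    return list(merged)


def attribute_characteristics(text, feature_set, threshold=0.5):
    """Return characteristics whose feature score exceeds the threshold.

    Assumed score: share of a feature's elements that occur in the text.
    """
    lowered = text.lower()
    attributed = []
    for elements, characteristic in feature_set:
        if not elements:
            continue  # a feature needs at least one element
        score = sum(1 for e in elements if e in lowered) / len(elements)
        if score > threshold and characteristic not in attributed:
            attributed.append(characteristic)
    return attributed


# Hypothetical features received from two sources.
source_a = [(["radiology", "impression", "findings"], "type:imaging_report")]
source_b = [
    (["impression", "findings", "radiology"], "type:imaging_report"),
    (["invoice", "amount due"], "type:invoice"),
]
feature_set = build_feature_set(source_a, source_b)
characteristics = attribute_characteristics(
    "RADIOLOGY REPORT. Findings: no acute abnormality.", feature_set
)
```

In this example two of the three imaging-report elements appear in the text, so that characteristic is attributed, while the invoice feature matches nothing and is ignored.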
Specification