System and method for classification of documents
Abstract
The invention provides a classification engine for classifying documents that makes use of functions included in a similarity search engine. The classification engine executes a classify command from a client using similarity search results, rules files, classes files, and a classification profile embedded in the classify command. When the classification engine receives a classify command from a client, it retrieves a classification profile and the input documents to be classified, sends values extracted from the input documents based on anchor values to an XML transformation engine to obtain a search schema, requests a similarity search by a search manager to determine the similarity between input documents and anchor values, and classifies the input documents according to the rules files, classes files, and the classification profile. The client is then notified that the classify command has been completed, and the classification results are stored in a database.
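The extraction and structuring steps described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: that the anchor document is an XML skeleton whose leaf elements name the fields to pull from each input document, and that "structuring" means re-emitting those values as a small XML search document. The element names (`claim`, `claimant`, `name`, `city`) and the functions `leaf_paths`, `extract_input_values`, and `structure_values` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical anchor document: its leaf elements name the fields of interest.
ANCHOR_XML = "<claim><claimant><name/><city/></claimant></claim>"

# Hypothetical input document; <id> and <phone> are not named by the anchor.
INPUT_XML = (
    "<claim><id>17</id><claimant><name>John Smith</name>"
    "<city>Austin</city><phone>555-0100</phone></claimant></claim>"
)


def leaf_paths(elem, prefix=""):
    """Return the path (relative to elem) of every leaf element below elem."""
    paths = []
    for child in elem:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child) == 0:
            paths.append(path)
        else:
            paths.extend(leaf_paths(child, path))
    return paths


def extract_input_values(anchor_xml, input_xml):
    """Pull from the input document only the values the anchor document names."""
    anchor = ET.fromstring(anchor_xml)
    doc = ET.fromstring(input_xml)
    values = {}
    for path in leaf_paths(anchor):
        node = doc.find(path)
        if node is not None and node.text:
            values[path] = node.text
    return values


def structure_values(values, root_tag):
    """Re-emit the extracted values as an XML search document (assumed shape)."""
    root = ET.Element(root_tag)
    for path, text in values.items():
        parent = root
        for tag in path.split("/"):
            existing = parent.find(tag)
            parent = existing if existing is not None else ET.SubElement(parent, tag)
        parent.text = text
    return ET.tostring(root, encoding="unicode")


values = extract_input_values(ANCHOR_XML, INPUT_XML)
search_doc = structure_values(values, "claim")
```

Here the anchor document acts as a filter: fields of the input document that it does not name never reach the search criteria.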
30 Claims
1. A method for classification of documents, comprising the steps of:
receiving a classify command from a client for initiating a classification of documents, the classify command identifying input documents to be classified, a classification profile, and an anchor document containing anchor values for identifying input document values to be used as a search criteria;
retrieving the classification profile and input documents;
extracting input values from each input document based on the anchor values;
structuring the input values according to a source schema identified in the classification profile;
performing similarity searches to determine similarity scores between one or more target database documents and the input values of each input document;
performing external analysis of the input documents to determine external analytic scores between one or more target database documents and the input values of each input document;
applying rules to the similarity scores and external analytic scores to determine rules results;
classifying the input documents into classes based on the classification profile, external analytic scores, similarity scores and rules results identified in the classification profile; and
sending the client a response at the completion of the classify command.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
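The scoring and rule steps of claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented engine: `difflib.SequenceMatcher` stands in for the similarity search engine's own measure, the external analytic scores are supplied as plain numbers, and the rule form (both scores must reach a threshold) and the class names `MATCH` and `NO_MATCH` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity measure in [0, 1]; not the engine's own measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_scores(input_values, targets):
    """Average per-field similarity between the input values and each target."""
    scores = {}
    for doc_id, target in targets.items():
        per_field = [similarity(v, target.get(k, "")) for k, v in input_values.items()]
        scores[doc_id] = sum(per_field) / len(per_field) if per_field else 0.0
    return scores


def apply_rule(sim_scores, ext_scores, min_sim, min_ext):
    """Hypothetical rule: keep targets whose two scores both reach a threshold."""
    return [
        doc_id
        for doc_id in sim_scores
        if sim_scores[doc_id] >= min_sim and ext_scores.get(doc_id, 0.0) >= min_ext
    ]


def classify(rule_results):
    """Hypothetical two-class outcome derived from the rule results."""
    return "MATCH" if rule_results else "NO_MATCH"


input_values = {"name": "John Smith", "city": "Austin"}
targets = {
    "T1": {"name": "John Smith", "city": "Austin"},
    "T2": {"name": "Mary Jones", "city": "Boston"},
}
ext_scores = {"T1": 0.9, "T2": 0.9}  # assumed output of an external analytic

sim = similarity_scores(input_values, targets)
rule_results = apply_rule(sim, ext_scores, min_sim=0.8, min_ext=0.5)
label = classify(rule_results)
```

Keeping the rule separate from the scoring functions mirrors the claim's ordering: scores are computed first, rules are applied to them, and the class follows from the rule results.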
16. A system for classification of documents, comprising:
a classification engine for receiving a classify command from a client for initiating a classification of documents, the classify command identifying input documents to be classified, a classification profile, and an anchor document containing anchor values for identifying input document values to be used as a search criteria;
the classification engine for retrieving the classification profile and input documents from a virtual document manager;
the classification engine for extracting input values from each input document based on the anchor values;
an XML transformation engine for structuring the input values according to a source schema identified in the classification profile;
a search manager for performing similarity searches to determine similarity scores between one or more target database documents and the input values of each input document;
external analytics for performing external analysis of the input documents to determine external analytic scores between one or more target database documents and the input values of each input document;
the classification engine for applying rules to the similarity scores and external analytic scores to determine rules results;
the classification engine for classifying the input documents into classes based on the classification profile, external analytic scores, similarity scores and rules results identified in the classification profile; and
means for sending the client a response at the completion of the classify command.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
29. A method for classification of documents, comprising:
receiving a classify command from a client, the classify command designating input document elements for names and search schema, anchor document structure, external analytics and values to be used as classification filters, and a classification profile;
retrieving the designated classification profile, the classification profile designating classes files for name, rank and score thresholds, rules files for nested conditions, properties, schema mapping, score threshold ranges and number of required documents, and class rules maps for class identification, class type, rule identification, description, property, score threshold ranges and document count;
retrieving the designated input documents;
identifying a schema mapping file for each input document;
determining a degree of similarity between each input document and one or more anchor documents;
determining analytic scores for each input document;
classifying the input documents according to the designated classes files, analytic scores and rules files;
creating and storing a classification results file in a database; and
notifying the client of completion of the classify command.
Dependent claims: 30
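The profile contents named in claim 29 (classes with a name and rank, rules with a score threshold range and a number of required documents, and a map from classes to rules) can be modelled with a few small records. The sketch below is a hypothetical reading of the claim language: the field names, the convention that a lower rank takes precedence, and the class names `RED` and `YELLOW` are assumptions, and the nested conditions, properties, and schema mapping are omitted.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    low: float          # inclusive lower bound of the score threshold range
    high: float         # inclusive upper bound of the score threshold range
    required_docs: int  # number of documents that must score inside the range


@dataclass
class ClassDef:
    name: str
    rank: int     # assumption: a lower rank takes precedence
    rule_id: str  # plays the role of a class rules map entry


def rule_satisfied(rule, scores):
    """True when enough scores fall inside the rule's threshold range."""
    hits = sum(1 for s in scores if rule.low <= s <= rule.high)
    return hits >= rule.required_docs


def classify(scores, classes, rules):
    """Return the best-ranked class whose rule is satisfied, else None."""
    for cls in sorted(classes, key=lambda c: c.rank):
        if rule_satisfied(rules[cls.rule_id], scores):
            return cls.name
    return None


rules = {
    "R_HIGH": Rule("R_HIGH", 0.90, 1.00, required_docs=1),
    "R_MED": Rule("R_MED", 0.70, 0.90, required_docs=2),
}
classes = [ClassDef("YELLOW", 2, "R_MED"), ClassDef("RED", 1, "R_HIGH")]

one_strong_match = classify([0.95, 0.40], classes, rules)
two_medium_matches = classify([0.75, 0.80, 0.20], classes, rules)
one_medium_match = classify([0.75, 0.20], classes, rules)
```

With these values, a single score in the high range is enough for `RED`, whereas `YELLOW` needs two scores in the medium range, and an input that satisfies neither rule is left unclassified.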
Specification