System and method for classification of documents
Abstract
The invention provides a classification engine for classifying documents that makes use of functions included in a similarity search engine. The classification engine executes a classify command from a client that makes use of similarity search results, and rules files, classes files, and a classification profile embedded in the classification command. When the classification engine receives a classify command from a client, it retrieves a classification profile and input documents to be classified, sends values extracted from the input documents based on anchor values to an XML transformation engine to obtain a search schema, requests a similarity search by a search manager to determine the similarity between input documents and anchor values, and classifies the input documents according to the rules files, classes files, and the classification profile. The client is then notified that the classify command has been completed, and the classification results are stored in a database.
33 Claims
1. A method for classification of documents, comprising the steps of:
receiving a classify instruction from a client for initiating a classification of documents, the classify instruction identifying input documents to be classified, a classification profile, and anchor values;
retrieving the classification profile and input documents;
extracting input values from each input document based on the anchor values;
structuring the input values according to a search schema identified in the classification profile;
performing similarity searches for determining similarity scores between each database document and each input document;
performing external analysis of the database documents for determining external analytic scores;
classifying the database documents based on profile, external analytic scores and the similarity scores using classes and rules identified in the classification profile; and
notifying the client of completion of the classify command.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
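The steps of claim 1 can be read as a small pipeline. The following is only an illustrative sketch under stated assumptions, not the patented implementation: the names `ClassifyInstruction`, `Profile`, and `run_classification` are hypothetical, anchor values are reduced to a list of field names to extract, the similarity search is replaced by a simple string-ratio stand-in from the Python standard library, and the similarity and external analytic scores are combined by a plain average.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable


@dataclass
class Profile:
    # Hypothetical profile: field order of the search schema plus
    # (class name, minimum combined score) pairs, highest threshold first.
    schema_fields: list[str]
    classes: list[tuple[str, float]]


@dataclass
class ClassifyInstruction:
    # Hypothetical instruction: identifies input documents, a profile, anchors.
    input_ids: list[str]
    profile_id: str
    anchor_fields: list[str]  # assumption: anchors name the fields to extract


def similarity(query: dict, candidate: dict) -> float:
    """Stand-in similarity: mean per-field string ratio in [0, 1]."""
    if not query:
        return 0.0
    ratios = [
        SequenceMatcher(None, str(value), str(candidate.get(name, ""))).ratio()
        for name, value in query.items()
    ]
    return sum(ratios) / len(ratios)


def run_classification(
    instruction: ClassifyInstruction,
    profiles: dict[str, Profile],
    input_store: dict[str, dict],
    database_docs: dict[str, dict],
    external_score: Callable[[dict], float],
    notify: Callable[[str], None],
) -> dict[tuple[str, str], str]:
    # Retrieve the classification profile and the input documents.
    profile = profiles[instruction.profile_id]
    results: dict[tuple[str, str], str] = {}
    for input_id in instruction.input_ids:
        document = input_store[input_id]
        # Extract input values based on the anchors.
        values = {f: document.get(f, "") for f in instruction.anchor_fields}
        # Structure the values according to the search schema.
        query = {f: values.get(f, "") for f in profile.schema_fields}
        for db_id, db_doc in database_docs.items():
            # Similarity score and external analytic score per database document.
            combined = (similarity(query, db_doc) + external_score(db_doc)) / 2
            # Classify with the first class whose threshold is met.
            label = next(
                (name for name, minimum in profile.classes if combined >= minimum),
                "unclassified",
            )
            results[(input_id, db_id)] = label
    # Notify the client that the classify command has completed.
    notify("classify complete")
    return results
```

In this sketch each (input document, database document) pair receives one class label; how the claimed method actually combines the two kinds of score is governed by the classes and rules in the classification profile, which the claim does not spell out.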
16. A system for classification of documents, comprising:
a classification engine for receiving a classify instruction from a client for initiating a classification of documents, the classify instruction identifying input documents to be classified, a classification profile, and anchor values;
the classification engine for retrieving the classification profile and input documents from a virtual document manager;
the classification engine for extracting input values from each input document based on the anchor values;
an XML transformation engine for structuring the input values according to a search schema identified in the classification profile;
a search manager for performing similarity searches for determining similarity scores between each database document and each input document;
external analytics for performing external analysis of the database documents for determining external analytic scores;
the classification engine for classifying the database documents based on profile, external analytic scores and the similarity scores using classes and rules identified in the classification profile; and
means for notifying the client of completion of the classify command.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28
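Claim 16 assigns the structuring step to an XML transformation engine. As a rough illustration of what "structuring the input values according to a search schema" could look like, the sketch below arranges extracted values into an XML fragment. It assumes, purely for illustration, that the schema is a flat, ordered list of element names; the function name `structure_values` and the root tag `query` are hypothetical and do not come from the patent.

```python
import xml.etree.ElementTree as ET


def structure_values(values: dict[str, str], schema: list[str],
                     root_tag: str = "query") -> str:
    """Arrange extracted values as XML, one child element per schema field.

    Fields missing from `values` become empty elements; values that are
    not named in the schema are dropped.
    """
    root = ET.Element(root_tag)
    for field in schema:
        child = ET.SubElement(root, field)
        child.text = values.get(field, "")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    extracted = {"name": "Ann", "zip": "78701", "extra": "ignored"}
    print(structure_values(extracted, ["name", "zip", "city"]))
```

A real search schema would likely be hierarchical and carry type information; the flat list here only shows the ordering and filtering effect of applying a schema to extracted values.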
29. A system for classification of documents comprising:
a classification engine for accepting a classify command from a client, retrieving a classification profile, classifying documents based on external analytic scores, similarity scores, rules and classes, storing document classification results in a database, and notifying the client of completion of the classify command;
a virtual document manager for providing input documents;
an XML transformation engine for structuring the input values according to a search schema identified in the classification profile;
a search manager for performing similarity searches for determining similarity scores between each database document and each input document; and
external analytics for determining external analytic scores.
Dependent claims: 30, 31
32. A method for classification of documents, comprising:
receiving a classify command from a client, the classify command designating input document elements for names and search schema, anchor document structure, external analytics and values to be used as classification filters, and a classification profile;
retrieving the designated classification profile, the classification profile designating classes files for name, rank and score thresholds, rules files for nested conditions, properties, schema mapping, score threshold ranges and number of required documents, and class rules maps for class identification, class type, rule identification, description, property, score threshold ranges and document count;
retrieving the designated search documents;
identifying a schema mapping file for each input document;
determining a degree of similarity between each input document and anchor document;
determining analytic scores for each input document;
classifying the input documents according to the designated classes files, analytic scores and rules files;
creating and storing a classification results file in a database; and
notifying the client of completion of the classify command.
Dependent claim: 33
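Claim 32 describes classes files (name, rank, score thresholds), rules files (score threshold ranges, number of required documents), and class rules maps that tie the two together. One possible reading is sketched below. Everything in it is an assumption made for illustration: the type names `ClassDef` and `Rule`, the convention that a lower rank is evaluated first, and the idea that a rule fires when enough scores fall inside its threshold range.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassDef:
    name: str
    rank: int  # assumption: lower rank is evaluated first


@dataclass(frozen=True)
class Rule:
    rule_id: str
    low: float               # inclusive lower bound of the score range
    high: float              # inclusive upper bound of the score range
    required_documents: int  # how many scores must fall in the range

    def fires(self, scores: list[float]) -> bool:
        hits = sum(1 for s in scores if self.low <= s <= self.high)
        return hits >= self.required_documents


def classify(scores: list[float], classes: list[ClassDef], rules: list[Rule],
             class_rules_map: dict[str, str]) -> Optional[str]:
    """Return the first class, in rank order, whose mapped rule fires."""
    rules_by_id = {rule.rule_id: rule for rule in rules}
    for cls in sorted(classes, key=lambda c: c.rank):
        rule = rules_by_id[class_rules_map[cls.name]]
        if rule.fires(scores):
            return cls.name
    return None


if __name__ == "__main__":
    classes = [ClassDef("high", 1), ClassDef("medium", 2), ClassDef("low", 3)]
    rules = [
        Rule("r_high", 0.9, 1.0, required_documents=2),
        Rule("r_medium", 0.7, 1.0, required_documents=1),
        Rule("r_low", 0.0, 1.0, required_documents=0),
    ]
    mapping = {"high": "r_high", "medium": "r_medium", "low": "r_low"}
    print(classify([0.95, 0.60], classes, rules, mapping))
```

Here `scores` stands for the similarity scores obtained for a single input document; the nested conditions, properties, and schema mapping mentioned in the claim are left out of the sketch.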
Specification