AUTOMATED DOCUMENT IDENTIFICATION AND LANGUAGE DICTATION RECOGNITION SYSTEMS AND METHODS FOR USING THE SAME
First Claim
1. A computerized method for automated identification of verbal records to improve a textual transcript, the method comprising the steps of:
selecting a plurality of verbal records from a database, the each of the plurality of verbal records comprising supporting information;
processing the each of the verbal records into a feature vector comprising a plurality of verbal record feature vectors;
creating a plurality of basic classifiers;
evaluating the each of the plurality of basic classifiers;
creating a plurality of boosted classifiers, the each of the plurality of boosted classifiers being a combination of the each of the plurality of basic classifiers;
testing the performance of the each of the boosted classifiers on a test set of training vectors, and determining which of the each of the boosted classifiers performed the best;
adding one of the plurality of basic classifiers to the boosted classifier, based on a first vector weight;
testing the performance of the boosted classifiers;
adjusting the first vector weight and testing the performance of the each of the boosted classifiers on the test set of training vectors, and determining which of the each of the boosted classifiers performs the best;
selecting a best boosted classifier; and
saving the best boosted classifier and supporting structures.
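The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the threshold "stump" used as a basic classifier, the greedy search over candidate weights, the accuracy metric, and every function name here are assumptions made only for illustration.

```python
import pickle
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Stump = Tuple[int, float]            # hypothetical basic classifier: (feature index, threshold)
Boosted = List[Tuple[float, Stump]]  # boosted classifier: weighted combination of stumps


def stump_predict(stump: Stump, x: Vector) -> int:
    """Vote +1 when the chosen feature exceeds the threshold, otherwise -1."""
    index, threshold = stump
    return 1 if x[index] > threshold else -1


def boosted_predict(boosted: Boosted, x: Vector) -> int:
    """Combine the weighted votes of all member stumps."""
    score = sum(weight * stump_predict(stump, x) for weight, stump in boosted)
    return 1 if score >= 0 else -1


def accuracy(boosted: Boosted, vectors: List[Vector], labels: List[int]) -> float:
    """Fraction of test vectors that the boosted classifier labels correctly."""
    hits = sum(boosted_predict(boosted, x) == y for x, y in zip(vectors, labels))
    return hits / len(labels)


def build_best_boosted(stumps: List[Stump], vectors: List[Vector], labels: List[int],
                       weights=(0.5, 1.0, 2.0), rounds: int = 3):
    """Greedily add one stump per round, trying several weights (assumed strategy)."""
    best: Boosted = []
    best_acc = 0.0
    for _ in range(rounds):
        # Each candidate extends the current best with one stump at one weight.
        candidates = [best + [(w, s)] for s in stumps for w in weights]
        top = max(candidates, key=lambda b: accuracy(b, vectors, labels))
        top_acc = accuracy(top, vectors, labels)
        if top_acc <= best_acc:
            break  # no candidate improved on the current best
        best, best_acc = top, top_acc
    return best, best_acc


def save_boosted(boosted: Boosted, supporting: dict, fileobj) -> None:
    """Persist the selected classifier with its supporting structures."""
    pickle.dump({"boosted": boosted, "supporting": supporting}, fileobj)
```

Representing each basic classifier as plain data rather than as a closure keeps the saved object picklable, which is one simple way to approach the final "saving" step.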
Abstract
In at least one exemplary embodiment for automated document identification and language dictation recognition systems, the system comprises a database capable of receiving a plurality of verbal records, each verbal record comprising at least one identifier and at least one verbal feature, and a processor operably coupled to the database, where the processor has and executes a software program. The processor is operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset of the plurality of verbal records, process the subset of the plurality of records using the analyzed feature according to at least one reasoning approach, generate a processed verbal record using the processed subset of the plurality of records, and deliver the processed verbal record to a recipient. The processor is further operational to extract features for a pool of training documents, turn each transcription job into a feature vector which can be used by a traditional classifier, create classifiers with different parameters in order to explore the best possible strategy, evaluate the performance of all classifiers, create a boosting classifier, calculate performance statistics, and operate the automatic document identifier for all documents.
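As a rough illustration of turning a transcription job into a feature vector that a traditional classifier can consume, the sketch below uses normalized word counts over a fixed vocabulary. The abstract does not say which features are extracted, so the bag-of-words choice, the tokenizer, and the function names are all assumptions.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(documents: List[str], max_terms: int = 1000) -> List[str]:
    """Collect the most frequent terms across a pool of training documents."""
    counts = Counter()
    for text in documents:
        counts.update(tokenize(text))
    return [term for term, _ in counts.most_common(max_terms)]


def to_feature_vector(text: str, vocabulary: List[str]) -> List[float]:
    """Map one transcript to relative term frequencies in vocabulary order."""
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[term] / total for term in vocabulary]
```

Every transcript maps to a vector of the same length as the vocabulary, so the vectors for a whole training pool can be passed to any classifier that expects fixed-length numeric input.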
18 Claims
10. A system for automated identification of verbal records to improve a textual transcript, the system comprising:
a database capable of receiving a plurality of verbal records, the each of the plurality of verbal records comprising supporting information;
a processor operably coupled to the database, and configured to;
process the each of the verbal records into a feature vector comprising a plurality of verbal record feature vectors;
create a plurality of basic classifiers;
evaluate the each of the plurality of basic classifiers;
create a plurality of boosted classifiers, the each of the plurality of boosted classifiers being a combination of the each of the plurality of basic classifiers;
test the performance of the each of the boosted classifiers on a test set of training vectors, and determining which of the each of the boosted classifiers performed the best;
add one of the plurality of basic classifiers to the boosted classifier, based on a first vector weight;
test the performance of the boosted classifiers;
adjust the first vector weight and testing the performance of the each of the boosted classifiers on the test set of training vectors, and determining which of the each of the boosted classifiers performs the best;
select a best boosted classifier; and
save the best boosted classifier and supporting structures.
18. The system of claim 19, wherein the processor is further configured to take into account any changing weights of the each of the basic classifiers and the corresponding training vectors.
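One well-known way to account for changing weights of both the basic classifiers and the training vectors is an AdaBoost-style update. The claim does not name AdaBoost, so the sketch below is only an assumed example of such a scheme; the function name and the +1/-1 label convention are illustrative.

```python
import math
from typing import List, Tuple


def adaboost_round(predictions: List[int], labels: List[int],
                   sample_weights: List[float]) -> Tuple[float, List[float]]:
    """Return the classifier weight and the re-normalized training-vector weights.

    predictions and labels hold +1/-1 values; sample_weights must sum to 1.
    """
    # Weighted error: total weight of the vectors this classifier gets wrong.
    error = sum(w for p, y, w in zip(predictions, labels, sample_weights) if p != y)
    error = min(max(error, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    # Classifier weight: larger when the weighted error is smaller.
    alpha = 0.5 * math.log((1 - error) / error)
    # Raise the weight of misclassified vectors and lower the rest.
    updated = [w * math.exp(-alpha * y * p)
               for p, y, w in zip(predictions, labels, sample_weights)]
    total = sum(updated)
    return alpha, [w / total for w in updated]
```

After each round the misclassified vectors carry more weight, so the next basic classifier added to the combination is evaluated mainly on the cases the current one handles poorly.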
Specification