Automated identification of verbal records using boosted classifiers to improve a textual transcript
First Claim
1. A computerized method for automated identification of audio records to improve a textual transcript, the method comprising the steps of:
selecting a plurality of audio records from a database, each of the plurality of audio records comprising supporting information;
obtaining a first speech recognition output of each of the plurality of audio records using a pooled language model;
obtaining a second speech recognition output of each of the plurality of audio records using a balanced language model;
processing each of the audio records into a feature vector comprising the first speech recognition output, the second speech recognition output, and a plurality of audio record features selected from a group consisting of a number of silences, a noise threshold, a number of silences per second, a total duration of silence per second, an average amplitude in the audio record, a standard deviation of the amplitude in the audio record, a number of long silences per second, and a total duration of long silences per second;
creating a plurality of basic classifiers based on the feature vector and selected from a group consisting of language classifiers, decision tree classifiers, and k-nearest neighbor classifiers;
evaluating each of the plurality of basic classifiers;
creating a plurality of boosted classifiers, each of the plurality of boosted classifiers being a combination of the basic classifiers;
testing the performance of each of the boosted classifiers on a test set of training vectors, and determining which of the boosted classifiers performed the best;
adding one of the plurality of basic classifiers to the boosted classifier, based on a first vector weight;
testing the performance of the boosted classifiers;
adjusting the first vector weight, testing the performance of each of the boosted classifiers on the test set of training vectors, and determining which of the boosted classifiers performs the best;
selecting a best boosted classifier; and
saving the best boosted classifier and supporting structures.
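The add-test-reweight loop recited in the claim can be sketched as a greedy ensemble search. This is a minimal illustration, not the patent's implementation: decision stumps stand in for the claimed basic classifiers, and the candidate-selection and weight-adjustment strategy is an assumption about how the claimed steps fit together.

```python
# Hypothetical sketch of the claimed boosting loop: evaluate basic
# classifiers, greedily add the one that most improves a weighted-vote
# ensemble on a test set of training vectors, and adjust its vector weight.

def stump(feature_idx, threshold):
    """A basic classifier: thresholds one feature (e.g. silences per second)."""
    return lambda x: 1 if x[feature_idx] > threshold else -1

def accuracy(clf, data):
    """Fraction of (vector, label) pairs the classifier gets right."""
    return sum(1 for x, y in data if clf(x) == y) / len(data)

def boosted(ensemble):
    """Combine (weight, classifier) pairs into a weighted majority vote."""
    def predict(x):
        s = sum(w * c(x) for w, c in ensemble)
        return 1 if s >= 0 else -1
    return predict

def build_boosted(basics, test_set, rounds=3):
    """Greedily grow the boosted classifier, keeping the best performer."""
    ensemble = []
    for _ in range(rounds):
        # test each candidate addition on the test set; keep the best
        best = max(basics,
                   key=lambda c: accuracy(boosted(ensemble + [(1.0, c)]), test_set))
        # adjust the vector weight over a small grid, again keeping the best
        w = max((0.5, 1.0, 2.0),
                key=lambda w: accuracy(boosted(ensemble + [(w, best)]), test_set))
        ensemble.append((w, best))
    return boosted(ensemble)
```

A usage example: with vectors of (silences per second, average amplitude) and stumps on each feature, `build_boosted` selects the stump that separates the labeled records and saves it (with its weight) as the best boosted classifier.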
Abstract
In at least one exemplary embodiment of an automated document identification and language dictation recognition system, the system comprises a database capable of receiving a plurality of verbal records, each verbal record comprising at least one identifier and at least one verbal feature, and a processor operably coupled to the database, where the processor has and executes a software program. The processor is operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset, process the subset using the analyzed feature according to at least one reasoning approach, generate a processed verbal record from the processed subset, and deliver the processed verbal record to a recipient. The processor is further operational to extract features for a pool of training documents, turn each transcription job into a feature vector usable by a traditional classifier, create classifiers with different parameters in order to explore the best possible strategy, evaluate the performance of all classifiers, create a boosting classifier, calculate performance statistics, and operate the automatic document identifier for all documents.
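The abstract's step of turning each transcription job into a feature vector usable by a traditional classifier might look like the following sketch. The word-agreement metric between the two recognizer outputs and the field layout are illustrative assumptions, not the patent's definitions.

```python
# Hypothetical sketch: build one flat feature vector per transcription job
# from two speech-recognition outputs (pooled vs. balanced language model)
# plus precomputed audio-derived features.

def word_agreement(hyp_a, hyp_b):
    """Fraction of distinct words shared by the two recognizer outputs."""
    a, b = set(hyp_a.split()), set(hyp_b.split())
    return len(a & b) / max(len(a | b), 1)

def job_to_vector(pooled_text, balanced_text, audio_feats):
    """Flatten one job into a vector a traditional classifier can consume."""
    return [
        len(pooled_text.split()),            # length of pooled-LM output
        len(balanced_text.split()),          # length of balanced-LM output
        word_agreement(pooled_text, balanced_text),
        *audio_feats,                        # e.g. silence and amplitude stats
    ]
```

High agreement between the two language models is one plausible signal that a transcript is reliable; low agreement flags records for review.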
16 Claims
1. A computerized method for automated identification of audio records to improve a textual transcript (independent claim; recited in full above). Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system for automated identification of audio records to improve a textual transcript, the system comprising:
a database capable of receiving a plurality of audio records, each of the plurality of audio records comprising supporting information;
a processor operably coupled to the database and configured to:
obtain a first speech recognition output of each of the plurality of audio records using a pooled language model;
obtain a second speech recognition output of each of the plurality of audio records using a balanced language model;
process each of the audio records into a feature vector comprising the first speech recognition output, the second speech recognition output, and a plurality of audio record features selected from a group consisting of a type of device used to dictate the audio record, a day the audio record was submitted, a time the audio record was submitted, a duration of the audio record, a number of silences, a noise threshold, a number of silences per second, a total duration of silence per second, an average amplitude in the audio record, a standard deviation of the amplitude in the audio record, a number of long silences per second, and a total duration of long silences per second;
create a plurality of basic classifiers based at least in part on the feature vector;
evaluate each of the plurality of basic classifiers;
create a plurality of boosted classifiers, each of the plurality of boosted classifiers being a combination of the basic classifiers;
test the performance of each of the boosted classifiers on a test set of training vectors, and determine which of the boosted classifiers performed the best;
add one of the plurality of basic classifiers to the boosted classifier, based on a first vector weight;
test the performance of the boosted classifiers;
adjust the first vector weight, test the performance of each of the boosted classifiers on the test set of training vectors, and determine which of the boosted classifiers performs the best;
select a best boosted classifier; and
save the best boosted classifier and supporting structures. Dependent claims: 10, 11, 12, 13, 14, 15, 16
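The signal-derived features named in claim 9 (silence counts and durations, amplitude mean and standard deviation, long silences) can be computed from per-frame amplitudes as sketched below. The frame rate, noise threshold, and "long silence" cutoff are illustrative assumptions; the claim names the features but not their parameter values.

```python
# Hypothetical sketch of the claim's audio-record features, computed from a
# list of per-frame amplitude values. A "silence" is a run of consecutive
# frames below the noise threshold; "long" silences exceed an assumed length.

from statistics import mean, pstdev

def audio_features(amplitudes, frame_rate=100, noise_threshold=0.05,
                   long_silence_frames=50):
    """Return the signal-derived features of claim 9 as a dict."""
    # group consecutive sub-threshold frames into silence runs
    runs, current = [], 0
    for a in amplitudes:
        if a < noise_threshold:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    duration_sec = len(amplitudes) / frame_rate
    long_runs = [r for r in runs if r >= long_silence_frames]
    return {
        "num_silences": len(runs),
        "silences_per_sec": len(runs) / duration_sec,
        "silence_duration_per_sec": sum(runs) / frame_rate / duration_sec,
        "avg_amplitude": mean(amplitudes),
        "std_amplitude": pstdev(amplitudes),
        "long_silences_per_sec": len(long_runs) / duration_sec,
        "long_silence_duration_per_sec": sum(long_runs) / frame_rate / duration_sec,
    }
```

The remaining features in the claim's group (device type, submission day and time, record duration) come from the record's supporting information rather than the signal itself.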
Specification