Common phrase identification and language dictation recognition systems and methods for using the same

US 10,102,860 B2
Filed: 06/15/2017
Issued: 10/16/2018
Est. Priority Date: 10/05/2010
Status: Active Grant

Abstract

In at least one exemplary embodiment for common phrase identification and language dictation recognition systems and methods for using the same, the system comprises a database capable of receiving a plurality of verbal records, each verbal record comprising at least one identifier and at least one verbal feature, and a processor operably coupled to the database, where the processor has and executes a software program. The processor is operational to identify a subset of the plurality of verbal records from the database, extract at least one verbal feature from the identified records, analyze the at least one verbal feature of the subset of the plurality of verbal records, process the subset of the plurality of records using the analyzed feature according to at least one reasoning approach, generate a processed verbal record using the processed subset of the plurality of records, and deliver the processed verbal record to a recipient. The processor is further operational to identify common phrases in parts of the verbal record, identify a body of work for building a set of common phrases, analyze documents in a training set to find common phrases, and replace phrases with the common phrases.

15 Citations

23 Claims

1. A computerized method for analyzing verbal records to improve a textual transcript, the method comprising the steps of:
- identifying a training set and a test set of transcribed verbal records, the training set comprising a first subset of a plurality of transcribed verbal records, and the test set comprising a different second subset of the plurality of transcribed verbal records;
  
  for the each transcribed verbal record in the training set, determine a plurality of possible common phrases comprising a plurality of sequences of words appearing in the each verbal record in the training set, the each of the plurality of possible common phrases further having a minimum word length;
  
  for each of the plurality of possible common phrases, determine a best parameter;
  
  for each of the plurality of possible common phrases, finding a phrase accuracy based at least in part on a test for false positives;
  
  saving the best parameter for the each of the plurality of possible common phrases; and
  
  applying the each of the plurality of possible common phrases to the transcribed verbal records, using the phrase accuracy, to create the textual transcript.
  - 2. The method of claim 1, wherein the plurality of sequences of words appears in a minimum percentage of the each verbal record in the training set.
  - 3. The method of claim 2, wherein the sequences of words is at least five words long.
  - 4. The method of claim 1, wherein the step of determining a plurality of possible common phrases further comprises finding alternative phrases for each of the plurality of possible common phrases.
  - 5. The method of claim 4, wherein finding alternative phrases further comprises comparing a first possible common phrase and a second possible common phrase to determine if there is overlap therebetween.
  - 6. The method of claim 1, wherein the test for false positives comprises finding a soft match for each of the plurality of common phrases, the soft match further comprising a confidence score.
  - 7. The method of claim 6, wherein finding a soft match further comprises testing out each of the plurality of common phrases against each of the transcribed verbal records in the training set.
  - 8. The method of claim 1, wherein determining a best parameter comprises:
    - selecting a plurality of trial parameters, the plurality of trial parameters selected from a group consisting of maximum words, minimum percentage, and false positive rate;
      
      modifying the each of the transcribed verbal records in the test set using the each of the plurality of common phrases;
      
      calculating the performance of the each of the plurality of common phrases based on the each of the plurality of trial parameters; and
      
      selecting a best parameter, the best parameter being the trial parameter with the best performance.
  - 9. The method of claim 8, wherein calculating the performance comprises finding the number of transcription errors in the each of the transcribed verbal records in the test set.
  - 10. The method of claim 1, wherein saving the best parameter comprises saving the each of the plurality of common phrases with the corresponding best parameter.
  - 11. The method of claim 1, wherein creating the textual transcript comprises:
    - selecting a plurality of verbal records from a database, wherein each verbal record in the plurality of verbal records comprises a combination of at least one identifier, at least one text feature, and at least one phonetic feature;
      
      identifying a subset of the plurality of verbal records from the database;
      
      modifying the identified subset of the plurality of verbal records to create a modified verbal record, wherein the system uses the at least one analyzed phonetic feature, the at least one text feature, and the each of the plurality of common phrases with the corresponding best parameter, or a combination thereof, to generate the modified verbal record; and
      
      delivering the modified verbal record to a recipient.
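
As an illustrative reading of claims 1 to 3, the sketch below splits a set of transcripts into training and test subsets and collects word sequences of a minimum length that appear in at least a minimum percentage of the training records. The function names (`split_records`, `ngrams`, `candidate_phrases`), the whitespace tokenization, the fixed five-word window, the 80/20 split, and the 50% threshold are assumptions made for illustration; the claims do not specify any of them.

```python
from collections import Counter


def split_records(records, train_fraction=0.8):
    """Split transcripts into disjoint training and test subsets (assumed ratio)."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


def ngrams(words, n):
    """Yield every contiguous n-word sequence as a tuple."""
    for i in range(len(words) - n + 1):
        yield tuple(words[i:i + n])


def candidate_phrases(training_set, min_words=5, min_fraction=0.5):
    """Return n-word sequences found in at least min_fraction of the records."""
    doc_counts = Counter()
    for text in training_set:
        words = text.lower().split()
        # Count each phrase once per record, not once per occurrence.
        doc_counts.update(set(ngrams(words, min_words)))
    threshold = min_fraction * len(training_set)
    return {" ".join(p) for p, c in doc_counts.items() if c >= threshold}


# Hypothetical transcripts, invented for this example.
records = [
    "the patient was seen and examined by me today",
    "the patient was seen and examined in clinic",
    "follow up in two weeks with primary care",
    "the patient was seen and examined by the resident",
    "no acute distress noted on exam",
]
train, test = split_records(records)
phrases = candidate_phrases(train)
```

Counting each phrase once per record keeps a single repetitive transcript from pushing a phrase over the threshold, which seems closer to the "minimum percentage of the each verbal record" language than a raw occurrence count.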

12. A system for analyzing verbal records to improve a textual transcript, the system comprising:
- a database configured to receive a plurality of transcribed verbal records;
  
  a processor operably connected to the database, and configured to:
  
  identify a training set and a test set of the transcribed verbal records, the training set comprising a first subset of the plurality of transcribed verbal records, and the test set comprising a different second subset of the plurality of transcribed verbal records;
  
  determine a plurality of possible common phrases for the each verbal record in the training set, the plurality of possible common phrases comprising a plurality of sequences of words appearing in the each verbal record in the training set, the each of the plurality of possible common phrases further having a minimum word length;
  
  determine a best parameter for each of the plurality of possible common phrases;
  
  determine a phrase accuracy based at least in part on a test for false positives;
  
  save the best parameter for the each of the plurality of possible common phrases; and
  
  apply the each of the plurality of possible common phrases to the transcribed verbal records, using the phrase accuracy to create the textual transcript.
  - 13. The system of claim 12, wherein the processor is further configured to find a plurality of sequences of words appearing in a minimum percentage of the each verbal record in the training set.
  - 14. The system of claim 13, wherein the sequences of words is at least five words long.
  - 15. The system of claim 12, wherein the processor is further configured to find alternative phrases for each of the plurality of possible common phrases.
  - 16. The system of claim 15, wherein the processor is further configured to compare a first possible common phrase and a second possible common phrase to determine if there is overlap therebetween.
  - 17. The system of claim 12, wherein the processor is further configured to perform a test for false positives for each of the plurality of possible common phrases.
  - 18. The system of claim 17, wherein the processor is further configured to find a soft match for each of the plurality of common phrases.
  - 19. The system of claim 18, wherein the processor is further configured to test out each of the plurality of common phrases against each of the transcribed verbal records in the training set.
  - 20. The system of claim 12, wherein the processor is further configured to:
    - select a plurality of trial parameters, the plurality of trial parameters selected from a group consisting of maximum words, minimum percentage, and false positive rate;
      
      modify the each of the transcribed verbal records in the test set using the each of the plurality of common phrases;
      
      calculate the performance of the each of the plurality of common phrases based on the each of the plurality of trial parameters; and
      
      select a best parameter, the best parameter being the trial parameter with the best performance.
  - 21. The system of claim 20, wherein the processor is further configured to find the number of transcription errors in the each of the transcribed verbal records in the test set.
  - 22. The system of claim 12, wherein the processor is further configured to save the each of the plurality of common phrases with the corresponding best parameter.
  - 23. The system of claim 12, wherein the processor is further configured to:
    - select a plurality of verbal records from a database, wherein each verbal record in the plurality of verbal records comprises a combination of at least one identifier, at least one text feature, and at least one phonetic feature;
      
      identify a subset of the plurality of verbal records from the database;
      
      modify the identified subset of the plurality of verbal records to create a modified verbal record, wherein the system uses the at least one analyzed phonetic feature, the at least one text feature, and the each of the plurality of common phrases with the corresponding best parameter, or a combination thereof, to generate the modified verbal record; and
      
      deliver the modified verbal record to a recipient.
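
One way to picture the soft match, false-positive test, and trial-parameter selection of claims 17 to 21 is the sketch below. It is not the patented implementation: the character-level similarity from Python's `difflib.SequenceMatcher` stands in for the unspecified confidence score, the single trial parameter is a hypothetical confidence threshold (the claims name maximum words, minimum percentage, and false positive rate), and the position-by-position `word_errors` count is a crude stand-in for a transcription error measure. All function names and sample sentences are invented for illustration.

```python
from difflib import SequenceMatcher


def soft_match(phrase, text):
    """Return (best_window, confidence) over word windows as long as the phrase."""
    n = len(phrase.split())
    t_words = text.split()
    best, best_score = "", 0.0
    for i in range(max(len(t_words) - n + 1, 1)):
        window = " ".join(t_words[i:i + n])
        score = SequenceMatcher(None, phrase, window).ratio()
        if score > best_score:
            best, best_score = window, score
    return best, best_score


def apply_phrase(phrase, text, min_confidence):
    """Replace the best near-match with the phrase if it clears the threshold."""
    window, score = soft_match(phrase, text)
    if window and min_confidence <= score < 1.0:
        return text.replace(window, phrase, 1)
    return text


def word_errors(hypothesis, reference):
    """Count differing word positions plus any length difference."""
    h, r = hypothesis.split(), reference.split()
    return sum(a != b for a, b in zip(h, r)) + abs(len(h) - len(r))


def best_parameter(phrase, test_pairs, trial_thresholds):
    """Pick the trial threshold that leaves the fewest errors on the test set."""
    def total_errors(threshold):
        return sum(word_errors(apply_phrase(phrase, draft, threshold), ref)
                   for draft, ref in test_pairs)
    return min(trial_thresholds, key=total_errors)


phrase = "the patient was seen and examined"
# (draft transcript, corrected reference) pairs standing in for a test set.
test_pairs = [
    ("the patient was scene and examined today",
     "the patient was seen and examined today"),
    ("the payment was sent and received today",
     "the payment was sent and received today"),
]
best = best_parameter(phrase, test_pairs, [0.5, 0.85, 0.99])
```

The second pair plays the role of a false-positive probe: a threshold that is too permissive rewrites a correct sentence and is penalized for it, while a threshold that is too strict leaves the genuine error in the first pair uncorrected.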

Current Assignee: InfraWare, Inc.
Original Assignee: InfraWare, Inc.
Inventors: Lindle, Nathan; Mahurin, Nick
Primary Examiner(s): Lerner, Martin
Application Number: US15/624,411
Publication Number: US 20170286393A1
Time in Patent Office: 488 Days
Field of Search: 704/231, 704/235, 704/243, 704/251, 704/255, 704/10
CPC Class Codes

G01L 15/00   Devices or apparatus for me...

G06F 40/149   Adaptation of the text data...

G06F 40/205   Parsing

G10L 15/00   Speech recognition G10L17/0...

G10L 15/02   Feature extraction for spee...

G10L 15/1822   Parsing for meaning underst...

G10L 15/183   using context dependencies,...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 2015/025   Phonemes, fenemes or fenone...
