Automated training of a user audio profile using transcribed medical record recordings

US 9,472,186 B1
Filed: 01/20/2015
Issued: 10/18/2016
Est. Priority Date: 01/28/2014
Status: Active Grant

Abstract

An automated system to build a user audio profile for a natural or continuous language speech to text dictation/transcription system is provided. The system uses previously recorded audio files that may have been already transcribed. The previously recorded audio file is split into a plurality of smaller audio files of about 15 seconds in length. The plurality of smaller audio files are matched to the transcribed text (e.g., small text files) or the smaller audio files are transcribed. All, some, or a selection of the small audio files and the small text files are linked as a training pair. The training pair may be edited in certain embodiments herein, both the text and the audio. The training pairs are submitted to the server to build the initial user audio profile.
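
The splitting step described above can be sketched as follows. This is a minimal illustration, assuming the previously recorded audio is an uncompressed WAV file and that segments are cut at a fixed 15-second interval; the function names `chunk_bounds` and `split_wav` and the `little_NNNN.wav` naming are hypothetical and do not come from the patent.

```python
import wave
from pathlib import Path


def chunk_bounds(total_frames, frame_rate, chunk_seconds=15.0):
    """Return (start_frame, end_frame) pairs that cover the whole recording."""
    frames_per_chunk = int(frame_rate * chunk_seconds)
    if frames_per_chunk <= 0:
        raise ValueError("chunk_seconds must span at least one frame")
    return [
        (start, min(start + frames_per_chunk, total_frames))
        for start in range(0, total_frames, frames_per_chunk)
    ]


def split_wav(big_path, out_dir, chunk_seconds=15.0):
    """Write one little WAV file per chunk and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(big_path), "rb") as big:
        params = big.getparams()
        bounds = chunk_bounds(big.getnframes(), big.getframerate(), chunk_seconds)
        for index, (start, end) in enumerate(bounds):
            big.setpos(start)                      # seek to the chunk start
            frames = big.readframes(end - start)   # read only this chunk
            little_path = out / f"little_{index:04d}.wav"
            with wave.open(str(little_path), "wb") as little:
                little.setparams(params)           # same channels, width, rate
                little.writeframes(frames)         # header frame count is fixed on close
            written.append(little_path)
    return written
```

A 40-second recording sampled at 8000 Hz would yield three segments under these assumptions: two of 15 seconds and a final one of 10 seconds. Cutting at fixed intervals can split a word in half, which is one reason a real system might prefer cutting at pauses reported by the speech engine.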

19 Claims

1. A method performed on at least one processor for training a user audio profile without requiring a user to read known text, comprising the steps of:
- selecting a pre-recorded big audio file;
  
  automatically generating a plurality of little audio files from the pre-recorded big audio file;
  
  obtaining a big text file corresponding to the big audio file;
  
  generating a plurality of little text files using the big text file and endpointing metadata, wherein there is one little text file for each of the plurality of little audio files;
  
  creating a plurality of audio-text training pairs by linking each one of the plurality of little audio files with the corresponding one of the plurality of little text files;
  
  selecting at least one of the plurality of audio-text training pairs to train the user audio profile; and
  
  transmitting the at least one of the plurality of audio-text training pairs to a speech to text engine to train the user audio profile.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
  - 2. The method of claim 1 further comprising creating a user audio profile prior to transmitting the at least one of the plurality of audio-text training pairs, wherein the created user audio profile comprises a default profile.
  - 3. The method of claim 1 further comprising:
    - submitting the selected big audio file to the speech to text engine;
      
      creating the big text file from the big audio file and endpointing metadata; and
      
      splitting the big audio file into the plurality of little audio files using the endpointing metadata.
  - 4. The method of claim 1 wherein the step of generating the plurality of little text files comprises using the big text file, a truth text file, and the endpointing metadata.
  - 5. The method of claim 1 wherein the step of generating the plurality of little audio files from the big audio file comprises using an audiotimer to split the big audio file at predetermined intervals.
  - 6. The method of claim 5 wherein the step of generating the plurality of little text files comprises creating a big truth text from the big audio file and splitting the big truth text at a predetermined word count.
  - 7. The method of claim 1 wherein the plurality of audio-text pairs are linked at least by the endpointing metadata.
  - 8. The method of claim 1 wherein the plurality of audio-text pairs are linked by at least one of tagging or indexing the plurality of little audio files and the plurality of little text files.
  - 9. The method of claim 1 further comprising the step of displaying the plurality of audio-text pairs prior to transmitting the plurality of audio-text pairs to the speech to text engine for training.
  - 10. The method of claim 9 further comprising editing the text of at least one of the plurality of audio-text pairs.
  - 11. The method of claim 9 further comprising editing the audio of at least one of the plurality of audio-text pairs.
  - 12. The method of claim 9 further comprising editing the text of at least one of the plurality of audio-text pairs and editing the audio of at least one of the plurality of audio-text pairs.
  - 13. The method of claim 9 further comprising editing the text and the audio of at least one of the plurality of audio-text pairs.
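
The text-splitting and linking steps recited in claims 1, 7, and 8 can be illustrated with a short sketch. The claims do not say how the endpointing metadata is laid out, so this example assumes it can be reduced to a start time in seconds for each transcribed word plus a list of segment boundaries. The names `TrainingPair`, `little_texts`, and `link_pairs`, and the sample data, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingPair:
    index: int       # shared tag that links the audio to its text
    audio_name: str  # name of the little audio file
    text: str        # contents of the matching little text file


def little_texts(timed_words, bounds):
    """Build one text per audio segment from (word, start_seconds) items.

    bounds is a list of (start_seconds, end_seconds) segments; a word goes
    to the segment whose half-open interval contains its start time.
    """
    groups = [[] for _ in bounds]
    for word, start in timed_words:
        for i, (seg_start, seg_end) in enumerate(bounds):
            if seg_start <= start < seg_end:
                groups[i].append(word)
                break
    return [" ".join(words) for words in groups]


def link_pairs(audio_names, texts):
    """Pair each little audio file with its text by position."""
    if len(audio_names) != len(texts):
        raise ValueError("need exactly one text per little audio file")
    return [
        TrainingPair(i, name, text)
        for i, (name, text) in enumerate(zip(audio_names, texts))
    ]


# Hypothetical data: word timings for a 40-second dictation.
words = [("patient", 0.5), ("presents", 1.2), ("with", 15.1),
         ("fever", 16.0), ("today", 31.0)]
bounds = [(0.0, 15.0), (15.0, 30.0), (30.0, 40.0)]
audio = ["little_0000.wav", "little_0001.wav", "little_0002.wav"]
pairs = link_pairs(audio, little_texts(words, bounds))
```

Here the shared `index` plays the role of the tag or index mentioned in claim 8. The length check enforces the "one little text file for each of the plurality of little audio files" condition of claim 1.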

14. An apparatus comprising:
- a processor, wherein the processor is operatively coupled to a speech to text engine;
  
  a memory operatively coupled to the processor; and
  
  a display operatively coupled to the processor and the memory;
  
  wherein the memory is configured to store audio and text files,
  
  wherein the processor is configured to fetch a big audio file from the memory and create a plurality of little audio files,
  
  wherein the processor is configured to obtain a big text file corresponding to the big audio file,
  
  wherein the processor is configured to generate a plurality of little text files using the big text file and endpointing metadata,
  
  wherein the processor is configured to link the plurality of little audio files and the plurality of little text files to create a plurality of audio-text training pairs that are displayed on the display, and
  
  wherein the processor is configured to transmit the plurality of audio-text training pairs to the speech to text engine for training a user audio profile.
- Dependent claims (15, 16, 17)
  - 15. The apparatus of claim 14 further comprising a text editor operatively coupled to the processor, wherein the text editor is configured to edit the text of the plurality of audio-text training pairs.
  - 16. The apparatus of claim 14 further comprising an audio editor operatively coupled to the processor, wherein the audio editor is configured to edit the audio of the plurality of audio-text training pairs.
  - 17. The apparatus of claim 14 further comprising a text editor and an audio editor operatively coupled to the processor, wherein the text editor is configured to edit the text of the plurality of audio-text training pairs and the audio editor is configured to edit the audio of the plurality of audio-text training pairs.

18. A non-transitory computer program product storable in a memory and executable by a computer comprising a computer usable medium including computer readable code embodied therein for processing data to allow training of a user audio profile, the computer usable medium comprising:
- code adapted to be executed by a processor configured to select a big audio file;
  
  code adapted to be executed by a processor configured to generate a plurality of little audio files from the big audio file;
  
  code adapted to be executed by a processor configured to obtain a big text file corresponding to the big audio file;
  
  code adapted to be executed by a processor configured to generate a plurality of little text files using the big text file and endpointing metadata, wherein there is one little text file for each of the plurality of little audio files;
  
  code adapted to be executed by a processor configured to create a plurality of audio-text training pairs by linking the plurality of little audio files with the corresponding plurality of little text files;
  
  code adapted to be executed by a processor configured to display the plurality of audio-text training pairs;
  
  code adapted to be executed by a processor configured to select at least one of the plurality of audio-text training pairs to train the user audio profile; and
  
  code adapted to be executed by a processor configured to transmit the at least one of the plurality of audio-text training pairs to a speech to text engine to train the user audio profile.
- Dependent claims (19)
  - 19. The computer program product of claim 18 comprising code adapted to be executed by a processor configured to edit at least the text of the audio-text training pairs displayed.

Current Assignee: nVoq, Inc.
Original Assignee: nVoq, Inc.
Inventors: Clark, Michael
Primary Examiner(s): Pullias, Jesse
Application Number: US14/600,794
Time in Patent Office: 637 Days
Field of Search: 704/231-257, 704/270-275
US Class Current: 1/1
CPC Class Codes:
G10L 15/063   Training
G10L 15/26   Speech to text systems G10L...
G16H 15/00   ICT specially adapted for m...
