Training a transcription system

US 9,009,040 B2
Filed: 05/05/2010
Issued: 04/14/2015
Est. Priority Date: 05/05/2010
Status: Active Grant

Abstract

According to certain embodiments, training a transcription system includes accessing recorded voice data of a user from one or more sources. The recorded voice data comprises voice samples. A transcript of the recorded voice data is accessed. The transcript comprises text representing one or more words of each voice sample. The transcript and the recorded voice data are provided to a transcription system to generate a voice profile for the user. The voice profile comprises information used to convert a voice sample to corresponding text.

15 Claims

1. A method comprising:
- accessing recorded voice data of a user from one or more sources, the recorded voice data comprising a plurality of voice samples;
  
  accessing a transcript of the recorded voice data, the transcript comprising text representing one or more words of each voice sample;
  
  identifying an origin of a voice sample, the origin being a device used to input the voice sample;
  
  determining that the origin is associated with the user;
  
  determining that the voice sample matches a voice profile of the user, wherein the voice profile comprises voice signal characteristics to identify the voice of the user and user speech information to convert the voice sample to corresponding text;
  
  providing electronic mail and a text message generated by the user to identify one or more words commonly used by the user, the transcript, and the recorded voice data to a transcription system to generate an updated voice profile for the user;
  
  determining portions of the transcript that are transcribed at a low confidence of accuracy;
  
  flagging the portions of the transcript that are transcribed at a low confidence of accuracy; and
  
  communicating the flagged portions of the transcript to a transcript refiner.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, further comprising:
    - adding the voice sample to the recorded voice data.
  - 3. The method of claim 1, further comprising:
    - determining that the voice sample records the voice of the user.
  - 4. The method of claim 1, further comprising:
    - refining, by the transcript refiner, the transcript to yield a more accurate transcription of the recorded voice data.
  - 5. The method of claim 1, further comprising:
    - disabling the providing the transcript and the recorded voice data to the transcription system.
  - 6. The method of claim 1, the recorded voice data comprising at least one of the following types of voice data:
    - a voicemail;
      
      a video;
      
      a recorded call;
      
      a recorded conference call;
      
      or a recorded voice.
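
The origin check and the low-confidence flagging recited in claim 1 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Segment` type, the 0.0 to 1.0 confidence scale, the 0.6 threshold, the string device identifiers, and the choice to skip flagging when the origin is not associated with the user. The voice-profile match is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass
class Segment:
    """One transcribed portion plus a recognizer confidence score (assumed 0.0 to 1.0)."""
    text: str
    confidence: float


Flagged = List[Tuple[int, Segment]]


def origin_is_associated(origin_device: str, user_devices: Set[str]) -> bool:
    """Check whether the input device is one already linked to the user (assumed lookup)."""
    return origin_device in user_devices


def flag_low_confidence(segments: List[Segment], threshold: float = 0.6) -> Flagged:
    """Return (index, segment) pairs whose confidence is strictly below the threshold."""
    return [(i, seg) for i, seg in enumerate(segments) if seg.confidence < threshold]


def review_sample(
    origin_device: str,
    user_devices: Set[str],
    segments: List[Segment],
    refiner: Callable[[Flagged], None],
    threshold: float = 0.6,
) -> Flagged:
    """Flag low-confidence portions and hand them to a transcript refiner callback."""
    # Gating on the origin is this sketch's simplification, not a claimed requirement.
    if not origin_is_associated(origin_device, user_devices):
        return []
    flagged = flag_low_confidence(segments, threshold)
    if flagged:
        refiner(flagged)
    return flagged
```

In this sketch the refiner is any callable, for example a function that queues the flagged portions for human review.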

7. One or more non-transitory computer readable media storing one or more instructions, when executed by one or more processors, configured to:
- access recorded voice data of a user from one or more sources, the recorded voice data comprising a plurality of voice samples;
  
  access a transcript of the recorded voice data, the transcript comprising text representing one or more words of each voice sample;
  
  identify an origin of a voice sample, the origin being a device used to input the voice sample;
  
  determine that the origin is associated with the user;
  
  determine that the voice sample matches a voice profile of the user, wherein the voice profile comprises voice signal characteristics to identify a voice of the user and user speech information to convert the voice sample to corresponding text;
  
  provide electronic mail and a text message generated by the user to identify one or more words commonly used by the user, the transcript, and the recorded voice data to a transcription system to generate an updated voice profile for the user;
  
  determine portions of the transcript that are transcribed at a low confidence of accuracy;
  
  flag the portions of the transcript that are transcribed at a low confidence of accuracy; and
  
  communicate the flagged portions of the transcript to a transcript refiner.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The media of claim 7, the instructions configured to:
    - add the voice sample to the recorded voice data.
  - 9. The media of claim 7, the instructions configured to:
    - determine that the voice sample records the voice of the user.
  - 10. The media of claim 7, the instructions configured to:
    - refine, by the transcript refiner, the transcript to yield a more accurate transcription of the recorded voice data.
  - 11. The media of claim 7, the instructions configured to:
    - disable the providing the transcript and the recorded voice data to the transcription system.
  - 12. The media of claim 7, the recorded voice data comprising at least one of the following types of voice data:
    - a voicemail;
      
      a video;
      
      a recorded call;
      
      a recorded conference call;
      
      or a recorded voice.

13. An apparatus comprising:
- a memory configured to store computer executable instructions; and
  
  one or more processors coupled to the memory, the processors configured, when executing the instructions, to:
  
  access recorded voice data of a user from one or more sources, the recorded voice data comprising a plurality of voice samples;
  
  access a transcript of the recorded voice data, the transcript comprising text representing one or more words of each voice sample;
  
  identify an origin of a voice sample, the origin being a device used to input the voice sample;
  
  determine that the origin is associated with the user;
  
  determine that the voice sample matches a voice profile of the user, wherein the voice profile comprises voice signal characteristics to identify a voice of the user and user speech information to convert the voice sample to corresponding text;
  
  provide electronic mail and a text message generated by the user to identify one or more words commonly used by the user, the transcript, and the recorded voice data to a transcription system to generate an updated voice profile for the user;
  
  determine portions of the transcript that are transcribed at a low confidence of accuracy;
  
  flag the portions of the transcript that are transcribed at a low confidence of accuracy; and
  
  communicate the flagged portions of the transcript to a transcript refiner.
- Dependent claims (14, 15):
  - 14. The apparatus of claim 13, the processors configured to:
    - add the voice sample to the recorded voice data.
  - 15. The apparatus of claim 13, the processors configured to:
    - refine, by the transcript refiner, the transcript to yield a more accurate transcription of the recorded voice data.
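
Each independent claim recites providing electronic mail and a text message generated by the user to identify words the user commonly uses. One way this could work is a simple frequency count over the user's written messages, sketched below in Python. The function name, the tokenization rule, the minimum word length, and the `top_n` cutoff are all assumptions made for illustration; the claims do not specify how commonly used words are identified.

```python
import re
from collections import Counter
from typing import Iterable, List


def common_words(messages: Iterable[str], top_n: int = 10, min_len: int = 4) -> List[str]:
    """Return the most frequent words across the user's emails and text messages.

    Words shorter than min_len are skipped as a crude stand-in for stop-word
    filtering (an assumption of this sketch).
    """
    counts: Counter = Counter()
    for message in messages:
        # Lowercase and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", message.lower())
        counts.update(w for w in words if len(w) >= min_len)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    sample = [
        "Kubernetes rollout today",
        "kubernetes rollout delayed",
        "Lunch?",
    ]
    print(common_words(sample, top_n=2))
```

A list like this could be passed to a transcription system as vocabulary hints alongside the transcript and the recorded voice data.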

Current Assignee: Cisco Technology, Inc. (Cisco Systems, Inc.)
Original Assignee: Cisco Technology, Inc. (Cisco Systems, Inc.)
Inventors: Tatum, Todd C.; Ramalho, Michael A.; Dunn, Paul M.; Sarkar, Shantanu; Thorsen, Tyrone T.; Gatzke, Alan D.
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Sirjani, Fariba
Application Number: US12/774,054
Publication Number: US 20110276325A1
Time in Patent Office: 1,805 Days
Field of Search: 704/235
US Class Current: 704/235
CPC Class Codes:
G10L 15/07 to the speaker
G10L 2015/0638 Interactive procedures
