SYSTEM FOR GENERATING CAPTIONS FOR LIVE VIDEO BROADCASTS

US 20120316882A1
Filed: 06/10/2011
Published: 12/13/2012
Est. Priority Date: 06/10/2011
Status: Active Grant

Abstract

An adaptive workflow system can be used to implement captioning projects, such as projects for creating captions or subtitles for live and non-live broadcasts. Workers can repeat words spoken during a broadcast program or other program into a voice recognition system, which outputs text that may be used as captions or subtitles. The process of workers repeating these words to create such text can be referred to as respeaking. Respeaking can be used as an effective alternative to more expensive and hard-to-find stenographers for generating captions and subtitles.

34 Claims

1. A method of performing distributed caption generation, the method comprising:
- selecting first and second respeakers to perform respeaking with a voice recognition engine for a broadcast program based at least in part on past performance ratings of the first and second respeakers;
  
  receiving first text generated by the first respeaker for inclusion in the broadcast program;
  
  receiving second text generated by the second respeaker for inclusion in the broadcast program, wherein the second text is being received as backup in case receipt of the first text is interrupted;
  
  outputting the first text for inclusion as captions in the broadcast program;
  
  determining whether receipt of the first text is interrupted;
  
  in response to determining that receipt of the first text is interrupted, outputting the second text for inclusion in the broadcast program; and
  
  calculating new performance ratings for the first and second respeakers, the new performance ratings configured to be used to assign the first or second respeaker to a subsequent broadcast program;
  
  wherein at least said determining is implemented by a computer system comprising computer hardware.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The method of claim 1, wherein said determining that receipt of the first text is interrupted comprises pinging a computer system operated by the first respeaker.
  - 3. The method of claim 2, wherein a timeout in said pinging results in the determination that the first text is interrupted.
  - 4. The method of claim 1, wherein said calculating the new performance ratings comprises evaluating one or more of the following performance factors:
    - accuracy, timeliness, availability, infrastructure, rate, and professionalism.
  - 5. The method of claim 1, wherein said broadcast program is broadcast over the air, via cable, via satellite, and/or via a computer network.
  - 6. The method of claim 1, wherein said past performance ratings are based in part on an accuracy rating with respect to voice recognition performed by a voice recognition system for the first respeaker and the second respeaker.
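
The failover behavior recited in claims 1 to 3 can be illustrated with a minimal sketch. It assumes, hypothetically, that each respeaker's feed is modeled as an object recording the time of its last ping reply, and that a feed counts as interrupted once no reply has arrived within a timeout. The names `RespeakerFeed`, `is_interrupted`, and `select_caption_text`, and the 2-second timeout, are illustrative assumptions and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RespeakerFeed:
    """Hypothetical model of one respeaker's text stream."""
    name: str
    last_ping_reply: float              # time (seconds) of the last ping reply
    latest_text: Optional[str] = None   # most recent text received from this respeaker


def is_interrupted(feed: RespeakerFeed, now: float, timeout: float = 2.0) -> bool:
    """Treat the feed as interrupted when the ping has timed out (assumed rule)."""
    return (now - feed.last_ping_reply) > timeout


def select_caption_text(primary: RespeakerFeed, backup: RespeakerFeed,
                        now: float, timeout: float = 2.0) -> Optional[str]:
    """Output the first text normally; fall back to the second text on interruption."""
    if is_interrupted(primary, now, timeout):
        return backup.latest_text
    return primary.latest_text


first = RespeakerFeed("first", last_ping_reply=10.0, latest_text="Good evening.")
second = RespeakerFeed("second", last_ping_reply=10.5, latest_text="Good evening")

print(select_caption_text(first, second, now=11.0))  # first respeaker still responsive
print(select_caption_text(first, second, now=13.0))  # first respeaker's ping timed out
```

Because the second text is received continuously as backup, the switch in this sketch needs no extra setup at the moment of failure: the backup text is already on hand when the timeout is detected.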

7. A method of performing distributed caption generation, the method comprising:
- receiving first text generated by a first respeaker with a voice recognition engine for inclusion in an audio program;
  
  receiving second text generated by a second respeaker with the voice recognition engine for inclusion in the audio program, the second text being received as backup in case the first text is no longer received;
  
  outputting the first text for inclusion in the audio program;
  
  determining whether an interruption has occurred related to receipt of the first text; and
  
  in response to determining that the interruption has occurred, outputting the second text of the second respeaker for inclusion in the audio program;
  
  wherein at least said determining is implemented by a computer system comprising computer hardware.
- Dependent claims (8, 9, 10, 11):
  - 8. The method of claim 7, wherein said determining comprises pinging a computer system operated by the first respeaker.
  - 9. The method of claim 8, wherein a timeout on the ping results in the determination that the first text is no longer being received.
  - 10. The method of claim 7, further comprising providing audio for the audio program to the first and second respeakers via one or more of the following:
    - a network application, a voice-over IP (VoIP) application, or over a telephone line.
  - 11. The method of claim 7, further comprising saving the first and second text for subsequent review by the first and second respeakers, so as to enable the first and second respeakers to retrain the voice recognition engine.

12. A system for performing distributed caption generation, the system comprising:
- a project network application comprising a respeaking module configured to:
  
  provide functionality for first and second respeakers to generate text responsive to audio of a broadcast,
  
  receive first text generated by the first respeaker, and
  
  receive second text generated by the second respeaker as backup in case the first text is no longer received; and
  
  a failover module comprising computer hardware, the failover module configured to:
  
  output the first text for inclusion in the broadcast,
  
  determining whether an interruption has occurred related to receipt of the first text, and
  
  in response to determining that the interruption has occurred related to receipt of the first text, output the second text of the second respeaker for inclusion in the broadcast.
- Dependent claims (13, 14):
  - 13. The system of claim 12, wherein the broadcast comprises one or more of the following:
    - a television broadcast, a streaming media broadcast, and a classroom broadcast.
  - 14. The system of claim 12, further comprising a project management module configured to provide one or both of the first and second text to a remote network application configured to output the text for presentation to users.

15. Non-transitory physical computer storage comprising instructions stored thereon for implementing, in one or more processors, operations for performing distributed caption generation, the operations comprising:
- receiving first text generated by a first respeaker with a voice recognition engine for inclusion in an audio program;
  
  receiving second text generated by a second respeaker with the voice recognition engine for inclusion in the audio program, the second text being received as backup in case the first text is no longer received;
  
  outputting the first text for inclusion in the audio program;
  
  determining whether the first text is no longer being received; and
  
  in response to determining that the first text is no longer being received, outputting the second text of the second respeaker for inclusion in the audio program.
- Dependent claims (16, 17):
  - 16. The non-transitory physical computer storage of claim 15, wherein said outputting the first text comprises providing the text to a provider of the audio program for inclusion as captions.
  - 17. The non-transitory physical computer storage of claim 15, in combination with a computer system comprising computer hardware.

18. A method of performing distributed caption generation, the method comprising:
- selecting a respeaker to perform respeaking with a voice recognition engine for a broadcast program based at least in part on a past performance rating of the respeaker;
  
  receiving text generated by the respeaker for inclusion in the broadcast program;
  
  outputting the text for inclusion in the broadcast program;
  
  calculating a new performance rating for the respeaker, the new performance rating configured to be used to evaluate whether to assign the respeaker to a subsequent broadcast program; and
  
  wherein at least said calculating is implemented by a computer system comprising computer hardware.
- Dependent claims (19, 20, 21, 22):
  - 19. The method of claim 18, wherein said calculating the new performance ratings comprises evaluating one or more of the following performance factors:
    - accuracy, timeliness, availability, infrastructure, rate, and professionalism.
  - 20. The method of claim 18, wherein said calculating the new performance rating comprises evaluating whether the respeaker corrects the text subsequent to the broadcast program.
  - 21. The method of claim 18, wherein said calculating the new performance rating comprises evaluating whether the respeaker trains the voice recognition engine with new words.
  - 22. The method of claim 18, wherein said calculating the new performance rating comprises evaluating whether a computing system of the respeaker reliably transmits the text during the broadcast program.
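
Claims 18 to 22 recite selecting a respeaker by past performance rating and then calculating a new rating, but the claims do not state a formula. The sketch below is one possible reading under stated assumptions: each factor is scored on a 0 to 1 scale, the factors supplied are combined as a weighted average, and the result is blended with the past rating. The weights, the `blend` parameter, the function names, and the respeaker names are all hypothetical.

```python
from typing import Dict

# Factor names come from claim 19; the weights are illustrative assumptions.
FACTOR_WEIGHTS: Dict[str, float] = {
    "accuracy": 0.4,
    "timeliness": 0.2,
    "availability": 0.1,
    "infrastructure": 0.1,
    "rate": 0.1,
    "professionalism": 0.1,
}


def program_score(factors: Dict[str, float]) -> float:
    """Weighted average of the factor scores supplied (each assumed in [0, 1])."""
    used = {k: w for k, w in FACTOR_WEIGHTS.items() if k in factors}
    if not used:
        raise ValueError("at least one known performance factor is required")
    total_weight = sum(used.values())
    return sum(factors[k] * w for k, w in used.items()) / total_weight


def new_rating(past_rating: float, factors: Dict[str, float], blend: float = 0.3) -> float:
    """Blend the past rating with the latest program score (assumed smoothing)."""
    return (1 - blend) * past_rating + blend * program_score(factors)


def select_respeaker(ratings: Dict[str, float]) -> str:
    """Pick the respeaker with the highest past performance rating."""
    return max(ratings, key=ratings.get)


ratings = {"alice": 0.80, "bob": 0.90}
chosen = select_respeaker(ratings)
ratings[chosen] = new_rating(ratings[chosen], {"accuracy": 0.5, "timeliness": 0.5})
print(chosen, round(ratings[chosen], 2))
```

In this example the respeaker chosen for the first program performs below their past rating, so the updated rating drops and a later call to `select_respeaker` would favor the other respeaker. This mirrors the claim language that the new rating is used to evaluate assignment to a subsequent broadcast program.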

23. A system for performing distributed caption generation, the system comprising:
- a project management module configured to select a respeaker to perform respeaking with a voice recognition engine for a broadcast based at least in part on a past performance rating of the respeaker;
  
  a project network application comprising a respeaking module configured to:
  
  provide functionality for the respeaker to generate text responsive to audio from a broadcast,
  
  receive text generated by the respeaker, and
  
  output the text for inclusion in the broadcast; and
  
  a worker ratings calculator comprising computer hardware, the worker ratings calculator configured to calculate a new performance rating for the respeaker, the new performance rating configured to be used to evaluate whether to assign the respeaker to a subsequent broadcast program.
- Dependent claims (24):
  - 24. The system of claim 23, wherein the worker ratings calculator is further configured to calculate the new performance ratings by at least evaluating one or more of the following performance factors:
    - accuracy, timeliness, availability, infrastructure, rate, and professionalism.

25. Non-transitory physical computer storage comprising instructions stored thereon for implementing, in one or more processors, operations for performing distributed caption generation, the operations comprising:
- selecting a respeaker to perform respeaking with a voice recognition engine for a program based at least in part on a past performance rating of the respeaker;
  
  receiving text generated by the respeaker for inclusion in the program;
  
  outputting the text for inclusion in the program; and
  
  calculating a new performance rating for the respeaker, the new performance rating configured to be used to evaluate whether to assign the respeaker to a subsequent program.
- Dependent claims (26, 27, 28, 29, 30, 31):
  - 26. The non-transitory physical computer storage of claim 25, wherein said calculating the new performance ratings comprises evaluating one or more of the following performance factors:
    - accuracy, timeliness, availability, infrastructure, rate, and professionalism.
  - 27. The non-transitory physical computer storage of claim 25, wherein the program comprises a live broadcast.
  - 28. The non-transitory physical computer storage of claim 25, wherein the program comprises a video.
  - 29. The non-transitory physical computer storage of claim 25, wherein said outputting the text comprises supplying the text as subtitles for the program.
  - 30. The non-transitory physical computer storage of claim 25, wherein said outputting the text comprises supplying the text as captions for the program.
  - 31. The non-transitory physical computer storage of claim 25, in combination with a computer system comprising computer hardware.

32. Non-transitory physical computer storage comprising instructions stored thereon for implementing, in one or more processors, operations for performing distributed caption generation, the operations comprising:
- receiving speech audio from a respeaker user with a voice recognition engine, the voice recognition engine comprising a plurality of voice recognition systems, the speech audio corresponding to speech output by the respeaker user in order to transcribe broadcast audio;
  
  providing the speech audio to the plurality of voice recognition systems;
  
  receiving text output from each of the voice recognition systems;
  
  receiving a calculated probability of accuracy for the output text from each of the voice recognition systems; and
  
  selecting the output text from one of the voice recognition systems based on the calculated probability of accuracy.
- Dependent claims (33, 34):
  - 33. The non-transitory physical computer storage of claim 32, wherein said selecting comprises selecting the output text having the greatest probability of accuracy.
  - 34. The non-transitory physical computer storage of claim 32, in combination with a computer system comprising computer hardware.
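
The selection step of claims 32 and 33 can be sketched as follows. The sketch assumes, hypothetically, that each voice recognition system is a callable returning a pair of output text and calculated probability of accuracy. The `Recognizer` type, `transcribe_with_best`, and the two stand-in recognizers are illustrative names, not part of the patent or of any real speech recognition library.

```python
from typing import Callable, List, Tuple

# Each hypothetical recognizer maps speech audio to (text, probability of accuracy).
Recognizer = Callable[[bytes], Tuple[str, float]]


def transcribe_with_best(audio: bytes, recognizers: List[Recognizer]) -> str:
    """Send the audio to every recognizer and keep the most probable text."""
    if not recognizers:
        raise ValueError("at least one voice recognition system is required")
    results = [recognize(audio) for recognize in recognizers]
    best_text, _best_prob = max(results, key=lambda r: r[1])
    return best_text


# Stand-in recognizers returning fixed results, for illustration only.
def recognizer_a(audio: bytes) -> Tuple[str, float]:
    return ("the whether today", 0.62)


def recognizer_b(audio: bytes) -> Tuple[str, float]:
    return ("the weather today", 0.91)


print(transcribe_with_best(b"\x00\x01", [recognizer_a, recognizer_b]))
```

Choosing the maximum corresponds to claim 33; claim 32 only requires that the selection be based on the calculated probability, so other selection rules would also fit its wording.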

Current Assignee: Deluxe 3D LLC (Deluxe Corp.), Deluxe Digital Distribution Inc. (Deluxe Corp.), Deluxe Digital Studios, Inc. (Deluxe Corp.), Deluxe Entertainment Services Inc., Deluxe Laboratories LLC, Deluxe Media Inc., Deluxe One, LLC, Softitler Net, Inc., Deluxe Creative Services Inc. (The Framestore Ltd.)
Original Assignee: Morgan Fiumi
Inventors: Fiumi, Morgan
Granted Patent: US 9,026,446 B2
US Class Current: 704/270
CPC Class Codes:
G10L 15/26   Speech to text systems G10L...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
