Timeline Alignment for Closed-Caption Text Using Speech Recognition Transcripts

  • US 20110134321A1
  • Filed: 09/13/2010
  • Published: 06/09/2011
  • Est. Priority Date: 09/11/2009
  • Status: Active Grant
First Claim

1. A method of synchronizing text with audio in a multimedia file, wherein the multimedia file includes previously synchronized video and audio, wherein the multimedia file has a start time and a stop time that defines a timeline for the multimedia file, wherein the frames of the video and the corresponding audio are each associated with respective points in time along the timeline, comprising the steps of:

    receiving the multimedia file and parsing the audio therefrom, but maintaining the timeline synchronization between the video and the audio;

    receiving closed-captioned data associated with the multimedia file, wherein the closed-captioned data contains closed-captioned text, wherein each word of the closed-captioned text is associated with a corresponding word spoken in the audio, wherein each word of the closed-captioned text has a high degree of accuracy with the corresponding word spoken in the audio but a low correlation with the respective point in time along the timeline at which the corresponding word was spoken in the audio;

    using automated speech recognition (ASR) software, generating ASR text of the parsed audio, wherein each word of the ASR text is associated approximately with the corresponding words spoken in the audio, wherein each word of the ASR text has a lower degree of accuracy with the corresponding words spoken in the audio than the respective words of the closed-captioned text but a high correlation with the respective point in time along the timeline at which the corresponding word was spoken in the audio;

    thereafter, using N-gram analysis, comparing each word of the closed-captioned text with a plurality of words of the ASR text until a match is found;

    for each matched word from the closed-captioned text, associating therewith the respective point in time along the timeline of the matched word from the ASR text corresponding therewith, whereby each closed-captioned word is associated with a respective point on the timeline corresponding to the same point in time on the timeline in which the word is actually spoken in the audio and occurs within the video.
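The matching and timestamp-transfer steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not state the N-gram size, how far ahead in the ASR text the search looks, or how words are normalized before comparison, so the `n` and `window` parameters, the `normalize` helper, and the names `AsrWord` and `align_captions` are all assumptions made for illustration. The sketch assumes the ASR output is already available as a list of words with timestamps.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AsrWord:
    """Hypothetical container for one ASR word and its timeline position."""
    text: str
    time: float  # seconds from the start of the multimedia timeline


def normalize(word: str) -> str:
    """Lowercase and strip punctuation so 'Fox,' and 'fox' compare equal."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def align_captions(
    caption_words: List[str],
    asr_words: List[AsrWord],
    n: int = 2,        # assumed N-gram size; the claim does not fix it
    window: int = 20,  # assumed look-ahead, in ASR words; also not in the claim
) -> List[Optional[float]]:
    """Return one timestamp per caption word, or None where no match was found."""
    cc = [normalize(w) for w in caption_words]
    asr = [normalize(w.text) for w in asr_words]
    times: List[Optional[float]] = [None] * len(cc)

    cursor = 0  # first ASR index not yet consumed by an earlier match
    i = 0
    while i + n <= len(cc):
        gram = cc[i:i + n]
        match = None
        # Compare the caption N-gram with successive ASR N-grams until one matches.
        for j in range(cursor, min(cursor + window, len(asr) - n + 1)):
            if asr[j:j + n] == gram:
                match = j
                break
        if match is None:
            i += 1  # no match: slide the caption N-gram forward by one word
            continue
        # Copy the ASR timestamps onto the matched caption words.
        for k in range(n):
            times[i + k] = asr_words[match + k].time
        cursor = match + n
        i += n
    return times


if __name__ == "__main__":
    captions = "The quick brown fox jumps over the lazy dog".split()
    # ASR text with two recognition errors ("crown", "a") but reliable timing.
    asr = [AsrWord(t, 0.4 * k) for k, t in
           enumerate("the quick crown fox jumps over a lazy dog".split())]
    for word, t in zip(captions, align_captions(captions, asr)):
        print(word, t)
```

Caption words that never fall inside a matching N-gram keep a value of `None` in this sketch. How such words receive a time (for example, by interpolating between neighboring matched words) is left open here because the claim text above does not address it.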

  • 13 Assignments