Method for refining time alignments of closed captions

US 6,442,518 B1
Filed: 07/14/1999
Issued: 08/27/2002
Est. Priority Date: 07/14/1999
Status: Expired due to Fees

Abstract

A method and apparatus are provided for refining time alignments of closed captions. The method automatically aligns closed caption data with associated audio data such that the closed caption data can be more precisely indexed to a requested keyword by a search engine. Further, with such a structure, the closed captions can be made to appear and disappear on a display screen in direct relation to the associated spoken words and phrases. Accordingly, hearing impaired viewers can more easily understand the program that is being displayed.
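As a rough illustration of the keyword indexing mentioned in the abstract, the sketch below maps each caption word to the times at which it is spoken. The function name `build_keyword_index` and the `(word, time_in_seconds)` input format are assumptions made for this example; the patent record does not specify a data layout or a search engine interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_keyword_index(captions: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Map each lower-cased caption word to the list of its time stamps.

    `captions` is a hypothetical stream of (word, time_in_seconds) pairs,
    assumed here to come from an already aligned closed caption stream.
    """
    index: Dict[str, List[float]] = defaultdict(list)
    for word, time_s in captions:
        index[word.lower()].append(time_s)
    return dict(index)


if __name__ == "__main__":
    aligned = [("Budget", 12.5), ("vote", 13.1), ("budget", 40.0)]
    print(build_keyword_index(aligned))
```

The more precise the time stamps, the closer a lookup for a requested keyword lands to the moment the word is actually spoken.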

288 Citations

9 Claims

1. An apparatus for automatically aligning closed captions, comprising:
- an audio classifier unit, for receiving audio data and identifying portions of the audio data that comprise speech data;
- a speech rate control unit, coupled to the audio classifier unit for outputting the portions of the audio data that include speech, adjusts the speech data rate such that an operator can more easily perform transcription;
- a time event tracker unit, coupled to the audio speed control unit, for receiving the speech portions of the audio data and for receiving a transcription of the speech portions of the audio data that is generated by the operator, the time event tracker unit also for inserting time stamps in the transcription that indicate the time when portions of the transcription were generated by the operator and the time stamped transcription being output as a roughly aligned closed caption stream; and
- a re-aligner unit for precisely aligning the roughly aligned closed caption stream in a non-recursive manner, wherein the re-aligner unit comprises a captions re-aligner unit that receives the roughly aligned closed caption stream and the associated audio data stream and segments both streams into sections based upon a threshold duration between the time stamps in the roughly aligned closed caption stream, the captions re-aligner unit also breaking each section into a number of chunks and generating a language model using only the words contained in each chunk, the captions re-aligner unit using the language model to perform a speech recognition operation on the audio data stream and to generate a hypothesized word list, a plurality of time stamps in the roughly aligned closed caption stream being modified to align with a plurality of time stamps in the hypothesized word list.
2. The apparatus for automatically aligning closed captions, as described in claim 1, wherein the plurality of time stamps in the roughly aligned closed caption stream were determined in response to a transcription of a plurality of words spoken in the audio data stream.
3. The apparatus for automatically aligning closed captions, as described in claim 2, wherein the plurality of time stamps in the hypothesized word list correspond to words in a chunk.
4. The apparatus for automatically aligning closed captions, as described in claim 3, wherein the captions re-aligner unit further determines whether the hypothesized word list contains the same words as the corresponding chunk of the roughly aligned closed caption stream and performs a recursive alignment operation on the portions of the chunk that are different.
5. The apparatus for automatically aligning closed captions, as described in claim 1, further comprising:

6. A computer system, comprising:
- a central processing unit connected to a memory system by a system bus;
- an I/O controller, connected to the central processing unit and to the memory system by the system bus;
- an audio classifier application, executed by the central processing unit, for receiving audio data and identifying portions of the audio data that comprise speech data;
- a speech rate control application, executed by the central processing unit, for outputting the portions of the audio data that include speech at a predetermined rate;
- a time event tracker application, executed by the central processing unit, for receiving the speech portions of the audio data from the speech rate control application and for receiving a transcription of the speech portions of the audio data, the time event tracker application also for inserting time stamps in the transcription that indicate the time when portions of the transcription were received and the time stamped transcription being output as a roughly aligned closed caption stream; and
- a re-aligner application, executed by the central processing unit, for precisely aligning the roughly aligned closed caption stream, wherein the re-aligner application comprises a captions re-aligner portion that receives the roughly aligned closed caption stream and the associated audio data stream and segments both streams into sections based upon a threshold duration between time stamps in the roughly aligned closed caption stream, the captions re-aligner portion also breaking each section into a number of chunks and generating a language model using only the words contained in each chunk, the captions re-aligner portion using the language model to perform a speech recognition operation on the audio data stream and to generate a hypothesized word list, a plurality of time stamps in the roughly aligned closed caption stream being modified to align with a plurality of time stamps in the hypothesized word list.
7. The computer system described in claim 6, wherein the plurality of time stamps in the roughly aligned closed caption stream were determined in response to a transcription of a plurality of words spoken in the audio data stream.
8. The computer system described in claim 7, wherein the plurality of time stamps in the hypothesized word list correspond to a first word of a chunk and correspond to a last word of a chunk.
9. The computer system described in claim 7, wherein the captions re-aligner portion further determines whether the hypothesized word list contains the same words as the corresponding chunk of the roughly aligned closed caption stream and performs a recursive alignment operation on the portions of the chunk that are different.
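
The re-aligner steps recited in claims 1 and 6 (sectioning on a threshold gap between time stamps, breaking sections into chunks, restricting the vocabulary to each chunk's words, and moving caption time stamps onto the hypothesized word list) can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the `Word` type, the function names, the fixed `chunk_size`, and the use of Python's `difflib.SequenceMatcher` to pair caption words with hypothesized words are all assumptions introduced here. The speech recognizer is not modeled; its output is simply passed in as a list of timed words.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List


@dataclass
class Word:
    text: str
    time: float  # seconds from the start of the audio stream


def split_sections(words: List[Word], gap_threshold: float) -> List[List[Word]]:
    """Start a new section whenever two consecutive time stamps
    are more than `gap_threshold` seconds apart."""
    sections: List[List[Word]] = []
    current: List[Word] = []
    for w in words:
        if current and w.time - current[-1].time > gap_threshold:
            sections.append(current)
            current = []
        current.append(w)
    if current:
        sections.append(current)
    return sections


def split_chunks(section: List[Word], chunk_size: int) -> List[List[Word]]:
    """Break a section into consecutive chunks of at most `chunk_size` words
    (a fixed size is an assumption made for this sketch)."""
    return [section[i:i + chunk_size] for i in range(0, len(section), chunk_size)]


def chunk_vocabulary(chunk: List[Word]) -> List[str]:
    """Words a chunk-specific language model would be built from:
    only the words contained in that chunk."""
    return sorted({w.text.lower() for w in chunk})


def realign(chunk: List[Word], hypothesis: List[Word]) -> List[Word]:
    """Copy time stamps from the hypothesized word list onto matching caption
    words. Caption words with no match keep their rough time stamps."""
    a = [w.text.lower() for w in chunk]
    b = [w.text.lower() for w in hypothesis]
    out = list(chunk)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            out[block.a + k] = Word(chunk[block.a + k].text,
                                    hypothesis[block.b + k].time)
    return out


if __name__ == "__main__":
    rough = [Word("hello", 1.0), Word("world", 1.4), Word("again", 9.0)]
    for section in split_sections(rough, gap_threshold=2.0):
        for chunk in split_chunks(section, chunk_size=2):
            print(chunk_vocabulary(chunk))
    # Hypothetical recognizer output for the first chunk.
    print(realign(rough[:2], [Word("hello", 0.2), Word("world", 0.7)]))
```

Words left unmatched by `realign` correspond to the portions that claims 4 and 9 handle with a further recursive alignment operation, which this sketch omits.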

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Compaq Computer Corporation (HP Inc.)
Inventors: Moreno, Pedro; Van Thong, Jean-Manuel
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Azad, Abul K.
Application Number: US09/353,729
Time in Patent Office: 1,140 Days
Field of Search: 704/235, 704/241, 704/270, 704/267, 704/271, 704/260, 704/263
US Class Current: 704/235
CPC Class Codes: G10L 15/26 Speech to text systems G10L...