Generation of subtitles or captions for moving pictures
Abstract
A method for generating subtitles for audiovisual material receives and analyses a text file containing dialogue spoken in the audiovisual material and provides a signal representative of the text. The text information and audio signal are aligned in time using time alignment speech recognition, and the text and timing information are then output to a subtitle file. Colours can be assigned to different speakers or groups of speakers. Subtitles are derived by receiving and analysing a text file containing the spoken dialogue, considering each word in turn in the text information signal, and assigning a score to each subtitle in a plurality of different possible subtitle formatting options which lead to that word. The steps are repeated until all the words in the text information signal have been used, and the subtitle formatting option which gives the best overall score is then derived.
31 Claims
1. A method of generating subtitles for audiovisual material, comprising the steps of:
receiving and analysing a text file containing dialogue spoken in the audiovisual material to provide a text information signal representative of the text;
aligning the text information and the audio signal from the audiovisual material in time using time alignment speech recognition to provide timing information for the spoken text; and
forming the text information and the timing information into an output subtitle file.
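Illustrative sketch (not part of the claim): the final step of claim 1, forming text and timing information into a subtitle file, might look like the following. It assumes the alignment step has already produced per-word start and end times, and it writes SubRip (SRT) text purely as an example; the `Word` tuple, `srt_time`, `to_srt` and the 37-character limit are hypothetical choices, not taken from the patent.

```python
from typing import List, Tuple

# (text, start_seconds, end_seconds): assumed output of a time-alignment step
Word = Tuple[str, float, float]


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(words: List[Word], max_chars: int = 37) -> str:
    """Group timed words into cues of at most max_chars and render SRT text."""
    cues: List[List[Word]] = []
    for word in words:
        current = " ".join(w[0] for w in cues[-1]) if cues else ""
        if cues and len(current) + 1 + len(word[0]) <= max_chars:
            cues[-1].append(word)   # word still fits in the current cue
        else:
            cues.append([word])     # start a new cue
    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w[0] for w in cue)
        # A cue runs from the start of its first word to the end of its last.
        timing = f"{srt_time(cue[0][1])} --> {srt_time(cue[-1][2])}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)
```

The greedy grouping here is only a placeholder for the formatting step; the scored approach of claim 29 is a separate matter.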
4. A method of assigning colour representative of different speakers to subtitles, the method comprising the steps of:
forming a plurality of groups of speakers, where each group contains speakers who can be represented by the same colour; and
assigning the available colours to a corresponding number of the plurality of groups, the groups being selected such that all the speakers are allocated a colour.
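Illustrative sketch (not part of the claim): one way to form groups of speakers that can share a colour is to treat speakers who appear in the same scene as conflicting, then pack non-conflicting speakers into at most as many groups as there are colours. That sharing rule, the greedy strategy and the names `group_speakers` and `assign_colours` are assumptions made for this example.

```python
from typing import Dict, List, Optional, Set


def group_speakers(scenes: List[Set[str]], n_colours: int) -> Optional[List[Set[str]]]:
    """Greedily pack speakers into at most n_colours conflict-free groups."""
    speakers = sorted(set().union(*scenes))
    conflicts: Dict[str, Set[str]] = {s: set() for s in speakers}
    for scene in scenes:
        for s in scene:
            conflicts[s] |= scene - {s}   # same scene => must differ in colour
    groups: List[Set[str]] = []
    # Place the most constrained speakers first.
    for s in sorted(speakers, key=lambda x: (-len(conflicts[x]), x)):
        for group in groups:
            if not (group & conflicts[s]):
                group.add(s)
                break
        else:
            if len(groups) == n_colours:
                return None               # greedy pass ran out of colours
            groups.append({s})
    return groups


def assign_colours(scenes: List[Set[str]], colours: List[str]) -> Optional[Dict[str, str]]:
    """Map every speaker to a colour, or return None if grouping failed."""
    groups = group_speakers(scenes, len(colours))
    if groups is None:
        return None
    return {s: colours[i] for i, group in enumerate(groups) for s in group}
```

A greedy pass can fail even when a valid grouping exists, so a fuller implementation would search over alternative groupings before giving up.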
9. A method of detecting scene changes in audiovisual material, comprising the steps of:
receiving a signal representative of the spoken dialogue in the audiovisual material;
identifying the times when speakers are active in the spoken dialogue; and
detecting points in time in the spoken dialogue where the group of active speakers changes.
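Illustrative sketch (not part of the claim): given a time-ordered list of speaker turns, a change in the group of active speakers can be approximated by comparing the speakers in a few turns before each boundary with those in a few turns after it. The window size, the disjointness test and the name `detect_scene_changes` are assumptions for this example.

```python
from typing import List, Tuple

Turn = Tuple[str, float, float]  # (speaker, start_seconds, end_seconds)


def detect_scene_changes(turns: List[Turn], window: int = 3) -> List[float]:
    """Return start times of turns where the active speaker group changes.

    A change is flagged when none of the speakers in the `window` turns
    before a boundary also speak in the `window` turns after it.
    """
    changes = []
    for i in range(1, len(turns)):
        before = {speaker for speaker, _, _ in turns[max(0, i - window):i]}
        after = {speaker for speaker, _, _ in turns[i:i + window]}
        if before.isdisjoint(after):
            changes.append(turns[i][1])
    return changes


turns = [
    ("A", 0, 2), ("B", 2, 4), ("A", 4, 6),       # first group: A and B
    ("C", 10, 12), ("D", 12, 14), ("C", 14, 16),  # second group: C and D
]
print(detect_scene_changes(turns))  # [10]
```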
11. A method of parsing an electronic text file to identify different components thereof, comprising the steps of:
identifying blocks of text in an input electronic text file;
providing a plurality of possible script format properties for the blocks;
providing a definition of each of the possible components of the text file;
in relation to each block, determining the value of each script format property;
for each block, determining from the script format properties of the block and the component definitions a probability value that that block is each of the component types;
selecting the component type for each block on the basis of the probabilities that it is each of the component types; and
generating therefrom an output file.
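Illustrative sketch (not part of the claim): the probability step of claim 11 could be realised by multiplying, for each component type, the likelihood of every observed script format property and normalising the results. The three properties, the likelihood values in `DEFINITIONS` and the independence assumption behind the product are all hypothetical.

```python
from typing import Dict, Tuple


def block_properties(block: str) -> Dict[str, bool]:
    """Determine the value of each script format property for one block."""
    text = block.strip()
    return {
        "all_caps": text.isupper(),
        "indented": block.startswith((" ", "\t")),
        "in_parens": text.startswith("(") and text.endswith(")"),
    }


# Component definitions: P(property is True | component). Hypothetical values.
DEFINITIONS: Dict[str, Dict[str, float]] = {
    "speaker":   {"all_caps": 0.90, "indented": 0.8, "in_parens": 0.05},
    "dialogue":  {"all_caps": 0.05, "indented": 0.5, "in_parens": 0.05},
    "direction": {"all_caps": 0.10, "indented": 0.2, "in_parens": 0.80},
}


def classify(block: str) -> Tuple[str, Dict[str, float]]:
    """Return the most probable component type and all normalised probabilities."""
    props = block_properties(block)
    scores = {}
    for component, likelihood in DEFINITIONS.items():
        p = 1.0
        for name, value in props.items():
            p *= likelihood[name] if value else 1.0 - likelihood[name]
        scores[component] = p
    total = sum(scores.values())
    probs = {component: s / total for component, s in scores.items()}
    return max(probs, key=probs.get), probs


print(classify("    MARY")[0])            # speaker
print(classify("(She leaves.)")[0])       # direction
print(classify("I never said that.")[0])  # dialogue
```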
15. A method of placing subtitles related to speech from two speakers in a picture, comprising the steps of:
generating separate subtitles for the two speakers;
determining from left and right stereo audio signals which of the two speakers is nearer the left and which nearer the right in the picture; and
placing the subtitles for the two speakers in accordance with the determination.
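Illustrative sketch (not part of the claim): one simple way to decide which of two speakers is nearer the left is to compare how much of each speaker's signal energy falls in the left channel while that speaker is talking. The energy-ratio measure and the names `left_balance` and `place_two_speakers` are assumptions for this example.

```python
from typing import Sequence, Tuple

Segment = Tuple[Sequence[float], Sequence[float]]  # (left samples, right samples)


def left_balance(left: Sequence[float], right: Sequence[float]) -> float:
    """Fraction of total signal energy carried by the left channel (0.0 to 1.0)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    total = energy_left + energy_right
    return 0.5 if total == 0 else energy_left / total


def place_two_speakers(speech_a: Segment, speech_b: Segment) -> Tuple[str, str]:
    """Return subtitle positions for speakers A and B, in that order."""
    if left_balance(*speech_a) >= left_balance(*speech_b):
        return ("left", "right")
    return ("right", "left")


speech_a = ([0.8, -0.7, 0.9], [0.2, -0.1, 0.2])  # louder in the left channel
speech_b = ([0.1, 0.1, -0.2], [0.6, -0.7, 0.5])  # louder in the right channel
print(place_two_speakers(speech_a, speech_b))    # ('left', 'right')
```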
19. A method of generating subtitles for audiovisual material, comprising the steps of:
playing the audio signal from the audiovisual material, the audio signal containing speech;
having a person listen to the speech and speak it into a microphone;
applying the microphone output signal to a speech recogniser to provide an electronic text signal;
comparing the timings of the audio signal from the audiovisual material and the microphone output signal; and
adjusting the timing of the output of the speech recogniser in dependence upon the comparison so as to tend to align the output of the speech recogniser with the audio signal from the audiovisual material.
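Illustrative sketch (not part of the claim): the comparison of timings could be approximated by cross-correlating coarse energy envelopes of the programme audio and the microphone signal to estimate how far the re-spoken speech lags, then shifting the recogniser's word times by that amount. A single constant delay, the frame-based envelopes and the names `estimate_lag` and `retime` are assumptions for this example.

```python
from typing import List, Sequence, Tuple


def estimate_lag(reference: Sequence[float], respoken: Sequence[float], max_lag: int) -> int:
    """Return the delay (in frames) of `respoken` that best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Unnormalised cross-correlation at this candidate delay.
        score = sum(
            value * respoken[i + lag]
            for i, value in enumerate(reference)
            if i + lag < len(respoken)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def retime(words: List[Tuple[str, float]], lag_frames: int, frame_seconds: float) -> List[Tuple[str, float]]:
    """Shift recognised word times earlier by the estimated delay."""
    delay = lag_frames * frame_seconds
    return [(word, max(0.0, time - delay)) for word, time in words]


programme = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0]   # energy envelope of the programme audio
microphone = [0, 0, 0, 0, 1, 0, 0, 1, 1, 0]  # same pattern, three frames later
lag = estimate_lag(programme, microphone, max_lag=5)
print(lag)  # 3
```

In practice the delay between the original speech and the re-spoken speech varies over time, so the estimate would need to be refreshed over short windows rather than computed once.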
21. A method of placing subtitles related to speech from speakers in a moving picture, comprising the steps of:
receiving a video signal representative of the picture;
analysing the video signal to identify areas of the picture which indicate the presence of a speaker in a location on the picture;
generating therefrom a signal which indicates a desired location for a subtitle relating to speech spoken by that speaker; and
placing the subtitle for that speaker in accordance therewith.
23. A method of generating subtitles for audiovisual material, comprising the steps of:
receiving a text signal containing text corresponding to speech in the audiovisual material;
identifying from the audio signal from the audiovisual material predetermined characteristics of the speaker's voice;
determining when the characteristics change and, in response thereto, providing an output signal indicating a change of speaker; and
generating from the text signal and the output signal indicating a change of speaker subtitles related to the speech and to the speaker.
29. A method of generating subtitles for audiovisual material comprising the steps of:
receiving and analysing a text file containing dialogue spoken in the audiovisual material to provide a text information signal representative of the text;
deriving a set of subtitles from the text information signal;
characterised in that the deriving step comprises:
a) considering each word in turn in the text information signal;
b) assigning a score to each subtitle in a plurality of different possible subtitle formatting options leading to that word;
c) repeating steps a) and b) until all the words in the text information signal have been used; and
d) deriving the subtitle formatting option that gives the best overall score for the text information signal.
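Illustrative sketch (not part of the claim): steps a) to d) of claim 29 can be read as a dynamic programme over word positions, where each candidate subtitle ending at the current word is scored and the best cumulative score is kept. That reading, the scoring function (prefer well-filled subtitles that end on sentence punctuation) and the names `score_subtitle` and `best_formatting` are assumptions for this example.

```python
from typing import List, Optional


def score_subtitle(words: List[str], max_chars: int) -> Optional[float]:
    """Score one candidate subtitle, or return None if it does not fit."""
    text = " ".join(words)
    if len(text) > max_chars and len(words) > 1:
        return None                      # too long (a lone long word is allowed)
    score = -((max_chars - len(text)) ** 2) / max_chars   # prefer full subtitles
    if text.endswith((".", "?", "!")):
        score += 5.0                     # prefer ending at a sentence end
    return score


def best_formatting(words: List[str], max_chars: int = 37) -> List[str]:
    """Split words into the sequence of subtitles with the best total score."""
    n = len(words)
    best = [0.0] + [float("-inf")] * n   # best[i]: best score for words[:i]
    back = [0] * (n + 1)                 # back[i]: start of the last subtitle
    for end in range(1, n + 1):          # a) consider each word in turn
        for start in range(end - 1, -1, -1):
            s = score_subtitle(words[start:end], max_chars)   # b) score options
            if s is None:
                break                    # longer candidates cannot fit either
            if best[start] + s > best[end]:
                best[end], back[end] = best[start] + s, start
    subtitles, end = [], n               # d) trace back the best option
    while end > 0:
        start = back[end]
        subtitles.append(" ".join(words[start:end]))
        end = start
    return subtitles[::-1]


print(best_formatting(["Stop.", "Go", "now."], max_chars=10))  # ['Stop.', 'Go now.']
```

Because each position only needs the best score of earlier positions, the search covers every possible split without enumerating them individually.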
Specification