Method and Apparatus for Identifying Video Program Material or Content via Frequency Translation or Modulation Schemes
Abstract
A system is provided for identifying video content in a video signal via its sound track audio signal. The audio signal is processed with filtering, frequency translation, and/or nonlinear transformations to extract voice signals from the sound track channel. The extracted voice signals are coupled to a speech recognition system to provide, in text form, the words of the video content, which are later compared with a reference library of words or dialog from known video programs or movies. Other attributes of the video signal or transport stream may be combined with closed caption data or closed caption text for identification purposes. Example attributes include DVS/SAP information, time code information, histograms, and/or rendered video or pictures.
20 Claims
1. A system for improving speech recognition of a speech to text converter comprising:
a frequency translator for receiving an audio signal, wherein an output of the frequency translator shifts a spectrum of the audio signal up or down;
a band pass filter coupled to the output of the frequency translator, wherein the band pass filter provides one or more band limited spectrums of the audio signal;
a distortion generator coupled to an output of the band pass filter; and
a speech to text converter coupled to an output of the distortion generator to provide improved speech recognition of the audio signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 19, 20
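For illustration only, the signal chain recited in claim 1 (frequency translator, then band pass filter, then distortion generator) might be sketched as below. The claim does not specify how any stage is built, so everything here is an assumption: the translator shifts DFT bins, the filter zeroes bins outside a band, and the distortion generator is a tanh soft clipper. The function names, the 64-sample test tone, and the gain value are hypothetical, and the speech to text converter itself is left out.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def translate(x, shift_bins):
    """Assumed translator: move every positive-frequency bin up or down by shift_bins."""
    n = len(x)
    half = n // 2
    spectrum = dft(x)
    shifted = [0j] * n
    for k in range(1, half):
        j = k + shift_bins
        if 0 < j < half:
            shifted[j] = spectrum[k]
            shifted[n - j] = spectrum[k].conjugate()  # keep the output real
    return idft_real(shifted)


def band_pass(x, lo_bin, hi_bin):
    """Assumed band pass filter: keep only bins lo_bin..hi_bin (and their mirrors)."""
    n = len(x)
    spectrum = dft(x)
    kept = [spectrum[k] if lo_bin <= min(k, n - k) <= hi_bin else 0j
            for k in range(n)]
    return idft_real(kept)


def distort(x, gain=2.0):
    """Assumed distortion generator: tanh soft clipping (a nonlinear transformation)."""
    return [math.tanh(gain * s) for s in x]


def peak_bin(x):
    """Index of the strongest positive-frequency bin."""
    spectrum = dft(x)
    return max(range(1, len(x) // 2), key=lambda k: abs(spectrum[k]))


# Hypothetical test signal: a single tone in bin 5 of a 64-sample frame.
N = 64
tone = [math.cos(2 * math.pi * 5 * t / N) for t in range(N)]

shifted = translate(tone, 3)        # tone moves from bin 5 to bin 8
band = band_pass(shifted, 6, 10)    # band limited spectrum around bin 8
distorted = distort(band)           # this output would feed the speech to text converter
```

In this sketch the un-translated tone (bin 5) falls outside the 6 to 10 band and would be rejected, which is one way to see why the order of translation and filtering matters.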
8. A method of identifying video program material in a video signal comprising:
providing a database of DVS/SAP information or text of DVS/SAP information and providing a database of words from sound tracks;
supplying the video signal to a processor/reader, wherein the reader provides processed DVS/SAP information or text of DVS/SAP information, and wherein the reader coupled to the video signal provides words from sound tracks via a filter bank, distortion generators, and a speech to text processor; and
comparing the processed DVS/SAP information or the text of DVS/SAP information to the DVS/SAP information or the text of DVS/SAP information from the database, and comparing with the data base of words from sound tracks to the words of the sound track provided by the video signal to provide identification of the video program material.
Dependent claims: 9, 10, 11, 12, 13, 14
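The comparing step of claim 8 could be sketched roughly as follows. The claim does not say how the comparison is scored, so the Jaccard word-overlap measure, the function names, and the two-entry databases below are all hypothetical stand-ins chosen for illustration. The reader output (DVS/SAP text and recognized sound track words) is assumed to arrive already as plain text.

```python
import string


def tokenize(text):
    """Lower-case word set with surrounding punctuation stripped."""
    return {w.strip(string.punctuation).lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Word-set overlap in [0, 1]; an assumed similarity measure."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify(dvs_text, soundtrack_text, dvs_db, words_db):
    """Return the best-matching title, or None when nothing overlaps.

    dvs_db and words_db map a program title to reference text; only titles
    present in both databases are scored.
    """
    dvs_tokens = tokenize(dvs_text)
    word_tokens = tokenize(soundtrack_text)
    scores = {}
    for title in sorted(dvs_db.keys() & words_db.keys()):
        scores[title] = (jaccard(dvs_tokens, tokenize(dvs_db[title]))
                         + jaccard(word_tokens, tokenize(words_db[title])))
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical reference databases (invented entries, not real programs).
dvs_db = {
    "Program A": "A man walks into a quiet harbor at dawn.",
    "Program B": "Two children run across a snowy field.",
}
words_db = {
    "Program A": "We sail at first light, bring the charts.",
    "Program B": "Catch me if you can before the snow melts!",
}

# Hypothetical reader output for an unknown video signal.
best = identify("a man walks into the harbor at dawn",
                "we sail at first light",
                dvs_db, words_db)
```

Summing the two scores is only one possible design choice; a real system might weight the DVS/SAP match and the sound track match differently or require both to exceed a threshold.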
15. A system for improving speech recognition of a speech to text converter comprising:
a band pass filter for receiving an audio signal, wherein the band pass filter provides a band limited spectrum of the audio signal;
a frequency translator coupled to an output of the band pass filter, wherein an output of the frequency translator shifts the band limited spectrum of the audio up or down; and
a speech to text converter coupled to the output of the frequency translator to provide improved speech recognition of the band limited spectrum of the audio signal.
Dependent claims: 16, 17, 18
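Claim 15 places the band pass filter ahead of the frequency translator. A time-domain sketch of that ordering is given below, under stated assumptions: the filter is a Hamming-windowed-sinc FIR, and the translator is a plain heterodyne mixer (multiplication by a cosine carrier). A plain mixer yields both an up-shifted and a down-shifted copy of the band; a practical translator would presumably select one of them, which the claim does not detail. The sample rate, band edges, carrier, and test tones are hypothetical values.

```python
import cmath
import math

FS = 8000  # assumed sample rate in Hz


def band_pass_taps(f_lo, f_hi, half_len=50):
    """Hamming-windowed-sinc band pass FIR taps (length 2 * half_len + 1)."""
    def low_pass(fc, m):
        x = 2.0 * fc / FS
        return x if m == 0 else math.sin(math.pi * x * m) / (math.pi * m)

    taps = []
    for k in range(2 * half_len + 1):
        m = k - half_len
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (2 * half_len))
        taps.append((low_pass(f_hi, m) - low_pass(f_lo, m)) * window)
    return taps


def fir_valid(x, taps):
    """Convolve, keeping only outputs where the filter fully overlaps x."""
    n_taps = len(taps)
    return [sum(taps[k] * x[t + n_taps - 1 - k] for k in range(n_taps))
            for t in range(len(x) - n_taps + 1)]


def mix(x, f_carrier):
    """Heterodyne mixer: copies of the input appear at f + f_carrier and f - f_carrier."""
    return [s * math.cos(2 * math.pi * f_carrier * t / FS)
            for t, s in enumerate(x)]


def tone_amplitude(x, f):
    """Amplitude of the component at frequency f, from a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * f * t / FS) for t, s in enumerate(x))
    return 2 * abs(acc) / len(x)


# Hypothetical input: an in-band 600 Hz tone plus an out-of-band 3000 Hz tone.
audio = [math.cos(2 * math.pi * 600 * t / FS) + math.cos(2 * math.pi * 3000 * t / FS)
         for t in range(900)]

band = fir_valid(audio, band_pass_taps(300, 900))  # band limited spectrum, 300-900 Hz
shifted = mix(band, 1000)                          # copies near 1600 Hz and 400 Hz
```

Filtering first means only the 600 Hz component reaches the mixer, so the shifted copies at 1600 Hz and 400 Hz are not contaminated by images of the 3000 Hz tone.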
Specification