Automatic detection and segmentation of music videos in an audio/video stream
Abstract
A “music video parser” automatically detects and segments music videos in a combined audio-video media stream. Automatic detection and segmentation is achieved by integrating shot boundary detection, video text detection and audio analysis to automatically detect temporal boundaries of each music video in the media stream. In one embodiment, song identification information, such as, for example, a song name, artist name, album name, etc., is automatically extracted from the media stream using video optical character recognition (OCR). This information is then used in alternate embodiments for cataloging, indexing and selecting particular music videos, and in maintaining statistics such as the times particular music videos were played, and the number of times each music video was played.
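The cataloging and play-statistics embodiment described above can be illustrated with a minimal sketch. It assumes the artist and song names have already been extracted by video OCR; the class and method names (`VideoCatalog`, `record_play`, `play_count`, `play_times`) are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class PlayStats:
    """Play history for one music video (hypothetical structure)."""
    play_times: list = field(default_factory=list)  # start times, in seconds


class VideoCatalog:
    """Indexes detected music videos by OCR-extracted artist and song name."""

    def __init__(self):
        self._stats = {}

    @staticmethod
    def _key(artist, song):
        # Normalize case and whitespace so small OCR variations still match.
        return (artist.strip().lower(), song.strip().lower())

    def record_play(self, artist, song, start_seconds):
        """Record that the given video started playing at start_seconds."""
        stats = self._stats.setdefault(self._key(artist, song), PlayStats())
        stats.play_times.append(start_seconds)

    def play_count(self, artist, song):
        """Number of times the video was played."""
        stats = self._stats.get(self._key(artist, song))
        return len(stats.play_times) if stats else 0

    def play_times(self, artist, song):
        """Times at which the video was played, in recording order."""
        stats = self._stats.get(self._key(artist, song))
        return list(stats.play_times) if stats else []
```

For example, recording two plays of the same song under slightly different capitalization yields a play count of 2, because keys are normalized before lookup.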
24 Claims
1. A system for automatically detecting and segmenting music videos in an audio-video media stream, comprising:
preprocessing a media stream using a plurality of boundary detection methods to locate a plurality of potential music video boundaries within the media stream;
integrating the potential music video boundaries to identify one or more of the potential music video boundaries that are not likely to be a part of an actual music video;
eliminating from further consideration the identified potential music video boundaries that are not likely to be a part of an actual music video; and
analyzing the content of segments of the media stream between any remaining potential music video boundaries to determine whether the content between any two or more potential music video boundaries represents an actual music video.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
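The integration and elimination steps of claim 1 might be sketched as follows. The claim does not say how candidate boundaries are integrated, so the rule used here is an assumption made only for illustration: candidates from different detectors that fall within a short time tolerance are clustered, and a cluster survives only if enough distinct detectors support it. The function name and the `tolerance` and `min_votes` parameters are hypothetical.

```python
def integrate_boundaries(candidates, tolerance=1.0, min_votes=2):
    """Merge candidate boundaries and drop weakly supported ones.

    candidates: dict mapping a detector name to a list of timestamps (seconds).
    Returns the mean timestamp of each cluster backed by at least
    min_votes distinct detectors (an assumed voting rule).
    """
    tagged = sorted(
        (t, name) for name, times in candidates.items() for t in times
    )

    # Group timestamps whose gap to the previous timestamp is within tolerance.
    clusters = []
    for t, name in tagged:
        if clusters and t - clusters[-1]["times"][-1] <= tolerance:
            clusters[-1]["times"].append(t)
            clusters[-1]["detectors"].add(name)
        else:
            clusters.append({"times": [t], "detectors": {name}})

    # Eliminate clusters that too few detectors agree on.
    return [
        sum(c["times"]) / len(c["times"])
        for c in clusters
        if len(c["detectors"]) >= min_votes
    ]
```

With shot, black-screen, and audio candidates near 10 s and 200 s, and isolated single-detector candidates elsewhere, only the two well-supported boundaries remain for the later segment analysis.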
12. A computer-implemented process for automatically parsing music videos in an audio-video media stream, comprising:
preprocessing a media stream using a plurality of boundary detection methods including one or more of shot detection, black screen detection, audio boundary detection, and video text detection to locate a plurality of potential music video boundaries within the media stream;
combining the potential music video boundaries to identify one or more of the potential music video boundaries that are likely to be a part of an actual music video;
eliminating any potential music video boundaries from further consideration that are not identified as being likely to be a part of an actual music video; and
analyzing the content of segments of the media stream between any two or more remaining potential music video boundaries to determine whether the content between any two or more potential music video boundaries represents an actual music video.
Dependent claims: 13, 14, 15, 16
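Of the boundary detection methods named in claim 12, black screen detection is the simplest to illustrate. The sketch below is one plausible approach under stated assumptions, not the method the patent specifies: a frame counts as black when nearly all of its luminance values fall below a threshold, and a run of consecutive black frames yields one candidate boundary at the midpoint of the run. Frames are assumed to be 2-D lists of luminance values in the range 0 to 255; all function names, parameter names, and thresholds are hypothetical.

```python
def is_black_frame(frame, luma_threshold=20, dark_fraction=0.98):
    """True if at least dark_fraction of the pixels are at or below the threshold."""
    pixels = [p for row in frame for p in row]
    if not pixels:
        return False
    dark = sum(1 for p in pixels if p <= luma_threshold)
    return dark / len(pixels) >= dark_fraction


def black_screen_boundaries(frames, fps, min_run=3):
    """Return one candidate boundary (seconds) per run of >= min_run black frames."""
    boundaries = []
    run_start = None
    for i, frame in enumerate(frames):
        if is_black_frame(frame):
            if run_start is None:
                run_start = i
            continue
        # A non-black frame closes any open run of black frames.
        if run_start is not None and i - run_start >= min_run:
            boundaries.append((run_start + i - 1) / 2 / fps)
        run_start = None
    # Handle a run that extends to the end of the stream.
    if run_start is not None and len(frames) - run_start >= min_run:
        boundaries.append((run_start + len(frames) - 1) / 2 / fps)
    return boundaries
```

The `min_run` guard keeps a single dark frame inside a video from being reported as a boundary.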
17. A computer-readable medium having computer executable instructions for automatically extracting endpoint information for music videos embedded in an audio-video media stream, comprising:
analyzing a media stream using one or more boundary detection methods including shot detection, black screen detection, audio boundary detection, and video text detection to locate a plurality of potential music video boundaries within the media stream;
combining the potential music video boundaries to identify one or more of the potential music video boundaries that are likely to be a part of an actual music video and eliminating any potential music video boundaries from further consideration that are identified as not being likely to be a part of an actual music video;
performing audio discrimination of the media stream to eliminate one or more of the remaining potential music video boundaries where the eliminated potential boundaries are determined to represent portions of the media stream that are not likely to be a part of an actual music video; and
analyzing the content of segments of the media stream between any two or more remaining potential music video boundaries to determine whether a segment of the media stream between any two or more potential music video boundaries represents an actual music video.
Dependent claims: 18, 19, 20
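The audio discrimination step of claim 17 could look roughly like the sketch below. It assumes an upstream audio classifier has already labeled each second of the stream as `"music"`, `"speech"`, or similar; that classifier, the label names, and the window and threshold parameters are assumptions for illustration. A boundary is eliminated when the audio on both sides of it is mostly non-music, since such a boundary likely lies inside a commercial or talk segment rather than at the edge of a music video.

```python
def music_fraction(labels, start, end):
    """Fraction of per-second labels in [start, end) that equal "music"."""
    start = max(0, start)
    end = min(len(labels), end)
    if end <= start:
        return 0.0
    window = labels[start:end]
    return sum(1 for label in window if label == "music") / len(window)


def discriminate_boundaries(boundaries, labels, window=10, min_music=0.6):
    """Keep boundaries that have predominantly musical audio on at least one side.

    boundaries: candidate timestamps in seconds.
    labels: one audio class label per second of the stream (assumed input).
    """
    kept = []
    for b in boundaries:
        idx = int(b)
        before = music_fraction(labels, idx - window, idx)
        after = music_fraction(labels, idx, idx + window)
        if max(before, after) >= min_music:
            kept.append(b)
    return kept
```

For a stream with 30 s of speech, 60 s of music, and 30 s of speech, candidates at 15 s and 105 s are dropped while those at 30 s and 90 s, which border the music, are kept.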
21. A system for automatically parsing music videos in a composite audio-video media stream, comprising:
preprocessing a composite audio-video media stream using a plurality of boundary detection methods to locate a plurality of potential music video boundaries within the media stream; and
analyzing the content of segments of the media stream between any located potential music video boundaries to determine whether the content between any two or more potential music video boundaries represents a complete music video.
Dependent claims: 22, 23, 24
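The final segment analysis in claim 21 is stated only in general terms. As a hedged sketch, a segment between two boundaries could be accepted as a complete music video when its duration is plausible for a song and its audio is almost entirely music. Both criteria, the duration limits, the `min_music` threshold, and the per-second `labels` input are assumptions for illustration. Because the claim allows a video to span "two or more" boundaries, the sketch also tries end boundaries beyond the immediately following one.

```python
def find_music_videos(boundaries, labels, min_len=120.0, max_len=420.0,
                      min_music=0.9):
    """Return (start, end) pairs judged to be complete music videos.

    boundaries: candidate timestamps in seconds.
    labels: one audio class label per second of the stream (assumed input).
    """
    ordered = sorted(boundaries)
    videos = []
    covered_until = float("-inf")
    for i, start in enumerate(ordered):
        if start < covered_until:
            continue  # this boundary lies inside an already accepted video
        for end in ordered[i + 1:]:
            duration = end - start
            if duration < min_len:
                continue  # too short; try a later end boundary
            if duration > max_len:
                break  # later end boundaries would only be longer
            segment = labels[int(start):int(end)]
            if segment and segment.count("music") / len(segment) >= min_music:
                videos.append((start, end))
                covered_until = end
                break
    return videos
```

Given boundaries at 0, 30, 240, and 300 s over a stream with music only between 30 s and 240 s, the sketch reports the single segment (30, 240).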
Specification