Apparatus and method for classification and segmentation of audio content, based on the audio signal
Abstract
An apparatus for classifying an input audio signal into audio contents of a first and second class, comprising an audio segmentation module adapted to segment said input audio signal into segments of a predetermined length; a feature computation module adapted to calculate for the segments features characterizing said audio input signal; a threshold comparison module adapted to generate a feature vector for each of said one or more segments based on a plurality of predetermined thresholds, the thresholds including for each of the audio contents of the first class and of the second class a substantially near certainty threshold, a substantially high certainty threshold, and a substantially low certainty threshold; and a classification module adapted to analyze the feature vector and classify each one of said one or more segments as audio contents of the first class, of the second class, or as non-decisive audio contents.
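The first two modules of the abstract (segmentation into segments of a predetermined length, then per-segment feature computation) can be illustrated with a minimal sketch. The source does not name the features at this point, so RMS energy and zero-crossing rate are hypothetical stand-ins, and the function names `segment_signal` and `compute_features` are illustrative only. Dropping a trailing partial segment is likewise an assumption.

```python
import math
from typing import Dict, List


def segment_signal(samples: List[float], segment_len: int) -> List[List[float]]:
    """Split samples into consecutive segments of a predetermined length.

    Assumption: a trailing partial segment is dropped.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_full = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


def compute_features(segment: List[float]) -> Dict[str, float]:
    """Compute hypothetical features (not named in the source) for one segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    # Root-mean-square energy of the segment.
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    # Fraction of adjacent sample pairs whose sign differs.
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(segment) - 1) if len(segment) > 1 else 0.0
    return {"rms": rms, "zcr": zcr}
```

Each segment's feature dictionary would then be handed to the threshold comparison step.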
18 Claims
1. An apparatus for classifying an input audio signal into audio contents of a first class and of a second class, the apparatus comprising:
an audio segmentation module adapted to segment said input audio signal into one or more segments of a predetermined length;
a feature computation module adapted to calculate for each of said one or more segments one or more features characterizing said audio input signal;
a threshold comparison module adapted to generate a feature vector for each of said one or more segments by comparing the one or more features in each segment with a plurality of predetermined thresholds, the plurality of predetermined thresholds including for each of the audio contents of the first class and of the second class a substantially near certainty threshold, a substantially high certainty threshold, and a substantially low certainty threshold, wherein each threshold of the plurality of thresholds represents a statistical measure relating to the one or more features; and
a classification module adapted to analyze the feature vector and classify each one of said one or more segments as audio contents of the first class, of the second class, or as non-decisive audio contents;
wherein a segment is classified as audio contents of the first class when the feature vector includes at least one feature surpassing the substantially near certainty threshold of the first class and no features surpassing the substantially near certainty threshold and the substantially high certainty threshold of the second class;
wherein the classification module is further adapted, at one or more subsequent intermediate classification stages, to classify a non-decisive segment as audio contents of the first class when a majority of features in the feature vector surpass the substantially high certainty threshold of the first class and no features surpass the substantially high certainty threshold of the second class; and
wherein the classification module is further adapted to, at a subsequent separation classification stage, classify segments of non-decisive audio contents into audio contents of the first class or of the second class.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
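The threshold comparison and the first two classification stages recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The assumptions are: the three thresholds of a class are nested, so surpassing the near certainty threshold implies surpassing the high and low ones; "surpassing" is directional per feature (the hypothetical `direction` field); and the mirrored rule for the second class is assumed by symmetry, since the claim spells out only the first-class rule. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

NONE, LOW, HIGH, NEAR = 0, 1, 2, 3  # certainty levels, weakest to strongest


@dataclass(frozen=True)
class ClassThresholds:
    low: float
    high: float
    near: float
    direction: int = 1  # +1: larger values are evidence; -1: smaller values are

    def level(self, value: float) -> int:
        """Return the strongest certainty level whose threshold value surpasses."""
        for lvl, t in ((NEAR, self.near), (HIGH, self.high), (LOW, self.low)):
            if self.direction * (value - t) >= 0:
                return lvl
        return NONE


# One entry per feature: (level for first class, level for second class).
FeatureVector = List[Tuple[int, int]]


def feature_vector(
    features: Dict[str, float],
    thresholds: Dict[str, Tuple[ClassThresholds, ClassThresholds]],
) -> FeatureVector:
    """Compare each feature with the thresholds of both classes."""
    return [
        (thresholds[name][0].level(v), thresholds[name][1].level(v))
        for name, v in features.items()
    ]


def first_at_initial(vec: FeatureVector) -> bool:
    """At least one feature at NEAR for the first class, none at HIGH or NEAR for the second."""
    return any(a == NEAR for a, _ in vec) and not any(b >= HIGH for _, b in vec)


def first_at_intermediate(vec: FeatureVector) -> bool:
    """A majority of features at HIGH or above for the first class, none for the second."""
    majority = 2 * sum(a >= HIGH for a, _ in vec) > len(vec)
    return majority and not any(b >= HIGH for _, b in vec)


def classify(vec: FeatureVector, rule: Callable[[FeatureVector], bool]) -> str:
    """Apply a stage rule; the mirrored second-class test is an assumption."""
    if rule(vec):
        return "first"
    if rule([(b, a) for a, b in vec]):
        return "second"
    return "non-decisive"
```

A segment left as "non-decisive" by `classify(vec, first_at_initial)` would be retried with `first_at_intermediate`, and anything still undecided is passed on to the separation stage.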
10. A method for segmenting an input audio signal into audio contents of a first class and of a second class, the method comprising:
separating said input audio signal into one or more segments of a predetermined length;
calculating for each of said one or more segments one or more features characterizing said audio input signal;
generating a feature vector for each of said one or more segments by comparing the one or more features in each segment with a plurality of predetermined thresholds, the plurality of predetermined thresholds including for each of the audio contents of the first class and of the second class a substantially near certainty threshold, a substantially high certainty threshold, and a substantially low certainty threshold, wherein each threshold of the plurality of thresholds represents a statistical measure relating to the one or more features; and
analyzing the feature vector and classifying each one of said one or more segments as audio contents of the first class, of the second class, or as non-decisive audio contents;
wherein a segment is classified as audio contents of the first class when the feature vector includes at least one feature surpassing the substantially near certainty threshold of the first class and no features surpassing the substantially near certainty threshold and the substantially high certainty threshold of the second class;
wherein the classification module is further adapted, at one or more subsequent intermediate classification stages, to classify a non-decisive segment as audio contents of the first class when a majority of features in the feature vector surpass the substantially high certainty threshold of the first class and no features surpass the substantially high certainty threshold of the second class; and
wherein the classification module is further adapted to, at a subsequent separation classification stage, classify segments of non-decisive audio contents into audio contents of the first class or of the second class.
Dependent claims: 11, 12, 13, 14, 15, 16, 17
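The claim requires a final separation stage that assigns every remaining non-decisive segment to one of the two classes, but the claim text does not say how. The sketch below substitutes a simple stand-in rule, clearly an assumption: each non-decisive segment takes the label of the nearest already-decided segment in time (ties go to the earlier one). It then merges consecutive equal labels into contiguous regions to show what a segmentation output might look like. The function names and the `fallback` parameter are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def resolve_non_decisive(labels: List[str], fallback: str = "first") -> List[str]:
    """Assumed separation rule: copy the label of the nearest decided segment."""
    decided = [i for i, lab in enumerate(labels) if lab != "non-decisive"]
    if not decided:
        # Nothing to borrow from; use the caller-supplied fallback class.
        return [fallback] * len(labels)
    out = []
    for i, lab in enumerate(labels):
        if lab == "non-decisive":
            # Nearest decided index; on equal distance prefer the earlier one.
            nearest = min(decided, key=lambda j: (abs(j - i), j))
            lab = labels[nearest]
        out.append(lab)
    return out


def merge_runs(labels: List[str], segment_seconds: float) -> List[Tuple[float, float, str]]:
    """Merge consecutive equal labels into (start, end, label) regions."""
    runs = []
    start = 0
    for lab, group in groupby(labels):
        n = len(list(group))
        runs.append((start * segment_seconds, (start + n) * segment_seconds, lab))
        start += n
    return runs


labels = ["first", "non-decisive", "non-decisive", "second", "second"]
final = resolve_non_decisive(labels)
regions = merge_runs(final, 1.0)
# regions == [(0.0, 2.0, "first"), (2.0, 5.0, "second")]
```

Any other rule that maps non-decisive segments onto exactly one of the two classes would satisfy the same interface.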
18. A system for segmenting audio content into a first class and a second class, the system comprising:
an apparatus for segmenting an input audio signal into audio contents of a first class and of a second class, the apparatus comprising an audio segmentation module adapted to separate said input audio signal into one or more segments of a predetermined length;
a feature computation module adapted to calculate for each segment in said one or more segments one or more features characterizing said audio input signal;
a threshold comparison module adapted to generate a feature vector for each segment in said one or more segments by comparing the one or more features in each segment with a plurality of predetermined thresholds, the plurality of predetermined thresholds including for each of the audio contents of the first class and of the second class a substantially near certainty threshold, a substantially high certainty threshold, and a substantially low certainty threshold, wherein each threshold of the plurality of thresholds represents a statistical measure relating to the one or more features; and
a classification module adapted to analyze the feature vector and classify each segment in said one or more segments as audio contents of the first class, of the second class, or as non-decisive audio contents;
wherein a segment is classified as audio contents of the first class when the feature vector includes at least one feature surpassing the substantially near certainty threshold of the first class and no features surpassing the substantially near certainty threshold and the substantially high certainty threshold of the second class;
wherein the classification module is further adapted, at one or more subsequent intermediate classification stages, to classify a non-decisive segment as audio contents of the first class when a majority of features in the feature vector surpass the substantially high certainty threshold of the first class and no features surpass the substantially high certainty threshold of the second class; and
wherein the classification module is further adapted to, at a subsequent separation classification stage, classify segments of non-decisive audio contents into audio contents of the first class or of the second class;
an audio interface unit for transferring the input audio signal from an audio source to the apparatus; and
a processing unit for processing the audio content classified into the first class and the second class.
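The system of claim 18 adds two units around the apparatus: an audio interface unit that delivers the signal, and a processing unit that consumes the classified content. A minimal wiring sketch, assuming each unit can be modelled as a plain callable (the type aliases and `run_system` are hypothetical names, and the stubs stand in for real components):

```python
from typing import Callable, List, Tuple

Samples = List[float]
Labeled = List[Tuple[Samples, str]]  # (segment, class label)


def run_system(
    audio_interface: Callable[[], Samples],
    apparatus: Callable[[Samples], Labeled],
    processing_unit: Callable[[Labeled], None],
) -> None:
    """Transfer audio from the source to the apparatus, then process the result."""
    samples = audio_interface()
    labeled = apparatus(samples)
    processing_unit(labeled)


# Stub units for illustration only.
received: Labeled = []
run_system(
    audio_interface=lambda: [0.1, 0.2, 0.3, 0.4],
    apparatus=lambda s: [(s[:2], "first"), (s[2:], "second")],
    processing_unit=received.extend,
)
```

Keeping the three units behind narrow callables lets the classifier be swapped or tested without a real audio source.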
Specification