Apparatus and method for robust classification of audio signals, and method for establishing and operating an audio-signal database, as well as computer program
Abstract
An apparatus for producing a fingerprint signal from an audio signal includes a means for calculating energy values for frequency bands of segments of the audio signal which are successive in time, so as to obtain, from the audio signal, a sequence of vectors of energy values, a means for scaling the energy values to obtain a sequence of scaled vectors, and a means for temporal filtering of the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint, or from which the fingerprint may be derived. Thus, a fingerprint is produced which is robust against disturbances due to problems associated with coding or with transmission channels, and which is especially suited for mobile radio applications.
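The processing chain summarized in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the frame length, the band edges, the naive DFT, and the moving average used as the temporal filter. The function names `band_energies`, `scale`, and `temporal_lowpass` are hypothetical.

```python
import cmath
import math


def band_energies(samples, frame_len, bands):
    """Return one vector of band energies per non-overlapping frame.

    `bands` is a list of (lo_bin, hi_bin) DFT-bin ranges, hi exclusive.
    """
    vectors = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Naive DFT power spectrum (O(n^2)); fine for a short illustration.
        power = []
        for k in range(frame_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            power.append(abs(acc) ** 2)
        vectors.append([sum(power[lo:hi]) for lo, hi in bands])
    return vectors


def scale(vectors, floor=1e-12):
    """Logarithmic scaling of every energy value (floor avoids log(0))."""
    return [[math.log10(max(e, floor)) for e in vec] for vec in vectors]


def temporal_lowpass(vectors, width=3):
    """Moving average over time, applied to each band separately."""
    out = []
    for t in range(len(vectors)):
        window = vectors[max(0, t - width + 1):t + 1]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


# Example: a 4-frame test tone whose energy sits in the lower band.
frame_len = 16
tone = [math.sin(2 * math.pi * 2 * n / frame_len) for n in range(4 * frame_len)]
bands = [(1, 4), (4, 9)]  # hypothetical band edges in DFT bins
energies = band_energies(tone, frame_len, bands)
filtered = temporal_lowpass(scale(energies))
```

Each entry of `filtered` is one smoothed vector of log energies, one component per band. A real implementation would presumably use an FFT and overlapping, windowed frames.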
32 Claims
1. An apparatus for producing a fingerprint signal from an audio signal, comprising:
a calculator for calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; a scaler for scaling the energy values to obtain a sequence of scaled vectors; wherein the scaler includes a means for taking the logarithm and a suppressor for suppressing a steady component which is connected downstream of the means for taking the logarithm, wherein the suppressor for suppressing a steady component includes a high-pass filter; a low-pass filter for temporally filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and a quantizer connected downstream of the filters and configured to quantize the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence, wherein the quantizer is configured such that a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value.
(Dependent claims: 2-21.)
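One plausible reading of why the suppressor sits downstream of the logarithm: a constant amplitude gain multiplies every band energy by a constant factor, the logarithm turns that factor into an additive offset, and a high-pass filter removes additive offsets. The sketch below assumes a first-order difference as the high-pass filter, which is an illustrative choice and not one stated in the claim; `log_highpass` is a hypothetical name.

```python
import math


def log_highpass(energies):
    """Log-scale one band's energy sequence, then take a first-order
    difference along time (a very simple high-pass filter)."""
    logs = [math.log10(e) for e in energies]
    return [b - a for a, b in zip(logs, logs[1:])]


band = [1.0, 4.0, 2.0, 8.0]       # made-up energies of one band over time
louder = [9.0 * e for e in band]  # same content with an amplitude gain of 3

original = log_highpass(band)
scaled = log_highpass(louder)
# The two outputs agree up to floating-point rounding.
```

Under these assumptions the louder copy yields the same filtered sequence as the original, which is the kind of robustness against level changes that a fingerprint needs.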
22. A method for producing a fingerprint signal from an audio signal, comprising:
calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors wherein scaling comprises taking the logarithm of the energy values; and suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value.
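The quantizing step, in which a level for a high value is wider than a level for a small value, can be sketched with a table of decision thresholds. The thresholds below are hypothetical, and the sketch assumes positive input values for simplicity; the claim itself does not fix either.

```python
import bisect

# Hypothetical decision thresholds; the interval widths (1, 2, 4, 8) grow
# with the value, so large values are quantized more coarsely.
EDGES = [1.0, 2.0, 4.0, 8.0, 16.0]


def quantize(value, edges=EDGES):
    """Return the index of the quantization interval containing `value`.

    Values below the first edge map to 0 and values at or above the last
    edge map to len(edges).
    """
    return bisect.bisect_right(edges, value)


widths = [hi - lo for lo, hi in zip(EDGES, EDGES[1:])]
codes = [quantize(v) for v in (0.5, 1.5, 3.0, 5.0, 7.9, 12.0, 20.0)]
```

Here 5.0 and 7.9 share one code because they fall into the same wide interval, while 1.5 and 3.0 are separated by the narrower intervals at the low end.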
23. An apparatus for characterizing an audio signal, comprising:
an apparatus for producing a fingerprint signal from an audio signal, comprising: a calculator for calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; a scaler for scaling the energy values to obtain a sequence of scaled vectors, wherein the scaler includes a means for taking the logarithm and a suppressor for suppressing a steady component which is connected downstream of the means for taking the logarithm, wherein the suppressor for suppressing a steady component includes a high-pass filter; a low-pass filter for temporally filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and a quantizer connected downstream of the filters and configured to quantize the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence, wherein the quantizer is configured such that a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value; and a statement-maker for making a statement about the audio content of the audio signal on the grounds of the fingerprint signal.
24. A method for characterizing an audio signal, comprising:
producing a fingerprint signal using a method for producing a fingerprint signal from an audio signal, the method comprising: calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors, wherein scaling comprises taking the logarithm of the energy values; suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value; and making a statement about the audio content of the audio signal on the grounds of the fingerprint signal.
25. A method for establishing an audio database, comprising:
producing a fingerprint for each audio signal to be captured in the audio database, using a method for producing a fingerprint signal from an audio signal, the method comprising: calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors, wherein scaling comprises taking the logarithm of the energy values; suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value; and, for each audio signal to be captured, storing in the audio database the fingerprint as well as further information which belongs to the audio signal, so that an association of a fingerprint and the corresponding information is given.
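The storing step, which associates each fingerprint with further information about its audio signal, can be sketched as a plain mapping. This is only one possible layout, assumed here for illustration; `build_database` is a hypothetical name and the entries are made-up data.

```python
def build_database(entries):
    """Map each fingerprint to the information stored alongside it.

    `entries` is an iterable of (fingerprint, info) pairs, where a
    fingerprint is a sequence of quantized vectors.
    """
    database = {}
    for fingerprint, info in entries:
        # Nested lists are not hashable, so store an immutable copy as key.
        key = tuple(tuple(vec) for vec in fingerprint)
        database[key] = info
    return database


db = build_database([
    ([[0, 1], [2, 1]], {"title": "Example A"}),  # made-up data
    ([[3, 3], [1, 0]], {"title": "Example B"}),
])
```

An exact-match dictionary only serves to show the association; a search that tolerates distortions would compare fingerprints by similarity rather than by equality.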
26. A method for obtaining information on the grounds of an audio-signal database, wherein associated fingerprint signals having been formed by a method for producing a fingerprint signal from an audio signal, the method comprising:
calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors, wherein scaling comprises taking the logarithm of the energy values; suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value, are stored for several audio signals, and for obtaining a predefined search audio signal, the method comprising: forming a search fingerprint signal belonging to the search audio signal using a method for producing a fingerprint signal from an audio signal, comprising: calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors, wherein scaling comprises taking the logarithm of the energy values; suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value; comparing the search fingerprint signal with at least one fingerprint signal stored in the database; and making a statement about the similarity thereof.
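The comparing step and the statement about similarity can be sketched as a nearest-neighbour search. The distance measure (mean absolute difference) and the threshold are assumptions chosen for illustration, since the claim does not prescribe either; `distance` and `search` are hypothetical names and the stored data is made up.

```python
def distance(fp_a, fp_b):
    """Mean absolute difference between two equally shaped fingerprints."""
    total = count = 0
    for vec_a, vec_b in zip(fp_a, fp_b):
        for a, b in zip(vec_a, vec_b):
            total += abs(a - b)
            count += 1
    return total / count


def search(database, query, threshold=0.5):
    """Return (info, distance) of the closest stored fingerprint, or
    (None, distance) when even the best candidate exceeds `threshold`."""
    best_info, best_dist = None, float("inf")
    for fingerprint, info in database:
        d = distance(fingerprint, query)
        if d < best_dist:
            best_info, best_dist = info, d
    if best_dist > threshold:
        return None, best_dist
    return best_info, best_dist


stored = [
    ([[0, 1], [2, 1]], "Example A"),  # made-up fingerprints and labels
    ([[3, 3], [1, 0]], "Example B"),
]
hit = search(stored, [[0, 1], [2, 2]])   # one component off by 1
miss = search(stored, [[9, 9], [9, 9]])  # far from every stored entry
```

The returned label plays the role of the information associated with the best-matching stored fingerprint; a returned `None` corresponds to the statement that nothing sufficiently similar is stored.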
27. The method as claimed in claim 29, further comprising:
outputting metadata to the audio signals on which the fingerprint signals stored in the database are based, depending on the statement about the similarity of the search fingerprint signal with the fingerprint signals stored in the database.
28. A computer readable medium having stored thereon a computer program having a program code for performing the method for producing a fingerprint signal from an audio signal, the method comprising:
calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors wherein scaling comprises taking the logarithm of the energy values; suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived, and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or from the signal based thereon, wherein a width of a quantization level for a high energy value is larger than a width of a quantization level for a small energy value when the computer program runs on a computer.
29. An apparatus for producing a fingerprint signal from an audio signal, comprising:
a calculator for calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; a scaler for scaling the energy values to obtain a sequence of scaled vectors wherein the scaler includes a means for taking the logarithm and a suppressor for suppressing a steady component which is connected downstream of the means for taking the logarithm, wherein the suppressor for suppressing a steady component includes a high pass filter; a low-pass filter for temporally filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and a quantizer connected downstream of the filters and configured to quantize the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence, wherein the quantizer is configured to use quantization levels on the grounds of an amplitude statistic, the quantization levels being adapted in accordance with the amplitude statistic of the signal to be quantized, which statistic includes a statement about a relative frequency of values of the signal to be quantized, a fine classification of the quantizing levels being effected for a range of values with values of the signal to be quantized having a high relative abundance, and a coarse classification of the quantization levels being effected for a range of values with values of the signal to be quantized having a low relative abundance.
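Quantization levels adapted to an amplitude statistic, fine where values are frequent and coarse where they are rare, can be approximated by placing the thresholds at quantiles of a training sample. Equal-frequency binning is an assumption chosen for this sketch and not necessarily the adaptation rule intended by the claim; `quantile_edges` is a hypothetical name and the sample is made up.

```python
import bisect


def quantile_edges(training_values, levels):
    """Place decision thresholds so each level receives roughly the same
    number of training values (equal-frequency binning)."""
    ordered = sorted(training_values)
    n = len(ordered)
    return [ordered[(i * n) // levels] for i in range(1, levels)]


def quantize(value, edges):
    """Index of the quantization level that `value` falls into."""
    return bisect.bisect_right(edges, value)


# Made-up amplitude sample: most values cluster near 0, a few are large.
sample = [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 2.0, 5.0, 9.0, 20.0]
edges = quantile_edges(sample, levels=4)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
```

With this sample the thresholds crowd into the densely populated range near zero, so the level covering the rare large values is much wider than the one covering the frequent small values.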
30. An apparatus for producing a fingerprint signal from an audio signal, comprising:
a calculator for calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; a scaler for scaling the energy values to obtain a sequence of scaled vectors wherein the scaler includes a means for taking the logarithm and a suppressor for suppressing a steady component which is connected downstream of the means for taking the logarithm, wherein the suppressor for suppressing a steady component includes a high-pass filter; a low-pass filter for temporally filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and a quantizer connected downstream of the filters and configured to quantize the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence, wherein the quantizer comprises such a classification of the quantization levels that a maximum relative quantization error is identical for large and small energy values within a tolerance range.
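A classification of levels whose maximum relative error is the same for small and large values can be obtained with geometrically spaced thresholds. The sketch assumes thresholds at integer powers of a ratio and reconstruction at the interval midpoint, both chosen for illustration; `quantize_log` and `relative_error` are hypothetical names.

```python
import math


def quantize_log(value, ratio=2.0):
    """Quantize a positive value with geometric thresholds ratio**k.

    Returns (level index k, reconstruction value), where the
    reconstruction is the midpoint of [ratio**k, ratio**(k + 1)).
    """
    k = math.floor(math.log(value) / math.log(ratio))
    # Guard against rounding of the logarithm right at a threshold.
    if ratio ** k > value:
        k -= 1
    elif ratio ** (k + 1) <= value:
        k += 1
    lo, hi = ratio ** k, ratio ** (k + 1)
    return k, (lo + hi) / 2


def relative_error(value, ratio=2.0):
    """Relative deviation of the reconstruction from the input value."""
    _, recon = quantize_log(value, ratio)
    return abs(recon - value) / value


# Within each interval the relative error is largest at the lower threshold,
# where it equals (ratio - 1) / 2, whatever the magnitude of the value.
bound = (2.0 - 1) / 2
small = max(relative_error(v) for v in (0.011, 0.013, 0.02))
large = max(relative_error(v) for v in (1100.0, 1300.0, 2000.0))
```

Because every interval is a scaled copy of its neighbour, the worst-case relative error depends only on the ratio, not on whether the value is near 0.01 or near 1000.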
31. A method for producing a fingerprint signal from an audio signal, comprising:
calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors wherein scaling comprises taking the logarithm of the energy values, and suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; and temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or the signal based thereon using such a classification of the quantization levels that a maximum relative quantization error is identical for large and small energy values within a tolerance range.
32. A method for producing a fingerprint signal from an audio signal, comprising:
calculating energy values for frequency bands of segments of the audio signal which are successive in time, an energy value for a frequency band depending on an energy of the audio signal in the frequency band, so as to obtain a sequence of vectors of energy values from the audio signal, a vector component being an energy value in a frequency band; scaling the energy values to obtain a sequence of scaled vectors wherein scaling comprises taking the logarithm of the energy values; and suppressing, downstream with respect to taking the logarithm, a steady component, using a high-pass filtering operation; temporally low-pass filtering the sequence of scaled vectors to obtain a filtered sequence which represents the fingerprint signal, or from which the fingerprint signal may be derived; and quantizing the filtered sequence or a signal based thereon so as to derive the fingerprint signal from the filtered sequence or the signal based thereon, wherein quantization levels on the grounds of an amplitude statistic are used, the quantization levels being adapted in accordance with the amplitude statistic of the signal to be quantized, which statistic includes a statement about a relative frequency of values of the signal to be quantized, a fine classification of the quantizing levels being effected for a range of values with values of the signal to be quantized having a high relative abundance, and a coarse classification of the quantization levels being effected for a range of values with values of the signal to be quantized having a low relative abundance.
Specification