Sound features extracting apparatus, sound data registering apparatus, sound data retrieving apparatus, and methods and programs for implementing the same
First Claim
1. A sound features extracting apparatus comprising:
an audio signal input part which receives an audio signal of sound data including predetermined time frames;
a first frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input part, and which outputs a signal for each of the frequency bands;
a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said first frequency analyzer, and which sums said rise components to determine a rise component for each time frame;
an auto-correlation function calculator which calculates an auto-correlation function of said rise components;
a second frequency analyzer which analyzes said auto-correlation function calculated by said auto-correlation function calculator, and which outputs a signal for each of the frequency bands;
a direct-current component detector which detects a direct current component in said signal outputted from said second frequency analyzer;
a peak detector which detects a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzer; and
a ratio calculator which divides the power of said output of said direct-current component detector by the power of said output of said peak detector, wherein said sound features extracting apparatus calculates a non-periodic property of sound emission which is a primary feature of said audio signal.
Abstract
The present invention provides a method and an apparatus for retrieving sound data desired by a user on the basis of the user's subjective impression of that sound data. The subjective impression of the desired sound data is entered by the user and converted to a numerical value. A target sound impression value, which is a numerical form of the impression of the sound data, is calculated from the numerical value. The target sound impression value is then used as a retrieval key for accessing a sound database in which the audio signals and the sound features of a plurality of sound data are stored. This allows the desired sound data to be retrieved on the basis of the user's subjective impression of the sound data.
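The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that every stored item already carries a numeric impression vector and that the closest vectors (Euclidean distance) are the best matches; the function name `retrieve`, the vector layout, and the distance measure are hypothetical and are not taken from the patent.

```python
import math

def retrieve(target, database, top_n=3):
    """Return the names of the top_n items closest to the target impression.

    target   -- tuple of numbers (hypothetical target sound impression value)
    database -- dict mapping an item name to its impression vector (assumed layout)
    """
    ranked = sorted(database, key=lambda name: math.dist(target, database[name]))
    return ranked[:top_n]

# Hypothetical two-dimensional impression vectors, for illustration only.
example_db = {"a": (0.9, 0.1), "b": (0.2, 0.8), "c": (0.8, 0.2)}
closest = retrieve((1.0, 0.0), example_db, top_n=2)
```

A nearest-neighbour lookup is only one possible reading of "retrieving key"; the abstract does not state how the key is matched against the database.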
62 Citations
17 Claims
-
1. A sound features extracting apparatus comprising:
-
an audio signal input part which receives an audio signal of sound data including predetermined time frames; a first frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input part, and which outputs a signal for each of the frequency bands; a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said first frequency analyzer, and which sums said rise components to determine a rise component for each time frame; an auto-correlation function calculator which calculates an auto-correlation function of said rise components; a second frequency analyzer which analyzes said auto-correlation function calculated by said auto-correlation function calculator, and which outputs a signal for each of the frequency bands; a direct-current component detector which detects a direct current component in said signal outputted from said second frequency analyzer; a peak detector which detects a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzer; and a ratio calculator which divides the power of said output of said direct-current component detector by the power of said output of said peak detector, wherein said sound features extracting apparatus calculates a non-periodic property of sound emission which is a primary feature of said audio signal.
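The signal chain recited in claim 1 can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes the first frequency analyzer has already produced a per-frame list of band powers, takes a "rise component" to be the positive frame-to-frame increase in each band, uses a naive discrete Fourier transform as the second frequency analyzer, and searches the peak over non-DC bins only. All function names are hypothetical.

```python
import cmath

def rise_components(band_power):
    """band_power[t][b] is the power of band b in frame t (assumed input).
    Returns one value per frame: the sum over bands of positive increases."""
    rises = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(band_power, band_power[1:]):
        rises.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return rises

def autocorrelation(x):
    """Unnormalised auto-correlation for lags 0 .. len(x) - 1."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(n)]

def power_spectrum(x):
    """Power of each DFT bin from 0 (direct current) up to n // 2."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def non_periodicity(band_power):
    """Direct-current power divided by the largest non-DC power."""
    spectrum = power_spectrum(autocorrelation(rise_components(band_power)))
    dc_power = spectrum[0]
    peak_power = max(spectrum[1:])  # assumption: peak searched over non-DC bins
    return dc_power / peak_power if peak_power > 0 else float("inf")

# Illustrative single-band inputs: one onset every 4 frames vs. irregular onsets.
periodic = [[1.0] if t % 4 == 0 else [0.0] for t in range(32)]
irregular = [[1.0] if t in (3, 5, 10, 17, 20, 29) else [0.0] for t in range(32)]
```

Under these assumptions the strictly periodic pattern yields a ratio of about 1, and the irregular pattern yields a larger value, which is consistent with treating the ratio as a measure of non-periodicity.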
-
-
2. A sound features extracting apparatus comprising:
-
an audio signal input part which receives an audio signal of sound data including predetermined time frames; a frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input part, and which outputs a signal for each of the frequency bands; a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said frequency analyzer, and which sums said rise components to determine a rise component for each time frame; an auto-correlation function calculator which calculates an auto-correlation function of said rise components obtained from said rise component calculator; a peak calculator which calculates a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculator; a tempo interval time candidate calculator which calculates some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculator; a cycle structure calculator which calculates a cycle structure of said sound data from said peaks of said auto-correlation function calculated by said peak calculator; and a tempo interval time detector which determines a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculator with reference to said signal outputted from said rise component calculator and said signal outputted from said cycle structure calculator, wherein said sound features extracting apparatus calculates a tempo interval time which is a primary feature of said audio signal.
Dependent claims: 3, 4
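The peak calculator and the tempo interval time candidate calculator of claim 2 can be illustrated with a short sketch. It assumes that peaks are local maxima of the auto-correlation function, that candidates are the lags of the strongest peaks converted to seconds, and that `frame_seconds` (the duration of one time frame) is known; these choices and all names are hypothetical. The cycle structure calculator and the final tempo interval time detector are not shown, because the claim does not specify how they weigh the candidates.

```python
def acf_peaks(acf):
    """Return (lag, amplitude) for each local maximum of acf, skipping lag 0."""
    return [(lag, acf[lag]) for lag in range(1, len(acf) - 1)
            if acf[lag - 1] < acf[lag] >= acf[lag + 1]]

def tempo_candidates(acf, frame_seconds, max_candidates=3):
    """Lags of the strongest peaks, in seconds, strongest first."""
    peaks = sorted(acf_peaks(acf), key=lambda peak: peak[1], reverse=True)
    return [lag * frame_seconds for lag, _ in peaks[:max_candidates]]

# Illustrative auto-correlation with peaks at lags 4, 8 and 12.
example_acf = [7, 0, 0, 0, 6, 0, 0, 0, 5, 0, 0, 0, 4, 0, 0, 0]
candidates = tempo_candidates(example_acf, frame_seconds=0.125)
```

With 0.125-second frames, the three candidates in this example are 0.5, 1.0 and 1.5 seconds.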
-
-
5. A sound features extracting apparatus comprising:
-
an audio signal input part which receives an audio signal of sound data including predetermined time frames; a first frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input part, and which outputs a signal for each of the frequency bands; a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said first frequency analyzer, and which sums said rise components to determine said rise component for each time frame; an auto-correlation function calculator which calculates an auto-correlation function of said rise components outputted from said rise component calculator; a first peak calculator which calculates a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculator; a tempo interval time candidate calculator which calculates some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said first peak calculator; a cycle structure calculator which calculates a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said first peak calculator; a tempo interval time detector which determines a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculator with reference to said signal outputted from said rise component calculator and said signal outputted from said cycle structure calculator; a second frequency analyzer which analyzes said auto-correlation function and which outputs a signal for each of the frequency bands; a second peak detector which detects a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzer; and a ratio calculator which calculates a ratio between said tempo interval time of said sound data outputted from said tempo interval time detector and said values outputted from said second peak detector, wherein said sound features extracting apparatus calculates a ratio of the tempo interval time which is a primary feature of the audio signal.
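The ratio calculator of claim 5 can be illustrated as below. The sketch assumes that the "values outputted from said second peak detector" are represented by the period of the strongest non-DC bin in a naive discrete Fourier transform of the auto-correlation function, and that the ratio is the tempo interval time divided by that period. This interpretation, the parameter `frame_seconds`, and the function names are assumptions made for illustration.

```python
import cmath

def peak_period_seconds(acf, frame_seconds):
    """Period, in seconds, of the strongest non-DC DFT bin of acf."""
    n = len(acf)
    best_bin, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        value = sum(acf[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        power = abs(value) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return n * frame_seconds / best_bin

def tempo_interval_ratio(tempo_interval_seconds, acf, frame_seconds):
    """Tempo interval time divided by the dominant period of the spectrum."""
    return tempo_interval_seconds / peak_period_seconds(acf, frame_seconds)

# Illustrative auto-correlation that repeats every 4 frames (0.5 s at 0.125 s/frame).
example_acf = [2, 1, 0, 1] * 8
ratio = tempo_interval_ratio(1.0, example_acf, frame_seconds=0.125)
```

In this example a detected tempo interval of 1.0 second against a dominant period of 0.5 second gives a ratio of 2.0, meaning the strongest periodicity runs at twice the detected tempo.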
-
-
6. A sound features extracting apparatus comprising:
-
an audio signal input part which receives an audio signal of sound data including predetermined time frames; a first frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input part, and which outputs a signal for each of the frequency bands; a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said first frequency analyzer, and which sums said rise components to determine said rise component for each time frame; an auto-correlation function calculator which calculates an auto-correlation function of said rise components outputted from said rise component calculator; a peak calculator which calculates a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculator; a tempo interval time candidate calculator which calculates some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculator; a cycle structure calculator which calculates a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculator; a tempo interval time detector which determines a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculator with reference to said signal outputted from said rise component calculator and said signal outputted from said cycle structure calculator; a second frequency analyzer which analyzes said auto-correlation function, and which outputs a signal for each of the frequency bands; a frequency calculator which calculates a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detector; and a value reference part which refers to the frequency output of said second frequency analyzer, and which outputs a value which represents a peak in proximity of the frequency outputted from said frequency calculator, wherein said sound features extracting apparatus calculates said value of a beat intensity which is a primary feature of said audio signal.
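The frequency calculator and value reference part of claim 6 can be sketched as a lookup in the spectrum of the auto-correlation function. The sketch assumes that "a frequency equal to said tempo interval time divided by an integer" means the frequency whose period is the tempo interval divided by that integer, and that "in proximity" means within a small window of bins around the target. The spectrum layout (`spectrum[k]` is the power at `k * bin_hz`), the window size, and the names are hypothetical.

```python
def beat_intensity(spectrum, bin_hz, tempo_interval_seconds,
                   divisor=1, search_bins=1):
    """Largest spectral value near the frequency with period T / divisor."""
    target_hz = divisor / tempo_interval_seconds
    center = min(round(target_hz / bin_hz), len(spectrum) - 1)
    low = max(center - search_bins, 0)
    high = min(center + search_bins, len(spectrum) - 1)
    return max(spectrum[low:high + 1])

# Illustrative spectrum with 0.5 Hz bins: peaks at 2 Hz (bin 4) and 4 Hz (bin 8).
example_spectrum = [10.0, 0.1, 0.2, 0.1, 5.0, 0.3, 0.1, 0.2, 2.0]
on_beat = beat_intensity(example_spectrum, 0.5, tempo_interval_seconds=0.5)
half_beat = beat_intensity(example_spectrum, 0.5, tempo_interval_seconds=0.5,
                           divisor=2)
```

Searching a window rather than a single bin is a design choice made here to tolerate small errors in the detected tempo interval; the claim itself only requires a peak "in proximity" of the calculated frequency.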
-
-
7. A sound features extracting apparatus comprising:
-
an audio signal input part which receives an audio signal of sound data including predetermined time frames; a first frequency analyzer which analyzes a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input part, and which outputs a signal for each of the frequency bands; a rise component calculator which detects a rise component in said signal of each of the frequency bands received from said first frequency analyzer, and which sums said rise components to determine said rise component for each time frame; an auto-correlation function calculator which calculates an auto-correlation function of said rise components outputted from said rise component calculator; a peak calculator which calculates a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculator; a tempo interval time candidate calculator which calculates some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculator; a cycle structure calculator which calculates a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculator; a tempo interval time detector which determines a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculator with reference to said signal outputted from said rise component calculator and said signal outputted from said cycle structure calculator; a second frequency analyzer which analyzes said auto-correlation function, and which outputs a signal for each of the frequency bands; a first frequency calculator which calculates a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detector; a first value reference part which refers to the frequency output of said second frequency analyzer, and which outputs a value which represents a peak in proximity of the frequency output of said first frequency calculator; a second frequency calculator which calculates a frequency equal to ¼ of said tempo interval time from said tempo interval time of said sound data determined by said tempo interval time detector; a second value reference part which refers to the frequency output of said second frequency analyzer and which outputs a value which represents a peak in proximity of said frequency output of said second frequency calculator; and a ratio calculator which calculates a ratio between said value output from said first value reference part and said value output from said second value reference part, wherein said sound features extracting apparatus calculates said ratio of beat intensity which is a primary feature of said audio signal.
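The ratio calculator of claim 7 compares two spectral values: one near the frequency derived from the tempo interval divided by an integer, and one near the frequency derived from ¼ of the tempo interval. A minimal sketch follows, assuming the same hypothetical spectrum layout as a list of powers spaced `bin_hz` apart, a one-bin search window, and the first value as the numerator; none of these details are fixed by the claim.

```python
def peak_near(spectrum, bin_hz, target_hz, search_bins=1):
    """Largest spectral value within search_bins of the target frequency."""
    center = min(round(target_hz / bin_hz), len(spectrum) - 1)
    low = max(center - search_bins, 0)
    high = min(center + search_bins, len(spectrum) - 1)
    return max(spectrum[low:high + 1])

def beat_intensity_ratio(spectrum, bin_hz, tempo_interval_seconds, divisor=1):
    """Value near period T / divisor divided by the value near period T / 4."""
    first = peak_near(spectrum, bin_hz, divisor / tempo_interval_seconds)
    quarter = peak_near(spectrum, bin_hz, 4 / tempo_interval_seconds)
    return first / quarter if quarter > 0 else float("inf")

# Illustrative spectrum with 0.5 Hz bins: peaks at 1 Hz (bin 2) and 4 Hz (bin 8).
example_spectrum = [10.0, 0.1, 6.0, 0.1, 0.2, 0.1, 0.1, 0.2, 3.0, 0.1]
ratio = beat_intensity_ratio(example_spectrum, 0.5, tempo_interval_seconds=1.0)
```

For this example the beat-level value is 6.0 and the quarter-interval value is 3.0, so the ratio is 2.0.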
-
-
8. A method for extracting sound features for extracting non-periodic property of sound emission from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine a rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components; a second frequency analyzing step for analyzing said auto-correlation function calculated by said auto-correlation function calculating step, and outputting a signal for each of the frequency bands; a direct-current component detecting step for detecting a direct-current component in said signal outputted from said second frequency analyzing step; a peak detecting step for detecting a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzing step; and a ratio calculating step for dividing the power of said output of said direct-current component detecting step by the power of said output of said peak detecting step.
-
-
9. A method for extracting sound features for extracting tempo interval time from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said frequency analyzing step, and summing said rise components to determine a rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components obtained from said rise component calculating step; a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; and a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step.
-
-
10. A method for extracting sound features for extracting a ratio of the tempo interval time from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step; a first peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said first peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said first peak calculating step; a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step; a second frequency analyzing step for analyzing said auto-correlation function and outputting a signal for each of the frequency bands; a second peak detecting step for detecting a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzing step; and a ratio calculating step for calculating a ratio between said tempo interval time of said sound data outputted from said tempo interval time detecting step and said values outputted from said second peak detecting step.
-
-
11. A method for extracting sound features for extracting a value of a beat intensity from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step; a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculating step; a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step; a second frequency analyzing step for analyzing said auto-correlation function, and outputting a signal for each of the frequency bands; a frequency calculating step for calculating a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detecting step; and a value referring step for referring to the frequency output of said second frequency analyzing step, and outputting a value which represents a peak in proximity of the frequency outputted from said frequency calculating step.
-
-
12. A method for extracting sound features for extracting a ratio of beat intensity from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step; a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculating step; a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step; a second frequency analyzing step for analyzing said auto-correlation function, and outputting a signal for each of the frequency bands; a first frequency calculating step for calculating a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detecting step; a first value referring step for referring to the frequency output of said second frequency analyzing step, and outputting a value which represents a peak in proximity of the frequency output of said first frequency calculating step; a second frequency calculating step for calculating a frequency equal to ¼ of said tempo interval time from said tempo interval time of said sound data determined by said tempo interval time detecting step; a second value referring step for referring to the frequency output of said second frequency analyzing step and outputting a value which represents a peak in proximity of said frequency output of said second frequency calculating step; and a ratio calculating step for calculating a ratio between said value output from said first value referring step and said value output from said second value referring step.
-
-
13. A computer readable medium including a program for extracting sound features for extracting non-periodic property of sound emission from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine a rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components; a second frequency analyzing step for analyzing said auto-correlation function calculated by said auto-correlation function calculating step, and outputting a signal for each of the frequency bands; a direct-current component detecting step for detecting a direct-current component in said signal outputted from said second frequency analyzing step; a peak detecting step for detecting a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzing step; and a ratio calculating step for dividing the power of said output of said direct-current component detecting step by the power of said output of said peak detecting step.
-
-
14. A computer readable medium including a program for extracting sound features for extracting tempo interval time from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said frequency analyzing step, and summing said rise components to determine a rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components obtained from said rise component calculating step; a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step; and a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step.
-
-
15. A computer readable medium including a program for extracting sound features for extracting a ratio of the tempo interval time from an audio signal of sound data, comprising the following steps of:
-
an input step for inputting said audio signal of said sound data including predetermined time frames; a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from said audio signal input step, and outputting a signal for each of the frequency bands; a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame; an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step; a first peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step; a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said first peak calculating step; a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said first peak calculating step; a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step; a second frequency analyzing step for analyzing said auto-correlation function and outputting a signal for each of the frequency bands; a second peak detecting step for detecting a signal of each of the frequency bands which is maximum in the power from said signal outputted from said second frequency analyzing step; and a ratio calculating step for calculating a ratio between said tempo interval time of said sound data outputted from said tempo interval time detecting step and said values outputted from said second peak detecting step.
-
-
16. A computer readable medium including a program for extracting sound features for extracting a value of a beat intensity from an audio signal of sound data, comprising the following steps of:
- an input step for inputting said audio signal of said sound data including predetermined time frames;
a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input step, and outputting a signal for each of the frequency bands;
a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame;
an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step;
a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step;
a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step;
a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculating step;
a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step;
a second frequency analyzing step for analyzing said auto-correlation function, and outputting a signal for each of the frequency bands;
a frequency calculating step for calculating a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detecting step; and
a value referring step for referring to the frequency output of said second frequency analyzing step, and outputting a value which represents a peak in proximity of the frequency outputted from said frequency calculating step.
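The signal path of claim 16 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the half-wave-rectified difference used as the "rise component", the unnormalized auto-correlation, the plain DFT used for the second frequency analysis, and the one-bin search window are all assumptions made for illustration. The tempo interval time is taken as given (in frames), so the peak, candidate, and cycle structure steps are not shown.

```python
import cmath
import math


def rise_components(band_power):
    """Sum the positive frame-to-frame increases over all frequency bands.

    band_power[t][b] is the power of band b in time frame t (assumed layout).
    """
    rises = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(band_power, band_power[1:]):
        rises.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return rises


def autocorrelation(x):
    """Unnormalized auto-correlation of x for lags 0 .. len(x) - 1."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(n)]


def power_spectrum(x):
    """Power of a plain DFT of x for bins 0 .. len(x) // 2."""
    n = len(x)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))) ** 2
        for k in range(n // 2 + 1)
    ]


def beat_intensity(band_power, tempo_interval, divisor=1, search=1):
    """Peak power near the frequency whose period is tempo_interval / divisor frames."""
    acf = autocorrelation(rise_components(band_power))
    spectrum = power_spectrum(acf)
    n = len(acf)
    # Bin of the frequency divisor / tempo_interval cycles per frame, kept in range.
    centre = min(max(round(n * divisor / tempo_interval), 1), len(spectrum) - 1)
    lo = max(centre - search, 1)  # skip bin 0, the direct-current component
    hi = min(centre + search, len(spectrum) - 1)
    return max(spectrum[lo:hi + 1])
```

With a test signal that pulses every 8 frames, `beat_intensity(band_power, 8)` is much larger than the value obtained for an unrelated interval such as 5, which is the behaviour the value referring step relies on.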
17. A computer readable medium including a program for extracting sound features for extracting a ratio of beat intensity from an audio signal of sound data, comprising the following steps of:
- an input step for inputting said audio signal of said sound data including predetermined time frames;
a first frequency analyzing step for analyzing a plurality of frequency bands of each of the predetermined time frames of said audio signal received from the audio signal input step, and outputting a signal for each of the frequency bands;
a rise component calculating step for detecting a rise component in said signal of each of the frequency bands received from said first frequency analyzing step, and summing said rise components to determine said rise component for each time frame;
an auto-correlation function calculating step for calculating an auto-correlation function of said rise components outputted from said rise component calculating step;
a peak calculating step for calculating a position and an amplitude of each peak in said signal outputted from said auto-correlation function calculating step;
a tempo interval time candidate calculating step for calculating some candidates for a tempo interval time of said sound data from said peaks of said auto-correlation function calculated by said peak calculating step;
a cycle structure calculating step for calculating a cycle structure of said sound data from said peaks of the auto-correlation function calculated by said peak calculating step;
a tempo interval time detecting step for determining a value of a most likely tempo interval time of said sound data from said candidates calculated by said tempo interval time candidate calculating step with reference to said signal outputted from said rise component calculating step and said signal outputted from said cycle structure calculating step;
a second frequency analyzing step for analyzing said auto-correlation function, and outputting a signal for each of the frequency bands;
a first frequency calculating step for calculating a frequency equal to said tempo interval time divided by an integer from said tempo interval time of said sound data outputted from said tempo interval time detecting step;
a first value referring step for referring to the frequency output of said second frequency analyzing step, and outputting a value which represents a peak in proximity of the frequency output of said first frequency calculating step;
a second frequency calculating step for calculating a frequency equal to ¼ of said tempo interval time from said tempo interval time of said sound data determined by said tempo interval time detecting step;
a second value referring step for referring to the frequency output of said second frequency analyzing step, and outputting a value which represents a peak in proximity of said frequency output of said second frequency calculating step; and
a ratio calculating step for calculating a ratio between said value output from said first value referring step and said value output from said second value referring step.
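The two value referring steps and the ratio calculating step of claim 17 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the helper names `power_spectrum`, `peak_near`, and `beat_intensity_ratio` are hypothetical, the second frequency analysis is taken to be a plain DFT of the auto-correlation function, the tempo interval time is given in lags, and the ratio is taken as first value over second value because the claim does not fix the direction.

```python
import cmath
import math


def power_spectrum(x):
    """Power of a plain DFT of x for bins 0 .. len(x) // 2."""
    n = len(x)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))) ** 2
        for k in range(n // 2 + 1)
    ]


def peak_near(spectrum, n, period, search=1):
    """Largest power within `search` bins of the frequency with the given period."""
    centre = min(max(round(n / period), 1), len(spectrum) - 1)
    lo = max(centre - search, 1)  # skip bin 0, the direct-current component
    hi = min(centre + search, len(spectrum) - 1)
    return max(spectrum[lo:hi + 1])


def beat_intensity_ratio(acf, tempo_interval, divisor=1):
    """Ratio of the peak at period tempo_interval / divisor to the peak at tempo_interval / 4."""
    spectrum = power_spectrum(acf)
    n = len(acf)
    first = peak_near(spectrum, n, tempo_interval / divisor)
    second = peak_near(spectrum, n, tempo_interval / 4)
    if second == 0.0:
        return math.inf  # assumption: no energy at the quarter interval
    return first / second
```

A large ratio indicates that the energy at the tempo interval itself dominates the energy at the quarter interval, and a ratio near or below 1 indicates the opposite.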
Specification