Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information
Abstract
A system that performs analysis and comparison of audio data files based upon the content of the data files is presented. The analysis of the audio data produces a set of numeric values (a feature vector) that can be used to classify and rank the similarity between individual audio files typically stored in a multimedia database or on the World Wide Web. The analysis also facilitates the description of user-defined classes of audio files, based on an analysis of a set of audio files that are members of a user-defined class. The system can find sounds within a longer sound, allowing an audio recording to be automatically segmented into a series of shorter audio segments.
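The feature vector described in the abstract can be illustrated with a minimal sketch. It assumes NumPy and uses common textbook definitions that are not taken from this document: loudness as frame RMS in dB, brightness as the spectral centroid, and bandwidth as the magnitude-weighted spread around that centroid. Pitch and MFCCs are omitted for brevity. The function names `frame_features` and `feature_vector`, the frame length, and the hop size are all hypothetical choices.

```python
import numpy as np


def frame_features(signal, sample_rate, frame_len=1024, hop=512):
    """Return an (n_frames, 3) array: loudness (dB), brightness (Hz), bandwidth (Hz).

    The definitions here are illustrative assumptions, not the patent's own.
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Loudness: root-mean-square level of the frame, in decibels.
        rms = np.sqrt(np.mean(frame ** 2))
        loudness_db = 20.0 * np.log10(rms + 1e-12)
        # Brightness: centroid of the magnitude spectrum.
        mag = np.abs(np.fft.rfft(frame * window))
        total = mag.sum() + 1e-12
        centroid = (freqs * mag).sum() / total
        # Bandwidth: magnitude-weighted spread of frequencies around the centroid.
        bandwidth = np.sqrt((((freqs - centroid) ** 2) * mag).sum() / total)
        rows.append((loudness_db, centroid, bandwidth))
    return np.array(rows)


def feature_vector(signal, sample_rate):
    """Summarize each per-frame feature by its mean, standard deviation,
    and the mean of its first derivative (frame-to-frame difference)."""
    feats = frame_features(signal, sample_rate)
    deriv = np.diff(feats, axis=0)
    return np.concatenate([feats.mean(axis=0), feats.std(axis=0), deriv.mean(axis=0)])
```

Under these assumptions each sound is reduced to nine numbers (three features times three statistics), which is the kind of compact record that could be stored in a feature file alongside a link to the original sound file.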
1318 Citations
20 Claims
1. A method for analyzing sound files for one or more of storing, comparing, and retrieving audio data, comprising the steps of:
(a) measuring a plurality of acoustical features of a sound file chosen from the group consisting of at least one of loudness, pitch, brightness, bandwidth, and MFCC coefficients thereof;
(b) computing measurements chosen from the group consisting of mean, standard deviation, autocorrelation and first derivative thereof, of the acoustical features of the audio files, forming a vector of the feature and measurements data, and storing the computed measurements and vector in a feature file with a linkage to the sound file; and
(c) grouping the feature files based on at least one of (i) similar measurements for the feature files, and (ii) distance between the vector for a sound file and a vector for a reference, to facilitate rapid classification, storage, and/or retrieval of sound files based on a predefined search criteria.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer program embodied on a computer-readable medium for analyzing audio files, comprising:
(a) a code segment for measuring a plurality of acoustical features of a sound file chosen from the group consisting of at least one of loudness, pitch, brightness, bandwidth, and MFCC coefficients thereof;
(b) a code segment for computing measurements chosen from the group consisting of mean, standard deviation, autocorrelation and first derivative thereof, of the acoustical features of the audio files, forming a vector of the feature and measurements data, and storing the computed measurements and vector in a feature file with a linkage to the audio file; and
(c) a code segment for grouping the feature files based on at least one of (i) similar measurements for the feature files, and (ii) distance between the vector for a sound file and a vector for a reference, to facilitate rapid classification, storage, and/or retrieval of sound files based on a predefined search criteria.
Dependent claims: 9, 10, 11
12. A content based comparison method for segmenting sound files, comprising the steps of:
(a) measuring a plurality of acoustical features of a sound file chosen from the group consisting of at least one of loudness, pitch, brightness, bandwidth, and MFCC coefficients thereof;
(b) computing measurements chosen from the group consisting of mean, standard deviation, autocorrelation and first derivative thereof, of the acoustical features of the audio files, forming a vector of the feature and measurements data, and storing the computed measurements and vector in a feature file with a linkage to the audio file; and
(c) segmenting the feature files based on at least one of (i) similar measurements for the feature files, and (ii) distance between the vector for a sound file and a vector for a reference, to facilitate rapid classification, storage, and/or retrieval of sound files based on a predefined search criteria.
Dependent claims: 13, 14, 15
16. A computer program embodied on a computer-readable medium for segmenting sound files by using content-based statistical comparison of a plurality of acoustical features of a sound file chosen from the group consisting of at least one of loudness, pitch, brightness, bandwidth, and MFCC coefficients thereof to determine the segment boundaries.
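The distance-based grouping and segmentation steps recited in the claims can be sketched as follows. This is only an illustration under stated assumptions: it uses plain Euclidean distance, whereas the claims do not specify a metric, and the names `rank_by_distance` and `segment_boundaries`, along with the fixed `threshold` parameter, are hypothetical. The inputs are feature vectors that have already been computed, one per sound file or one per analysis window of a longer recording.

```python
import numpy as np


def rank_by_distance(vectors, reference):
    """Return (order, dists): indices sorted nearest-first, and each vector's
    Euclidean distance to the reference vector (an assumed metric)."""
    dists = np.linalg.norm(np.asarray(vectors) - np.asarray(reference), axis=1)
    return np.argsort(dists), dists


def segment_boundaries(window_vectors, reference, threshold):
    """Return the window indices at which the recording switches between
    matching and not matching the reference (distance <= threshold)."""
    dists = np.linalg.norm(np.asarray(window_vectors) - np.asarray(reference), axis=1)
    matches = dists <= threshold
    # A boundary falls wherever consecutive windows disagree about matching.
    return (np.flatnonzero(matches[1:] != matches[:-1]) + 1).tolist()
```

For example, five windows whose vectors are near, near, far, far, and near the reference would yield boundaries at windows 2 and 4, splitting the recording into three segments. A practical system would likely normalize each feature before computing distances so that features with large numeric ranges do not dominate.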
Specification