Device, method, and medium for generating audio fingerprint and retrieving audio data
Abstract
Provided are a device, a method, and a medium for generating an audio fingerprint and retrieving audio data. The device for generating an audio fingerprint includes: a coefficient extracting section that partially decodes audio data in the compressed domain and extracts MDCT (Modified Discrete Cosine Transform) coefficients; a coefficient selecting section that selects, from the extracted MDCT coefficients, an MDCT coefficient robust to noise; a modulation spectrum generating section that applies a Fourier transform to the selected MDCT coefficient to generate a modulation spectrum; and a bit conversion section that quantizes the generated modulation spectrum to generate an audio fingerprint. As a result, audio data recorded in a variety of environments can be retrieved accurately and rapidly. Because the elements are based on MP3, the approach can be applied to MP3 applications in various ways. It can also be applied to the classification of audio data, such as classifying music by mood or genre, and to other fields, such as extracting a specific event from moving images of sports.
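The pipeline in the abstract (select MDCT coefficients, Fourier-transform them over time, quantize the result to bits) can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not taken from the abstract: the input is a list of frames, each a list of already-extracted MDCT coefficients per subband; `select_subbands` uses a hypothetical "largest mean magnitude" rule as a stand-in for "robust to noise"; and the quantizer emits a 1 bit whenever the magnitude rises from one modulation bin to the next.

```python
import cmath
import math


def modulation_spectrum(series):
    """Magnitude of the DFT of one subband's MDCT coefficients over time."""
    n = len(series)
    half = n // 2 + 1  # real-valued input: keep only the non-redundant bins
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series)))
        for k in range(half)
    ]


def select_subbands(frames, count):
    """Hypothetical selection rule: keep the `count` subbands with the largest mean |coefficient|."""
    n_bands = len(frames[0])
    energy = [sum(abs(f[b]) for f in frames) / len(frames) for b in range(n_bands)]
    strongest = sorted(range(n_bands), key=lambda b: -energy[b])[:count]
    return sorted(strongest)  # restore subband order


def fingerprint_bits(frames, count=2):
    """Quantize each selected subband's modulation spectrum to bits.

    Assumed rule: a bit is 1 when the magnitude rises from one modulation bin to the next.
    """
    bits = []
    for b in select_subbands(frames, count):
        spec = modulation_spectrum([f[b] for f in frames])
        bits.extend(1 if spec[k + 1] > spec[k] else 0 for k in range(len(spec) - 1))
    return bits
```

With 8 frames and 2 selected subbands, `fingerprint_bits` returns 8 bits (4 per subband). A real implementation would obtain the MDCT coefficients by partially decoding the compressed stream, which this sketch does not attempt.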
33 Claims
1. A method of providing a mobile device with information associated with audio data, the method comprising:
obtaining, performed by at least one processor, a segment of the audio data;
generating, performed by the at least one processor, a modulation spectrum by performing Fourier transform on Modified Discrete Cosine Transform (MDCT) coefficients of the obtained segment;
generating, performed by the at least one processor, audio fingerprint data from the generated modulation spectrum;
identifying, performed by the at least one processor, the audio data by comparing the generated audio fingerprint data with a plurality of audio fingerprint data stored in a database;
retrieving, performed by the at least one processor, information corresponding to the identified audio data from a database; and
providing, performed by the at least one processor, the retrieved information to the mobile device user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

15. A method of providing a mobile device user with information associated with audio data, the method comprising:
obtaining, performed by the at least one processor, a segment of audio data;
generating, performed by the at least one processor, a modulation spectrum by performing Fourier transform on Modified Discrete Cosine Transform (MDCT) coefficients of the obtained segment;
generating, performed by the at least one processor, audio fingerprint data from the obtained modulation spectrum;
generating, performed by the at least one processor, a hashing value corresponding to the generated audio fingerprint data;
identifying, performed by the at least one processor, the audio data by matching a hashing value from a hashing table storing hashing values corresponding to a plurality of audio fingerprint data with the generated hashing value, based on an adjustable threshold value;
retrieving, performed by the at least one processor, information corresponding to the identified audio data from a database; and
providing, performed by the at least one processor, the retrieved information to the mobile device user.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27

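The hashing-table lookup recited in claim 15 might look roughly like the following Python sketch. The claim does not say how hashing values are derived or what the threshold measures, so both are assumptions here: `hash_values` packs consecutive 8-bit windows of a fingerprint into integers, and `threshold` is the minimum number of matching hashing values a stored track needs before it is reported. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def hash_values(bits, width=8):
    """Pack consecutive `width`-bit windows of a fingerprint into integers (assumed scheme)."""
    return [
        int("".join(map(str, bits[i:i + width])), 2)
        for i in range(0, len(bits) - width + 1, width)
    ]


def build_table(fingerprints):
    """Map each hashing value to the set of track IDs whose fingerprint contains it."""
    table = defaultdict(set)
    for track_id, bits in fingerprints.items():
        for h in hash_values(bits):
            table[h].add(track_id)
    return table


def identify(table, query_bits, threshold):
    """Return track IDs with at least `threshold` matching hashing values, best match first."""
    votes = Counter()
    for h in hash_values(query_bits):
        for track_id in table.get(h, ()):
            votes[track_id] += 1
    return [track_id for track_id, hits in votes.most_common() if hits >= threshold]
```

Lowering `threshold` admits weaker candidates and raising it demands more agreement, which is one plausible reading of an "adjustable threshold value".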
28. A method of providing a mobile device user with information associated with audio data, the method comprising:
obtaining, performed by the at least one processor, a segment of audio data;
generating, performed by the at least one processor, audio fingerprint data from the obtained segment;
generating, performed by the at least one processor, a hashing value corresponding to the generated audio fingerprint data;
identifying, performed by the at least one processor, the audio data by matching a hashing value from a hashing table storing hashing values corresponding to a plurality of audio fingerprint data with the generated hashing value, based on an adjustable threshold value;
retrieving, performed by the at least one processor, information corresponding to the identified audio data; and
providing, performed by the at least one processor, the retrieved information to the mobile device user,
wherein the retrieved information corresponding to the identified audio data comprises at least one of event information and classification information, and
wherein the adjustable threshold value is adjusted until a predetermined number of the hashing values from the hashing table storing the hashing values corresponding to the plurality of audio fingerprint data match with the generated hashing value.

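The final wherein clause of claim 28 describes adjusting the threshold until a predetermined number of stored hashing values match. One way to read that is sketched below in Python. The claim does not define what the threshold measures; this sketch assumes it is the maximum allowed Hamming distance between two hashing values, relaxed one bit at a time. The names `match_with_adjusted_threshold`, `wanted`, and `max_threshold` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer hashing values."""
    return bin(a ^ b).count("1")


def match_with_adjusted_threshold(stored_hashes, query_hash, wanted, max_threshold=8):
    """Relax the allowed bit difference until at least `wanted` stored hashes match.

    Returns the threshold that was reached and the hashes matching at that threshold.
    If `wanted` is never reached, the result for `max_threshold` is returned.
    """
    matches = []
    for threshold in range(max_threshold + 1):
        matches = [h for h in stored_hashes if hamming(h, query_hash) <= threshold]
        if len(matches) >= wanted:
            return threshold, matches
    return max_threshold, matches
```

For example, with stored values `0b10101010` and `0b10101011` and a query of `0b10101010`, asking for two matches stops at a threshold of 1, because the second stored value differs in exactly one bit.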
29. A method of providing a mobile device user with information associated with audio data, the method comprising:
obtaining, performed by the at least one processor, a segment of audio data;
generating, performed by the at least one processor, a modulation spectrum by performing Fourier transform on Modified Discrete Cosine Transform (MDCT) coefficients of the obtained segment;
generating, performed by the at least one processor, audio fingerprint data from the obtained modulation spectrum;
generating, performed by the at least one processor, a hashing value corresponding to the generated audio fingerprint data;
identifying, performed by the at least one processor, audio data by matching a hashing value from a hashing table storing hashing values corresponding to a plurality of audio fingerprint data with the generated hashing value;
filtering, performed by the at least one processor, the identified audio data;
retrieving, performed by the at least one processor, information corresponding to the filtered audio data; and
providing, performed by the at least one processor, the retrieved information to the mobile device user,
wherein the retrieved information corresponding to the identified audio data comprises at least one of event information and classification information, and
wherein the identifying and the filtering are performed based on threshold values.
Dependent claims: 30

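Claim 29 separates an identifying step from a filtering step, each governed by a threshold. The Python sketch below shows one possible two-stage arrangement and is not taken from the claim text: stage one keeps tracks that share at least `min_hits` hashing values with the query, and stage two keeps only candidates whose full fingerprint differs from the query by at most `max_ber` (bit error rate). The database layout and all names are assumptions for illustration.

```python
def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify_and_filter(database, query_hashes, query_bits, min_hits, max_ber):
    """Two-stage lookup over `database`, a dict of track ID -> (set of hashes, fingerprint bits)."""
    query_set = set(query_hashes)
    # Stage 1 (identifying): keep tracks sharing at least `min_hits` hashing values.
    candidates = [
        track for track, (hashes, _) in database.items()
        if len(hashes & query_set) >= min_hits
    ]
    # Stage 2 (filtering): keep candidates whose full fingerprint is close enough.
    return [
        track for track in candidates
        if bit_error_rate(database[track][1], query_bits) <= max_ber
    ]
```

The cheap hash comparison narrows the search before the more expensive bit-by-bit comparison runs, which is a common motivation for this kind of two-stage design.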
31. A method of providing a mobile device user with information associated with audio data, the method comprising:
obtaining, performed by the at least one processor, a segment of audio data;
generating, performed by the at least one processor, a modulation spectrum by performing Fourier transform on Modified Discrete Cosine Transform (MDCT) coefficients of the obtained segment;
generating, performed by the at least one processor, audio fingerprint data from the obtained modulation spectrum;
generating, performed by the at least one processor, a hashing value corresponding to the generated audio fingerprint data;
identifying, performed by the at least one processor, audio data by matching a hashing value from a hashing table storing hashing values corresponding to a plurality of audio fingerprint data with the generated hashing value;
filtering, performed by the at least one processor, the identified audio data;
retrieving, performed by the at least one processor, information corresponding to the filtered audio data; and
providing, performed by the at least one processor, the retrieved information to the mobile device user,
wherein the retrieved information corresponding to the identified audio data comprises product information including at least one of audio data, image data, and text data corresponding to a specific event of a broadcast moving image.
Dependent claims: 32

33. A method of providing a mobile device with information associated with audio data, the method comprising:
obtaining, performed by at least one processor, a segment of the audio data;
generating, performed by the at least one processor, a modulation spectrum by performing Fourier transform on Modified Discrete Cosine Transform (MDCT) coefficients of the obtained segment;
generating, performed by the at least one processor, audio fingerprint data from the obtained modulation spectrum;
identifying, performed by the at least one processor, the audio data by comparing the generated audio fingerprint data with a plurality of audio fingerprint data stored in a database;
retrieving, performed by the at least one processor, information corresponding to the identified audio data from a database; and
providing, performed by the at least one processor, the retrieved information to the mobile device user,
wherein the retrieved information corresponding to the identified audio data comprises product information including at least two of audio data, image data, and text data corresponding to a specific event of a broadcast moving image.

Specification