Audio fingerprinting based on audio energy characteristics
First Claim
1. A method of audio fingerprinting comprising:
- obtaining audio samples of a piece of audio, each of the audio samples corresponding to a specific time;
- generating frequency representations of the audio samples, the frequency representations being divided in frequency bands;
- identifying energy regions in the frequency bands, each of the energy regions being one of an increasing energy region and a decreasing energy region, an increasing energy region defined as a time region within one of the frequency bands during which audio energy increases from a start time to an end time of the time region and a decreasing energy region defined as a time region within one of the frequency bands during which audio energy decreases from a start time to an end time of the time region, wherein the identifying the energy regions includes ignoring a time region within the one of the frequency bands during which audio energy fluctuates such that net energy change during the time region is zero from a start time to an end time of the time region;
- analyzing portions of the identified energy regions appearing within time windows to generate hashes of features of the piece of audio, each hash of features corresponding to portions of the identified energy regions appearing in a respective time window, each feature defined as a numeric value that encodes information representing:
  - a frequency band of an energy region appearing in the respective time window,
  - whether the energy region appearing in the respective time window is an increasing energy region or whether the energy region appearing in the respective time window is a decreasing energy region, and
  - a placement of the energy region appearing in the respective time window, the placement of the energy region appearing in the respective time window corresponding to one of:
    - whether the energy region appearing in the respective time window starts before and ends after the respective time window,
    - whether the energy region appearing in the respective time window starts before and ends within the respective time window,
    - whether the energy region appearing in the respective time window starts within and ends after the respective time window, and
    - whether the energy region appearing in the respective time window starts within and ends within the respective time window; and
- storing each hash of features together with the specific time.
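The identifying step of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the chunked comparison, the `step` and `tol` parameters, and the names `EnergyRegion` and `find_energy_regions` are assumptions introduced only for illustration. The sketch walks one band's energy track in fixed-size chunks, merges adjacent chunks whose net change has the same sign, and ignores chunks whose net change is zero even if the energy fluctuates inside them.

```python
from typing import List, NamedTuple


class EnergyRegion(NamedTuple):
    band: int         # index of the frequency band
    start: int        # frame index where the region starts
    end: int          # frame index where the region ends
    increasing: bool  # True if energy rises from start to end


def find_energy_regions(band: int, energy: List[float],
                        step: int = 4, tol: float = 1e-9) -> List[EnergyRegion]:
    """Split one band's energy track into increasing/decreasing regions.

    Assumption: the track is examined in chunks of `step` frames; a chunk
    whose net change is within `tol` of zero is ignored.
    """
    regions: List[EnergyRegion] = []
    open_region = None  # [start, end, increasing] of the region being grown
    last = len(energy) - 1
    for s in range(0, last, step):
        e = min(s + step, last)
        net = energy[e] - energy[s]
        increasing = net > 0
        if abs(net) <= tol:
            # Net-zero chunk: close any region in progress and skip it.
            if open_region:
                regions.append(EnergyRegion(band, *open_region))
            open_region = None
        elif open_region and open_region[2] == increasing:
            open_region[1] = e  # same direction: extend the open region
        else:
            if open_region:
                regions.append(EnergyRegion(band, *open_region))
            open_region = [s, e, increasing]
    if open_region:
        regions.append(EnergyRegion(band, *open_region))
    return regions
```

In this sketch a track that rises, wobbles back to the same level, and then falls yields two regions, with the wobbling stretch left out.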
Abstract
Audio fingerprinting includes obtaining audio samples of a piece of audio, generating frequency representations of the audio samples, identifying increasing and decreasing energy regions in frequency bands of the frequency representations, and generating hashes of features of the piece of audio. Each hash of features corresponds to portions of the identified energy regions appearing in a respective time window. Each feature is defined as a numeric value that encodes information representing: a frequency band of an energy region appearing in the respective time window, whether the energy region appearing in the respective time window is an increasing energy region or whether the energy region appearing in the respective time window is a decreasing energy region, and a placement of the energy region appearing in the respective time window.
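As a rough illustration of the feature encoding described in the abstract, the Python sketch below packs the three pieces of information (frequency band, direction, placement) into one integer and hashes the features of a window. The bit layout, the placement codes, the overlap test, and the use of SHA-1 over the sorted feature list are all assumptions made for this example; the source does not specify them.

```python
import hashlib
from typing import Iterable, Tuple

# A region is (band, start, end, increasing); times are frame indices.
Region = Tuple[int, int, int, bool]

# Hypothetical codes for the four placements named in the abstract and claims.
SPANS_WINDOW, ENDS_WITHIN, STARTS_WITHIN, INSIDE_WINDOW = 0, 1, 2, 3


def placement(start: int, end: int, win_start: int, win_end: int) -> int:
    """Classify how a region sits relative to a time window."""
    starts_before = start < win_start
    ends_after = end > win_end
    if starts_before and ends_after:
        return SPANS_WINDOW
    if starts_before:
        return ENDS_WITHIN
    if ends_after:
        return STARTS_WITHIN
    return INSIDE_WINDOW


def encode_feature(band: int, increasing: bool, place: int) -> int:
    """Pack band, direction and placement into one numeric value.

    Assumed layout: band in the high bits, 1 bit direction, 2 bits placement.
    """
    return (band << 3) | (int(increasing) << 2) | place


def hash_window(regions: Iterable[Region], win_start: int, win_end: int) -> str:
    """Hash the features of all regions that overlap the window."""
    features = sorted(
        encode_feature(band, inc, placement(s, e, win_start, win_end))
        for band, s, e, inc in regions
        if s <= win_end and e >= win_start  # keep only overlapping regions
    )
    payload = ",".join(map(str, features)).encode("ascii")
    return hashlib.sha1(payload).hexdigest()  # SHA-1 is a stand-in hash
```

Sorting the features before hashing makes the result independent of the order in which regions are listed, which is a design choice of this sketch rather than something stated in the source.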
48 Citations
20 Claims
1. A method of audio fingerprinting comprising:
- obtaining audio samples of a piece of audio, each of the audio samples corresponding to a specific time;
- generating frequency representations of the audio samples, the frequency representations being divided in frequency bands;
- identifying energy regions in the frequency bands, each of the energy regions being one of an increasing energy region and a decreasing energy region, an increasing energy region defined as a time region within one of the frequency bands during which audio energy increases from a start time to an end time of the time region and a decreasing energy region defined as a time region within one of the frequency bands during which audio energy decreases from a start time to an end time of the time region, wherein the identifying the energy regions includes ignoring a time region within the one of the frequency bands during which audio energy fluctuates such that net energy change during the time region is zero from a start time to an end time of the time region;
- analyzing portions of the identified energy regions appearing within time windows to generate hashes of features of the piece of audio, each hash of features corresponding to portions of the identified energy regions appearing in a respective time window, each feature defined as a numeric value that encodes information representing:
  - a frequency band of an energy region appearing in the respective time window,
  - whether the energy region appearing in the respective time window is an increasing energy region or whether the energy region appearing in the respective time window is a decreasing energy region, and
  - a placement of the energy region appearing in the respective time window, the placement of the energy region appearing in the respective time window corresponding to one of:
    - whether the energy region appearing in the respective time window starts before and ends after the respective time window,
    - whether the energy region appearing in the respective time window starts before and ends within the respective time window,
    - whether the energy region appearing in the respective time window starts within and ends after the respective time window, and
    - whether the energy region appearing in the respective time window starts within and ends within the respective time window; and
- storing each hash of features together with the specific time.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A system for audio fingerprinting comprising:
- a sampler configured to obtain audio samples of a piece of audio, each of the audio samples corresponding to a specific time;
- a transformer configured to transform the audio samples into frequency representations of the audio samples, the frequency representations being divided in frequency bands;
- an energy streamer configured to identify energy regions in the frequency bands, each of the energy regions being one of an increasing energy region and a decreasing energy region, an increasing energy region defined as a time region within a frequency band, of the frequency bands, during which audio energy increases from a start time to an end time of the time region and a decreasing energy region defined as a time region within a frequency band, of the frequency bands, during which audio energy decreases from a start time to an end time of the time region, wherein the energy streamer is configured to ignore a time region within the one of the frequency bands during which audio energy fluctuates such that net energy change during the time region is zero from a start time to an end time of the time region;
- an energy hasher configured to analyze portions of the identified energy regions appearing within time windows to generate hashes of features of the piece of audio, each hash of features corresponding to portions of the identified energy regions appearing in a respective time window, each feature defined as a numeric value that encodes information representing:
  - a frequency band of an energy region appearing in the respective time window,
  - whether the energy region appearing in the respective time window is an increasing energy region or whether the energy region appearing in the respective time window is a decreasing energy region, and
  - a placement of the energy region appearing in the respective time window, the placement of the energy region appearing in the respective time window corresponding to one of:
    - whether the energy region appearing in the respective time window starts before and ends after the respective time window,
    - whether the energy region appearing in the respective time window starts before and ends within the respective time window,
    - whether the energy region appearing in the respective time window starts within and ends after the respective time window, and
    - whether the energy region appearing in the respective time window starts within and ends within the respective time window; and
- a non-transitory storage medium configured to store each hash of features together with the specific time.
(Dependent claims: 9, 10, 11, 12, 13, 14)
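The role of the transformer in the system claim can be sketched as follows. This is only an illustrative Python example under stated assumptions: it uses a naive discrete Fourier transform from the standard library, equal-width bands, and a hypothetical function name `band_energies`. A practical system would more likely use an FFT library and possibly non-uniform bands; the claim fixes neither choice.

```python
import cmath
import math
from typing import List


def band_energies(frame: List[float], num_bands: int) -> List[float]:
    """Return the energy in each of `num_bands` equal-width frequency bands.

    Assumptions: `frame` is one block of time-domain samples, and the number
    of non-negative frequency bins (len(frame) // 2) is a multiple of
    `num_bands`; leftover bins would be dropped otherwise.
    """
    n = len(frame)
    half = n // 2
    # Naive DFT: power (squared magnitude) of each non-negative frequency bin.
    power = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    # Group adjacent bins into bands and sum the power in each band.
    width = half // num_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(num_bands)]
```

Calling this function on successive frames would give, for each band, the energy track over time in which an energy streamer could then look for increasing and decreasing regions.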
15. A device for audio fingerprinting comprising:
- a processor; and
- a non-transitory computer-readable medium,
- the processor configured to receive audio samples of a piece of audio, each of the audio samples corresponding to a specific time, process the audio samples, and compare the processed audio samples to processed audio samples stored in the non-transitory computer-readable medium to at least one of identify or synchronize the piece of audio, wherein the processor is configured to process the audio samples by:
  - transforming the audio samples into frequency representations of the audio samples, the frequency representations being divided in frequency bands;
  - identifying energy regions within the frequency bands, each of the energy regions being one of an increasing energy region and a decreasing energy region, an increasing energy region defined as a time region within one of the frequency bands during which audio energy increases from a start time to an end time of the time region and a decreasing energy region defined as a time region within one of the frequency bands during which audio energy decreases from a start time to an end time of the time region, wherein the identifying the energy regions includes ignoring a time region within the one of the frequency bands during which audio energy on average does not increase or decrease from a start time to an end time of the time region;
  - analyzing portions of the identified energy regions appearing within time windows to generate hashes of features of the piece of audio, each hash of features corresponding to portions of the identified energy regions appearing in a respective time window, each feature defined as a numeric value that encodes information representing:
    - a frequency band of an energy region appearing in the respective time window,
    - whether the energy region appearing in the respective time window is an increasing energy region or whether the energy region appearing in the respective time window is a decreasing energy region, and
    - a placement of the energy region appearing in the respective time window, the placement of the energy region appearing in the respective time window corresponding to one of:
      - whether the energy region appearing in the respective time window starts before and ends after the respective time window,
      - whether the energy region appearing in the respective time window starts before and ends within the respective time window,
      - whether the energy region appearing in the respective time window starts within and ends after the respective time window, and
      - whether the energy region appearing in the respective time window starts within and ends within the respective time window.
(Dependent claims: 16, 17, 18, 19, 20)
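The comparison step of the device claim (identifying or synchronizing a piece of audio against stored fingerprints) might be sketched in Python as below. The index structure, the exact-match lookup, and the offset-voting scheme are assumptions chosen because they are a common way to use time-stamped hashes; the claim itself does not prescribe them, and the names `build_index` and `match` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Maps a hash to every (audio_id, time) at which it was stored.
Index = Dict[str, List[Tuple[str, int]]]


def build_index(reference: Iterable[Tuple[str, str, int]]) -> Index:
    """Build a lookup table from (audio_id, hash, time) reference records."""
    index: Index = defaultdict(list)
    for audio_id, h, t in reference:
        index[h].append((audio_id, t))
    return index


def match(index: Index,
          query: Iterable[Tuple[str, int]]) -> Optional[Tuple[str, int, int]]:
    """Return (audio_id, time_offset, votes) for the best match, or None.

    Each query (hash, time) pair votes for the reference audio and the
    offset between reference time and query time. A consistent offset
    both identifies the audio and says where the query sits within it.
    """
    votes: Counter = Counter()
    for h, query_time in query:
        for audio_id, ref_time in index.get(h, ()):
            votes[(audio_id, ref_time - query_time)] += 1
    if not votes:
        return None
    (audio_id, offset), count = votes.most_common(1)[0]
    return audio_id, offset, count
```

The returned offset is what a synchronization use case would consume: it estimates how far into the reference audio the query begins.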
Specification