Pitch shift resistant audio matching
First Claim
1. A system comprising:
a memory that has stored thereon computer executable components; and
a processor that executes the following computer executable components stored in the memory:
an input component that receives a video sample;
a fingerprint component that generates a melody fingerprint and an audio-id fingerprint based on an audio track of the video sample;
a melody matching component that identifies a set of potential audio matches for the audio track based on comparing the melody fingerprint to reference melody fingerprints for the potential audio matches of the set;
an audio-id matching component that identifies reference audio-id fingerprints respectively associated with the potential audio matches of the set;
a pitch shift evaluation component that determines an estimated amount of pitch shift between the audio track and the reference audio-id fingerprints; and
a pitch variation component that generates sets of pitch modified fingerprints for each of the reference audio-id fingerprints based on the estimated amount of pitch shift,
wherein the audio-id matching component identifies a subset of the set of the potential audio matches based on comparing the audio-id fingerprint to the reference audio-id fingerprints and the sets of the pitch modified fingerprints.
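As a rough illustration of the pitch variation component, the sketch below generates a small set of pitch modified fingerprints around an estimated shift. The claim does not specify a fingerprint layout, so everything here is an assumption made for illustration: a fingerprint is modeled as a set of (time frame, log-frequency bin) peaks, a pitch shift is modeled as a constant bin offset, and the names shift_fingerprint, pitch_modified_set, similarity, and BINS_PER_SEMITONE are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical layout: a fingerprint is a set of (time frame, log-frequency bin) peaks.
Peak = Tuple[int, int]
Fingerprint = FrozenSet[Peak]

BINS_PER_SEMITONE = 4  # assumed resolution of the log-frequency axis


def shift_fingerprint(fp: Fingerprint, semitones: float) -> Fingerprint:
    """Move every peak by a constant number of log-frequency bins."""
    offset = round(semitones * BINS_PER_SEMITONE)
    return frozenset((t, b + offset) for t, b in fp)


def pitch_modified_set(
    fp: Fingerprint,
    estimated_shift: float,
    spread: float = 0.5,
    step: float = 0.25,
) -> Dict[float, Fingerprint]:
    """Build variants of a reference fingerprint at shifts near the estimate."""
    n = int(round(spread / step))
    shifts = [estimated_shift + k * step for k in range(-n, n + 1)]
    return {s: shift_fingerprint(fp, s) for s in shifts}


def similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Jaccard overlap between two peak sets (an assumed comparison metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Usage: a probe that is the reference raised by one semitone.
reference = frozenset({(0, 40), (1, 44), (2, 47)})
probe = shift_fingerprint(reference, 1.0)
variants = pitch_modified_set(reference, estimated_shift=1.0)
best_shift = max(variants, key=lambda s: similarity(probe, variants[s]))
```

Generating variants only in a narrow band around the estimate keeps the number of comparisons small while tolerating some error in the estimate itself.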
Abstract
Systems and methods are provided herein relating to audio matching. Both melody fingerprints and audio-id fingerprints can be used to improve an audio matching system's resistance to pitch shifts. A melody fingerprint can be used to identify a set of potential melody matches. Varying pitch shifted audio-id reference fingerprints can be generated for audio-id fingerprints associated with the potential matches identified in melody matching. Additional pitch shifted audio-id fingerprints of a reference sample are generated and used in matching only if an audio sample has previously been matched to a melody fingerprint of the same reference sample. A reference index need not be expanded to include pitch shifted variations of each reference sample as pitch shifted variations of audio-id fingerprint reference samples are generated and used only if their associated melody fingerprint is deemed a potential match.
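The two-stage flow described in the abstract can be sketched as follows. This is a toy model under stated assumptions, not the patented implementation: the melody fingerprint is modeled as a tuple of MIDI note numbers compared by interval sequence, the audio-id fingerprint as a set of (frame, semitone bin) peaks, and the pitch shift estimate as the difference between first melody notes. The abstract does not specify any of these, and the names Reference, intervals, overlap, shifted, and match are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Melody = Tuple[int, ...]              # toy melody fingerprint: MIDI note numbers
AudioId = FrozenSet[Tuple[int, int]]  # toy audio-id fingerprint: (frame, semitone bin)


@dataclass(frozen=True)
class Reference:
    melody: Melody
    audio_id: AudioId


def intervals(m: Melody) -> Tuple[int, ...]:
    """Successive note differences; unchanged when the melody is transposed."""
    return tuple(b - a for a, b in zip(m, m[1:]))


def overlap(a: AudioId, b: AudioId) -> float:
    """Jaccard overlap between two peak sets (an assumed comparison metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def shifted(fp: AudioId, semitones: int) -> AudioId:
    """Assumed pitch shift model: a constant offset on the semitone bin."""
    return frozenset((t, b + semitones) for t, b in fp)


def match(probe_melody: Melody, probe_audio_id: AudioId,
          index: Dict[str, Reference], threshold: float = 0.8) -> List[str]:
    # Stage 1: melody matching selects the set of potential matches.
    candidates = [
        rid for rid, ref in index.items()
        if probe_melody
        and len(ref.melody) == len(probe_melody)
        and intervals(ref.melody) == intervals(probe_melody)
    ]
    confirmed = []
    for rid in candidates:
        ref = index[rid]
        # Assumed estimator: transposition between the first melody notes.
        estimate = probe_melody[0] - ref.melody[0]
        # Stage 2: shifted variants are built on demand, for candidates only,
        # so the stored index holds no pitch shifted copies.
        variants = [ref.audio_id] + [shifted(ref.audio_id, estimate + d) for d in (-1, 0, 1)]
        if max(overlap(probe_audio_id, v) for v in variants) >= threshold:
            confirmed.append(rid)
    return confirmed
```

In this sketch a reference with the same melody but a different recording passes stage 1 and is rejected in stage 2, which is the role the abstract assigns to the audio-id comparison.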
15 Claims
1. A system comprising:
a memory that has stored thereon computer executable components; and
a processor that executes the following computer executable components stored in the memory:
an input component that receives a video sample;
a fingerprint component that generates a melody fingerprint and an audio-id fingerprint based on an audio track of the video sample;
a melody matching component that identifies a set of potential audio matches for the audio track based on comparing the melody fingerprint to reference melody fingerprints for the potential audio matches of the set;
an audio-id matching component that identifies reference audio-id fingerprints respectively associated with the potential audio matches of the set;
a pitch shift evaluation component that determines an estimated amount of pitch shift between the audio track and the reference audio-id fingerprints; and
a pitch variation component that generates sets of pitch modified fingerprints for each of the reference audio-id fingerprints based on the estimated amount of pitch shift,
wherein the audio-id matching component identifies a subset of the set of the potential audio matches based on comparing the audio-id fingerprint to the reference audio-id fingerprints and the sets of the pitch modified fingerprints.
Dependent claims: 2, 3, 4, 9
5. A method, comprising:
using a processor to execute computer executable components stored on a computer readable medium to perform the following acts:
receiving a video sample;
generating a melody fingerprint based on an audio track of the video sample;
generating an audio-id fingerprint based on the audio track of the video sample;
identifying a set of potential audio matches for the audio track based on comparing the melody fingerprint to reference melody fingerprints for the potential audio matches of the set;
identifying a set of reference audio-id fingerprints respectively associated with the potential audio matches of the set;
determining an estimated amount of pitch shift between the audio track and the reference audio-id fingerprints;
generating sets of pitch modified fingerprints for each of the reference audio-id fingerprints based on the estimated amount of pitch shift; and
identifying a subset of the set of the potential audio-id matches based on comparing the audio-id fingerprint to the reference audio-id fingerprints and the sets of the pitch modified fingerprints.
Dependent claims: 6, 7, 8, 10
11. A tangible computer-readable storage medium comprising computer-readable instructions that, in response to execution, cause a computing system to perform operations, comprising:
receiving, from a device, a melody fingerprint and an audio-id fingerprint associated with an audio track;
identifying a set of potential audio matches for the audio track based on comparing the melody fingerprint to reference melody fingerprints for the potential audio matches of the set;
identifying a set of reference audio-id fingerprints respectively associated with the potential audio matches of the set;
determining an estimated amount of pitch shift between the audio track and the reference audio-id fingerprints;
generating sets of pitch modified fingerprints for each of the reference audio-id fingerprints based on the estimated amount of pitch shift; and
identifying a subset of the set of the potential audio-id matches based on comparing the audio-id fingerprint to the reference audio-id fingerprints and the sets of the pitch modified fingerprints.
Dependent claims: 12, 13, 14, 15
Specification