Audio fingerprinting

US 10,229,689 B2
Filed: 01/27/2016
Issued: 03/12/2019
Est. Priority Date: 12/16/2013
Status: Active Grant

Abstract

A machine may be configured to generate one or more audio fingerprints of one or more segments of audio data. The machine may access audio data to be fingerprinted and divide the audio data into segments. For any given segment, the machine may generate a spectral representation from the segment; generate a vector from the spectral representation; generate an ordered set of permutations of the vector; generate an ordered set of numbers from the permutations of the vector; and generate a fingerprint of the segment of the audio data, which may be considered a sub-fingerprint of the audio data. In addition, the machine or a separate device may be configured to determine a likelihood that candidate audio data matches reference audio data.
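The per-segment flow described in the abstract can be illustrated with a minimal Python sketch. Nothing here is taken from the patent's specification: the function names, the naive DFT, the segment length of 16 samples, the choice of the two strongest bins per frequency group, and the seeded random permutations are all assumptions made for illustration.

```python
import cmath
import random

def spectrum(segment):
    """Energy per frequency bin via a naive DFT (assumed spectral representation)."""
    n = len(segment)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(segment))) ** 2
        for k in range(n // 2)
    ]

def binary_vector(energies, top_k=2):
    """Split bins into a low and a high group; mark the top_k strongest bins in each with 1."""
    half = len(energies) // 2
    vec = [0] * len(energies)
    for start, stop in ((0, half), (half, len(energies))):
        ranked = sorted(range(start, stop), key=lambda i: energies[i], reverse=True)
        for i in ranked[:top_k]:
            vec[i] = 1
    return vec

def make_permutations(length, count, seed=0):
    """Ordered set of index permutations; the fixed seed is an assumption for repeatability."""
    rng = random.Random(seed)
    perms = []
    for _ in range(count):
        p = list(range(length))
        rng.shuffle(p)
        perms.append(p)
    return perms

def sub_fingerprint(vec, perms):
    """For each permutation, the lowest position that holds a 1 after rearranging vec."""
    return [min(pos for pos, src in enumerate(p) if vec[src] == 1) for p in perms]

def fingerprint(audio, segment_len=16, n_perms=8):
    """One sub-fingerprint (a list of numbers) per non-overlapping segment."""
    perms = make_permutations(segment_len // 2, n_perms)
    return [
        sub_fingerprint(binary_vector(spectrum(audio[s:s + segment_len])), perms)
        for s in range(0, len(audio) - segment_len + 1, segment_len)
    ]
```

In this sketch, candidate and reference audio would need to be fingerprinted with the same ordered set of permutations, otherwise the resulting numbers are not comparable.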

25 Claims

1. A method comprising:
- accessing, by executing an instruction with at least one processor, spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for ones of a plurality of frequencies;
  
  determining, by executing an instruction with the at least one processor, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying, by executing an instruction with the at least one processor, a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying, by executing an instruction with the at least one processor, a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by executing an instruction with the at least one processor, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by executing an instruction with the at least one processor, a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating, by executing an instruction with the at least one processor, a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  reducing a computational overhead by generating, by executing an instruction with the at least one processor, a fingerprint of the audio data based on the sequence of numbers.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16)
- - 2. The method of claim 1, wherein:
    - the first common value and the second common value are equal to a shared common value; and
      
      the creating of the vector assigns the shared common value to frequencies in the first and second subgroups of frequencies.
  - 3. The method of claim 1, wherein:
    - the creating of the vector creates a binary vector; and
      
      the first common value and the second common value are equal to unity.
  - 4. The method of claim 1, wherein:
    - frequencies in the spectral data include a different ordinal position within the spectral data; and
      
      the method further includes:
      
      prior to creating the vector, weighting ones of the energy values based on an ordinal position of its corresponding frequency in the spectral data.
  - 5. The method of claim 4, wherein:
    - the weighting of ones of the energy values includes multiplying the energy value by a corresponding weight factor that indicates the ordinal position of its corresponding frequency in the spectral data.
  - 6. The method of claim 5, wherein:
    - for ones of the energy values, the corresponding weight factor is a square root of the ordinal position of the corresponding frequency.
  - 7. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies is based on ranked energy values for the first group of frequencies; and
      
      the identifying of the second subgroup of frequencies is based on ranked energy values for the second group of frequencies.
  - 8. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies includes ranking energy values for the first group of frequencies; and
      
      the identifying of the second subgroup of frequencies includes ranking energy values for the second group of frequencies.
  - 9. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies includes, within the first group of frequencies, identifying frequencies whose energy values are within 0.5% of a maximum energy value for frequencies in the first group; and
      
      the identifying of the second subgroup of frequencies includes, within the second group of frequencies, identifying frequencies whose energy values are within 0.5% of a maximum energy value for frequencies in the second group.
  - 10. The method of claim 1, wherein:
    - the generating of the sequence of permutations generates an ordered plurality of unique permutations that arrange the vector differently.
  - 11. The method of claim 1, wherein:
    - the generating of the sequence of numbers generates numbers based on a lowest position of any instance of the first or second common value in the corresponding permutation.
  - 12. The method of claim 1, wherein:
    - the generating of the sequence of numbers generates numbers by calculating a remainder from a modulo operation performed on a numerical representation of a lowest position occupied by any instance of the first or second common values in the corresponding permutation.
  - 13. The method of claim 1, wherein:
    - the generating of the fingerprint of the audio data includes storing the sequence of numbers with a timestamp that indicates the audio data being fingerprinted.
  - 14. The method of claim 1, wherein:
    - the generating of the fingerprint of the audio data includes storing ones of multiple portions of the sequence of numbers in a different corresponding hash table among multiple hash tables that correspond to a timestamp that indicates the audio data being fingerprinted.
  - 15. The method of claim 1, wherein:
    - the fingerprint of the audio data is a first reference fingerprint of a first reference audio segment that precedes a second reference audio segment within reference media data; and
      
      the method further includes:
      
      generating a second reference fingerprint of the second reference audio segment;
      
      generating a first candidate fingerprint of a first candidate audio segment within candidate media data;
      
      generating a second candidate fingerprint of a second candidate audio segment that follows the first candidate audio segment within the candidate media data; and
      
      determining a likelihood that the candidate media data matches the reference media data, the likelihood being determined based on a first comparison of the first candidate fingerprint to the first reference fingerprint, based on a second comparison of the second candidate fingerprint to the second reference fingerprint, and based on the first reference audio segment preceding the second reference audio segment in conjunction with the first candidate audio segment preceding the second candidate audio segment.
  - 16. The method of claim 15, wherein:
    - determining that the first reference audio segment precedes the second reference audio segment by a time span by which the first candidate audio segment precedes the second candidate audio segment; and
      
      the determining of the likelihood that the candidate media data matches the reference media data is based on the first reference audio segment preceding the second reference audio segment by the time span by which the first candidate audio segment precedes the second candidate audio segment.
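Claims 11 through 14 can be read together as a min-hash step followed by storage. The sketch below is one possible reading, not the patented implementation: the modulus of 256, the equal-width split of the number sequence, and the names `minhash_numbers` and `store` are hypothetical.

```python
from collections import defaultdict

def lowest_one_position(vec, perm):
    """Lowest position occupied by a 1 once vec is rearranged by perm (claim 11)."""
    return min(pos for pos, src in enumerate(perm) if vec[src] == 1)

def minhash_numbers(vec, perms, modulus=256):
    """Remainder of each lowest position under a modulo operation (claim 12)."""
    return [lowest_one_position(vec, p) % modulus for p in perms]

def store(tables, numbers, timestamp):
    """Store each portion of the sequence in its own hash table with the timestamp (claim 14)."""
    width = len(numbers) // len(tables)
    for i, table in enumerate(tables):
        key = tuple(numbers[i * width:(i + 1) * width])
        table[key].append(timestamp)

vec = [0, 1, 0, 0, 1, 0]
perms = [[3, 0, 4, 1, 2, 5], [1, 2, 3, 4, 5, 0], [5, 3, 2, 0, 4, 1], [2, 0, 5, 3, 1, 4]]
numbers = minhash_numbers(vec, perms)        # [2, 0, 4, 4]
tables = [defaultdict(list), defaultdict(list)]
store(tables, numbers, timestamp=1.5)        # tables[0][(2, 0)] == [1.5]
```

Splitting the sequence across several tables means a lookup can succeed when only some portions of a candidate's sequence agree with a reference, which is a common motivation for this layout.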

17. A non-transitory machine-readable storage medium comprising instructions that, when executed by at least one processor of a machine, cause the machine to perform operations including:
- accessing spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for ones of a plurality of frequencies;
  
  determining, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  reducing a computational overhead by generating a fingerprint of the audio data based on the sequence of numbers.
- Dependent Claims (18)
- - 18. The non-transitory machine-readable storage medium of claim 17, wherein:
    - ones of the energy values in the spectral data have a different ordinal position within the spectral data; and
      
      the operations further include:
      
      prior to creating the vector, weighting an energy value based on an ordinal position of its corresponding frequency in the spectral data.
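The weighting by ordinal position recited here, and the square-root factor recited in claim 6, might look like the following sketch. Counting positions from 1 is an assumption (a zero-based count would erase the first energy value), and `weight_energies` is a hypothetical name.

```python
import math

def weight_energies(energies):
    """Multiply each energy value by the square root of its 1-based ordinal position."""
    return [e * math.sqrt(k) for k, e in enumerate(energies, start=1)]

energies = [9.0, 1.0, 1.0, 5.0]
weighted = weight_energies(energies)   # approximately [9.0, 1.414, 1.732, 10.0]
```

In this example the strongest bin moves from the first position to the fourth after weighting, which shows how the weighting can change which frequencies end up in a subgroup.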

19. A system comprising:
- one or more processors; and
  
  a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations including:
  
  accessing spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for ones of a plurality of frequencies;
  
  determining from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating a fingerprint of the audio data based on the sequence of numbers to reduce a computational overhead.
- Dependent Claims (20)
- - 20. The system of claim 19, wherein:
    - the fingerprint of the audio data is a first reference fingerprint of a first reference audio segment that precedes a second reference audio segment within reference media data; and
      
      the operations further include:
      
      generating a second reference fingerprint of the second reference audio segment;
      
      generating a first candidate fingerprint of a first candidate audio segment within candidate media data;
      
      generating a second candidate fingerprint of a second candidate audio segment that follows the first candidate audio segment within the candidate media data; and
      
      determining a likelihood that the candidate media data matches the reference media data, the likelihood being determined based on a first comparison of the first candidate fingerprint to the first reference fingerprint, based on a second comparison of the second candidate fingerprint to the second reference fingerprint, and based on the first reference audio segment preceding the second reference audio segment in conjunction with the first candidate audio segment preceding the second candidate audio segment.
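The likelihood determination in claims 15 and 20 combines two fingerprint comparisons with a check on segment order. The claims do not state a scoring formula, so the sketch below assumes one: the fraction of agreeing numbers as the comparison, and the product of the two comparisons as the likelihood, forced to zero when the segment order disagrees. All names are hypothetical.

```python
def similarity(fp_a, fp_b):
    """Fraction of positions at which two fingerprints hold the same number."""
    if not fp_a or not fp_b:
        return 0.0
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / max(len(fp_a), len(fp_b))

def match_likelihood(ref, cand):
    """ref and cand are each two (timestamp, fingerprint) pairs, first segment then second."""
    (rt1, rf1), (rt2, rf2) = ref
    (ct1, cf1), (ct2, cf2) = cand
    # The order of the candidate segments must agree with the order of the reference segments.
    if (rt1 < rt2) != (ct1 < ct2):
        return 0.0
    return similarity(cf1, rf1) * similarity(cf2, rf2)

ref = [(0.0, [1, 2, 3, 4]), (1.0, [5, 6, 7, 8])]
cand = [(10.0, [1, 2, 3, 0]), (11.0, [5, 6, 7, 8])]
score = match_likelihood(ref, cand)   # 0.75
```

A stricter variant in the spirit of claim 16 would also require `rt2 - rt1` to equal `ct2 - ct1`, possibly within a tolerance.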

21. A method of identifying an unknown audio item represented by audio data, the method comprising:
- determining, by executing an instruction with at least one processor, spectral data from the audio data, the spectral data indicating a separate energy value for ones of a plurality of frequencies;
  
  identifying, by executing an instruction with the at least one processor, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying, by executing an instruction with the at least one processor, a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying, by executing an instruction with the at least one processor, a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by executing an instruction with the at least one processor, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by executing an instruction with the at least one processor, a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating, by executing an instruction with the at least one processor, a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  reducing a computational overhead by generating, by executing an instruction with the at least one processor, a query fingerprint of the audio data based on the sequence of numbers.
- Dependent Claims (22, 23)
- - 22. The method of claim 21, further comprising:
    - searching a database of reference fingerprints; and
      
      identifying a match between the query fingerprint and a reference fingerprint among the reference fingerprints, the match being identified based on a comparison of the query fingerprint and the reference fingerprint.
  - 23. The method of claim 21, wherein:
    - each frequency in the spectral data has a different ordinal position within the spectral data; and
      
      the method further includes:
      
      prior to creating the vector, weighting ones of the energy values based on an ordinal position of its corresponding frequency in the spectral data.
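The database search and match identification of claim 22 can be sketched as a linear scan over reference fingerprints. This is an illustration under assumptions only: the dictionary layout, the agreement-fraction score, the `min_score` threshold of 0.5, and the name `best_match` are not from the patent, and a production system would more plausibly use hash-table lookups than a full scan.

```python
def best_match(query, references, min_score=0.5):
    """Return (name, score) of the closest reference, or None if nothing passes min_score."""
    if not query:
        return None
    best = None
    for name, ref in references.items():
        # Compare the query fingerprint with the reference fingerprint position by position.
        score = sum(1 for q, r in zip(query, ref) if q == r) / len(query)
        if score >= min_score and (best is None or score > best[1]):
            best = (name, score)
    return best

references = {
    "song_a": [2, 0, 4, 4, 1, 3],
    "song_b": [2, 5, 4, 0, 1, 1],
}
result = best_match([2, 0, 4, 4, 1, 0], references)   # ("song_a", 5/6)
```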

24. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations including:
- determining spectral data from audio data, the spectral data indicating a separate energy value for ones of a plurality of frequencies;
  
  identifying, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  reducing a computational overhead by generating a query fingerprint of the audio data based on the sequence of numbers.

25. A device comprising:
- one or more processors; and
  
  a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the device to perform operations including:
  
  determining spectral data from audio data, the spectral data indicating a separate energy value for ones of a plurality of frequencies;
  
  identifying, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, the permutations differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  reducing a computational overhead by generating a query fingerprint of the audio data based on the sequence of numbers.

Current Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Original Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Inventors: Han, Jinyu; Coover, Robert
Primary Examiner(s): McCord, Paul C
Application Number: US15/008,042
Publication Number: US 20160217799A1
Time in Patent Office: 1,140 Days
Field of Search: 700/94
CPC Class Codes: G10L 19/018 Audio watermarking, i.e. em...
