AUDIO FINGERPRINTING

US 20160217799A1
Filed: 01/27/2016
Published: 07/28/2016
Est. Priority Date: 12/16/2013
Status: Active Grant

Abstract

A machine may be configured to generate one or more audio fingerprints of one or more segments of audio data. The machine may access audio data to be fingerprinted and divide the audio data into segments. For any given segment, the machine may generate a spectral representation from the segment; generate a vector from the spectral representation; generate an ordered set of permutations of the vector; generate an ordered set of numbers from the permutations of the vector; and generate a fingerprint of the segment of the audio data, which may be considered a sub-fingerprint of the audio data. In addition, the machine or a separate device may be configured to determine a likelihood that candidate audio data matches reference audio data.
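
The first two steps of the abstract (dividing the audio into segments and deriving a spectral representation of each segment) can be sketched as follows. This is only an illustration under stated assumptions: the segment length, the hop size, the naive DFT, and the function names `split_into_segments` and `energy_spectrum` are hypothetical choices and are not taken from the patent.

```python
import cmath
import math

def split_into_segments(samples, segment_len, hop):
    """Divide audio samples into fixed-length segments (hop is an assumed parameter)."""
    return [samples[i:i + segment_len]
            for i in range(0, len(samples) - segment_len + 1, hop)]

def energy_spectrum(segment):
    """Return one energy value per frequency bin, computed with a naive DFT."""
    n = len(segment)
    energies = []
    for k in range(n // 2):  # keep the bins below the Nyquist frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(segment))
        energies.append(abs(coeff) ** 2)
    return energies

# Hypothetical input: a pure tone that completes 4 cycles per 32-sample segment.
samples = [math.sin(2 * math.pi * 4 * t / 32) for t in range(96)]
segments = split_into_segments(samples, segment_len=32, hop=32)
spectra = [energy_spectrum(seg) for seg in segments]

# Index of the strongest bin in each segment's spectrum.
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
```

Each entry of `spectra` plays the role of the "spectral representation" of one segment; a practical system would more likely use an FFT routine than this quadratic-time loop.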

25 Claims

1. A method comprising:
- accessing, by one or more processors, spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for each of a plurality of frequencies;
  
  determining, by the one or more processors, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by the one or more processors, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by the one or more processors, a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating, by the one or more processors, a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating, by the one or more processors, a fingerprint of the audio data based on the sequence of numbers.
- - 2. The method of claim 1, wherein:
    - the first common value and the second common value are equal to a shared common value; and
      
      the creating of the vector assigns the shared common value to frequencies in the first and second subgroups of frequencies.
  - 3. The method of claim 1, wherein:
    - the creating of the vector creates a binary vector; and
      
      the first common value and the second common value are equal to unity.
  - 4. The method of claim 1, wherein:
    - each frequency in the spectral data has a different ordinal position within the spectral data; and
      
      the method further comprises:
      
      prior to creating the vector, weighting each energy value based on the ordinal position of its corresponding frequency in the spectral data.
  - 5. The method of claim 4, wherein:
    - the weighting of each energy value includes, for each energy value, multiplying the energy value by a corresponding weight factor that indicates the ordinal position of its corresponding frequency in the spectral data.
  - 6. The method of claim 5, wherein:
    - for each energy value, the corresponding weight factor is a square root of the ordinal position of the corresponding frequency.
  - 7. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies is based on ranked energy values for the first group of frequencies; and
      
      the identifying of the second subgroup of frequencies is based on ranked energy values for the second group of frequencies.
  - 8. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies includes ranking energy values for the first group of frequencies; and
      
      the identifying of the second subgroup of frequencies includes ranking energy values for the second group of frequencies.
  - 9. The method of claim 1, wherein:
    - the identifying of the first subgroup of frequencies includes, within the first group of frequencies, identifying frequencies whose energy values are within 0.5% of a maximum energy value for frequencies in the first group; and
      
      the identifying of the second subgroup of frequencies includes, within the second group of frequencies, identifying frequencies whose energy values are within 0.5% of a maximum energy value for frequencies in the second group.
  - 10. The method of claim 1, wherein:
    - the generating of the sequence of permutations generates an ordered plurality of unique permutations that each arrange the vector differently.
  - 11. The method of claim 1, wherein:
    - the generating of the sequence of numbers generates each number based on a lowest position of any instance of the first or second common value in the corresponding permutation.
  - 12. The method of claim 1, wherein:
    - the generating of the sequence of numbers generates each number by calculating a remainder from a modulo operation performed on a numerical representation of the lowest position occupied by any instance of the first or second common values in the corresponding permutation.
  - 13. The method of claim 1, wherein:
    - the generating of the fingerprint of the audio data includes storing the sequence of numbers with a timestamp that indicates the audio data being fingerprinted.
  - 14. The method of claim 1, wherein:
    - the generating of the fingerprint of the audio data includes storing each of multiple portions of the sequence of numbers in a different corresponding hash table among multiple hash tables that correspond to a timestamp that indicates the audio data being fingerprinted.
  - 15. The method of claim 1, wherein:
    - the fingerprint of the audio data is a first reference fingerprint of a first reference audio segment that precedes a second reference audio segment within reference media data; and
      
      the method further comprises:
      
      generating a second reference fingerprint of the second reference audio segment;
      
      generating a first candidate fingerprint of a first candidate audio segment within candidate media data;
      
      generating a second candidate fingerprint of a second candidate audio segment that follows the first candidate audio segment within the candidate media data; and
      
      determining a likelihood that the candidate media data matches the reference media data, the likelihood being determined based on a first comparison of the first candidate fingerprint to the first reference fingerprint, based on a second comparison of the second candidate fingerprint to the second reference fingerprint, and based on the first reference audio segment preceding the second reference audio segment in conjunction with the first candidate audio segment preceding the second candidate audio segment.
  - 16. The method of claim 1, wherein:
    - determining that the first reference audio segment precedes the second reference audio segment by a time span by which the first candidate audio segment precedes the second candidate audio segment; and
      
      the determining of the likelihood that the candidate media data matches the reference media data is based on the first reference audio segment preceding the second reference audio segment by the time span by which the first candidate audio segment precedes the second candidate audio segment.
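
The vector, permutation, and number-sequence steps recited in claim 1 resemble a min-hash over a sparse binary vector. The sketch below is one possible reading, not the patented implementation. It assumes a binary vector (as in claim 3), seeded random shuffles as the ordered permutations (claim 10), and the lowest position of a 1 reduced by a modulo operation (claims 11 and 12). The names `top_bins`, `binary_vector`, `make_permutations`, `fingerprint`, the split point, and the modulus of 256 are all hypothetical.

```python
import random

def top_bins(energies, bins, count):
    """Return the `count` bins in `bins` that have the highest energy values."""
    return sorted(bins, key=lambda b: energies[b], reverse=True)[:count]

def binary_vector(energies, split, per_group):
    """Mark the strongest bins of the low and high frequency groups with 1."""
    n = len(energies)
    low_group, high_group = range(0, split), range(split, n)
    chosen = set(top_bins(energies, low_group, per_group))
    chosen |= set(top_bins(energies, high_group, per_group))
    return [1 if b in chosen else 0 for b in range(n)]

def make_permutations(length, how_many, seed):
    """Build a fixed, ordered list of index permutations (seeded for repeatability).

    Random shuffles are not guaranteed to be unique; a stricter version
    would reject repeats.
    """
    rng = random.Random(seed)
    perms = []
    for _ in range(how_many):
        order = list(range(length))
        rng.shuffle(order)
        perms.append(order)
    return perms

def fingerprint(vector, perms, modulus=256):
    """For each permutation, record the lowest position holding a 1, modulo `modulus`."""
    numbers = []
    for order in perms:
        permuted = [vector[i] for i in order]
        numbers.append(permuted.index(1) % modulus)
    return numbers

# Hypothetical 8-bin spectrum: bins 0-3 form the low group, bins 4-7 the high group.
energies = [0.1, 0.9, 0.2, 0.8, 0.1, 0.3, 0.7, 0.2]
vector = binary_vector(energies, split=4, per_group=1)
perms = make_permutations(len(vector), how_many=4, seed=0)
fp = fingerprint(vector, perms)
```

The same permutations must be reused for every segment and every audio item, which is why the sketch seeds the generator; fingerprints built from different permutation sets would not be comparable.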

17. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
- accessing spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for each of a plurality of frequencies;
  
  determining, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating a fingerprint of the audio data based on the sequence of numbers.
- - 18. The non-transitory machine-readable storage medium of claim 17, wherein:
    - each frequency in the spectral data has a different ordinal position within the spectral data; and
      
      the operations further comprise:
      
      prior to creating the vector, weighting each energy value based on the ordinal position of its corresponding frequency in the spectral data.
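
The weighting step of claim 18 (and of claims 4 to 6) can be illustrated with a short sketch. It uses the square-root weight factor recited in claim 6 and assumes that ordinal positions are counted from 1, which the claims do not state; the function name `weight_by_position` is hypothetical.

```python
import math

def weight_by_position(energies):
    """Multiply each energy value by the square root of its 1-based ordinal position."""
    return [e * math.sqrt(pos) for pos, e in enumerate(energies, start=1)]

# A flat hypothetical spectrum becomes a rising one after weighting.
energies = [4.0, 4.0, 4.0, 4.0]
weighted = weight_by_position(energies)
```

Counting from 1 rather than 0 keeps the first bin from being zeroed out. Because the weight grows with position, higher-frequency bins are boosted relative to lower ones before the strongest bins in each group are selected.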

19. A system comprising:
- one or more processors; and
  
  a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the system to perform operations comprising:
  
  accessing spectral data stored in a database, the spectral data being derived from audio data and indicating a separate energy value for each of a plurality of frequencies;
  
  determining from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating a fingerprint of the audio data based on the sequence of numbers.
- - 20. The system of claim 19, wherein:
    - the fingerprint of the audio data is a first reference fingerprint of a first reference audio segment that precedes a second reference audio segment within reference media data; and
      
      the method further comprises:
      
      generating a second reference fingerprint of the second reference audio segment;
      
      generating a first candidate fingerprint of a first candidate audio segment within candidate media data;
      
      generating a second candidate fingerprint of a second candidate audio segment that follows the first candidate audio segment within the candidate media data; and
      
      determining a likelihood that the candidate media data matches the reference media data, the likelihood being determined based on a first comparison of the first candidate fingerprint to the first reference fingerprint, based on a second comparison of the second candidate fingerprint to the second reference fingerprint, and based on the first reference audio segment preceding the second reference audio segment in conjunction with the first candidate audio segment preceding the second candidate audio segment.
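
One way to picture the likelihood determination of claim 20 is to compare sub-fingerprints pairwise in time order, so that both the individual comparisons and the ordering of the segments affect the result. The scoring rule below (mean fraction of agreeing positions) is an assumption made for illustration only; the claim does not prescribe a formula, and the names `similarity` and `match_likelihood` are hypothetical.

```python
def similarity(fp_a, fp_b):
    """Fraction of positions at which two equal-length fingerprints agree."""
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / len(fp_a)

def match_likelihood(reference, candidate):
    """Score two lists of (timestamp, fingerprint) pairs.

    Both lists are sorted by timestamp and compared pairwise, so a candidate
    whose segments occur in a different order than the reference scores lower.
    """
    ref = sorted(reference)
    cand = sorted(candidate)
    scores = [similarity(r_fp, c_fp) for (_, r_fp), (_, c_fp) in zip(ref, cand)]
    return sum(scores) / len(scores)

# Hypothetical sub-fingerprints for two consecutive reference segments.
reference = [(0.0, [3, 1, 4, 1]), (1.0, [5, 9, 2, 6])]
same_order = [(10.0, [3, 1, 4, 1]), (11.0, [5, 9, 2, 6])]
swapped = [(10.0, [5, 9, 2, 6]), (11.0, [3, 1, 4, 1])]

likelihood_same = match_likelihood(reference, same_order)
likelihood_swapped = match_likelihood(reference, swapped)
```

In this toy data the candidate with segments in the same order scores 1.0 and the swapped candidate scores 0.0, even though both contain exactly the same sub-fingerprints.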

21. A method of identifying an unknown audio item represented by audio data, the method comprising:
- determining, by one or more processors, spectral data from the audio data, the spectral data indicating a separate energy value for each of a plurality of frequencies;
  
  identifying, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by the one or more processors, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by the one or more processors, a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating, by the one or more processors, a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating, by the one or more processors, a query fingerprint of the audio data based on the sequence of numbers.
- - 22. The method of claim 21, further comprising:
    - searching a database of reference fingerprints; and
      
      identifying a match between the query fingerprint and a reference fingerprint among the reference fingerprints, the match being identified based on a comparison of the query fingerprint and the reference fingerprint.
  - 23. The method of claim 21, wherein:
    - each frequency in the spectral data has a different ordinal position within the spectral data; and
      
      the method further comprises:
      
      prior to creating the vector, weighting each energy value based on the ordinal position of its corresponding frequency in the spectral data.
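
The database search of claim 22 could be organized in many ways. The sketch below borrows the idea from claim 14 of storing portions of the number sequence in separate hash tables, then compares the query only against references that share at least one portion. The class `ReferenceIndex`, the helper `bands`, the band size, and the agreement score are hypothetical and serve only as a toy in-memory stand-in for a real database.

```python
from collections import defaultdict

def bands(fingerprint, band_size):
    """Split a fingerprint into consecutive fixed-size portions (as tuples)."""
    return [tuple(fingerprint[i:i + band_size])
            for i in range(0, len(fingerprint), band_size)]

class ReferenceIndex:
    """Toy in-memory stand-in for a database of reference fingerprints."""

    def __init__(self, band_size):
        self.band_size = band_size
        # One hash table per band index: portion -> set of item ids.
        self.tables = defaultdict(lambda: defaultdict(set))
        self.fingerprints = {}

    def add(self, item_id, fingerprint):
        self.fingerprints[item_id] = fingerprint
        for n, portion in enumerate(bands(fingerprint, self.band_size)):
            self.tables[n][portion].add(item_id)

    def best_match(self, query):
        """Return (item_id, agreement) for the closest reference sharing a portion, or None."""
        candidates = set()
        for n, portion in enumerate(bands(query, self.band_size)):
            candidates |= self.tables[n].get(portion, set())
        if not candidates:
            return None

        def agreement(item_id):
            ref = self.fingerprints[item_id]
            return sum(a == b for a, b in zip(ref, query)) / len(query)

        best = max(candidates, key=agreement)
        return best, agreement(best)

# Hypothetical reference fingerprints and a partially matching query.
index = ReferenceIndex(band_size=2)
index.add("song-a", [3, 1, 4, 1])
index.add("song-b", [2, 7, 1, 8])
hit = index.best_match([3, 1, 9, 9])
miss = index.best_match([0, 0, 0, 0])
```

Looking up portions first avoids comparing the query against every stored fingerprint; only references that collide in at least one table are scored in full.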

24. A non-transitory machine-readable storage medium comprising instructions that, when executed by one or more processors of a machine, cause the machine to perform operations comprising:
- determining, by one or more processors, spectral data from the audio data, the spectral data indicating a separate energy value for each of a plurality of frequencies;
  
  identifying, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by the one or more processors, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by the one or more processors, a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating, by the one or more processors, a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating, by the one or more processors, a query fingerprint of the audio data based on the sequence of numbers.

25. A device comprising:
- one or more processors; and
  
  a memory storing instructions that, when executed by at least one processor among the one or more processors, cause the device to perform operations comprising:
  
  determining, by one or more processors, spectral data from the audio data, the spectral data indicating a separate energy value for each of a plurality of frequencies;
  
  identifying, from the spectral data, a first group of frequencies and a second group of frequencies in the plurality of frequencies, the first group of frequencies including frequencies that are higher than frequencies in the second group of frequencies;
  
  in the first group of frequencies, identifying a first subgroup of frequencies wherein frequencies in the first subgroup have energy values that are higher than energy values of other frequencies in the first group;
  
  in the second group of frequencies, identifying a second subgroup of frequencies wherein frequencies in the second subgroup have energy values that are higher than energy values of other frequencies in the second group;
  
  creating, by the one or more processors, a vector that assigns a first common value to frequencies in the first subgroup and assigns a second common value to frequencies in the second subgroup;
  
  generating, by the one or more processors, a sequence of permutations of the vector, each permutation differently arranging instances of the first and second common values;
  
  generating, by the one or more processors, a sequence of numbers that each indicate a position of an instance of the first common value or of the second common value within a corresponding permutation among the permutations; and
  
  generating, by the one or more processors, a query fingerprint of the audio data based on the sequence of numbers.

Current Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Original Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Inventors: Han, Jinyu; Coover, Robert
Granted Patent: US 10,229,689 B2
US Class Current: 1/1
CPC Class Codes: G10L 19/018 Audio watermarking, i.e. em...