System and methods for continuous audio matching

US 10,055,490 B2
Filed: 06/14/2016
Issued: 08/21/2018
Est. Priority Date: 07/29/2010
Status: Active Grant

Abstract

The present invention relates to the continuous monitoring of an audio signal and identification of audio items within an audio signal. The technology disclosed utilizes predictive caching of fingerprints to improve efficiency. Fingerprints are cached for tracking an audio signal with known alignment and for watching an audio signal without known alignment, based on already identified fingerprints extracted from the audio signal. Software running on a smart phone or other battery-powered device cooperates with software running on an audio identification server.

225 Citations

10 Claims

1. A non-transitory computer readable medium storing code that, when executed by one or more processors, causes the one or more processors to:
- send an audio query to a server;
- responsive to the server matching the audio query with a reference item in a database, receive, from the server, an audio fingerprint sequence and an audio identifier associated with a predicted reference audio item;
- update a watching cache with the audio fingerprint sequence and the associated audio identifier;
- extract an input audio fingerprint from an audio signal; and
- match the input audio fingerprint extracted from the audio signal to the audio fingerprint sequence stored in the watching cache and associated with the predicted reference audio item to identify the predicted reference audio item from the audio signal.

2. The non-transitory computer readable medium of claim 1 further comprising code that, when executed by one or more processors, causes the one or more processors to:
- responsive to the server matching the audio query with the reference item in the database, receive, from the server, a targeted ad related to audio content that a user is experiencing; and
- responsive to a failure of the matching of the input audio fingerprint to the audio fingerprint sequence, placing the targeted ad to the user on a device user interface.
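
The client-side flow recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the patented implementation: the claims do not say how fingerprints are represented or compared, so the integer fingerprints, the Hamming-distance threshold, and the names `WatchingCache` and `handle_frame` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WatchingCache:
    """Hypothetical watching cache: audio identifier -> reference fingerprint sequence."""
    entries: dict = field(default_factory=dict)

    def update(self, audio_id: str, fingerprints: list) -> None:
        # Store the sequence received from the server for a predicted item.
        self.entries[audio_id] = list(fingerprints)

    def match(self, input_fp: int, max_bit_errors: int = 2) -> Optional[str]:
        # Assumption: fingerprints are integers compared by Hamming distance.
        # No alignment is known, so every cached position is checked.
        for audio_id, sequence in self.entries.items():
            for ref_fp in sequence:
                if bin(input_fp ^ ref_fp).count("1") <= max_bit_errors:
                    return audio_id
        return None


def handle_frame(cache: WatchingCache, input_fp: int, targeted_ad: str):
    """Identify the predicted item, or fall back to the ad on a failed match."""
    audio_id = cache.match(input_fp)
    if audio_id is not None:
        return ("identified", audio_id)
    return ("show_ad", targeted_ad)
```

In this sketch the ad is assumed to have arrived from the server together with the fingerprint sequence, and it is only surfaced when the local match fails, mirroring the condition in claim 2.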

3. A non-transitory computer readable medium storing code that, when executed by one or more processors, causes the one or more processors to:
- receive a plurality of reference audio fingerprint sequences into a tracking cache;
- select, from the plurality of received reference audio fingerprint sequences, a first candidate reference audio fingerprint sequence as a first potential match to an audio signal;
- select, from the plurality of received reference audio fingerprint sequences, a second candidate reference audio fingerprint sequence as a second potential match to the audio signal;
- maintain a first tracking alignment between a fingerprint sequence extracted from the audio signal and the first candidate reference audio fingerprint sequence;
- maintain a second tracking alignment between the fingerprint sequence extracted from the audio signal and the second candidate reference audio fingerprint sequence; and
- responsive to a failure of the first tracking alignment, resolving ambiguity by confirming that the audio signal comprises the second candidate reference audio fingerprint sequence.

4. The non-transitory computer readable medium of claim 3 further comprising code that, when executed by one or more processors, causes the one or more processors to:
- perform a readjustment of the alignment between the fingerprint sequence extracted from the audio signal and the first candidate reference audio fingerprint sequence.
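
The dual-alignment tracking of claims 3 and 4 can be sketched as follows. This is an illustrative reading only: exact-equality fingerprint comparison, the miss threshold, the readjustment window, and the names `TrackedCandidate` and `track` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class TrackedCandidate:
    """Hypothetical tracking-cache entry with a maintained alignment."""
    audio_id: str
    reference: list
    position: int = 0   # index in `reference` expected next (the alignment)
    misses: int = 0     # consecutive fingerprints that failed to align
    alive: bool = True  # False once the tracking alignment has failed

    def step(self, input_fp: int, window: int = 1, max_misses: int = 2) -> None:
        if not self.alive:
            return
        # Readjustment (claim 4): also try positions near the expected one,
        # closest first, to absorb small timing drift.
        lo = max(0, self.position - window)
        hi = min(len(self.reference), self.position + window + 1)
        for idx in sorted(range(lo, hi), key=lambda i: abs(i - self.position)):
            if self.reference[idx] == input_fp:
                self.position = idx + 1
                self.misses = 0
                return
        self.position += 1
        self.misses += 1
        if self.misses > max_misses:
            self.alive = False


def track(first: TrackedCandidate, second: TrackedCandidate, input_fps):
    """Return the second candidate's id once the first alignment fails."""
    for fp in input_fps:
        first.step(fp)
        second.step(fp)
        if not first.alive and second.alive:
            return second.audio_id
    return None  # still ambiguous, or both alignments failed
```

For example, two recordings that share an introduction would both stay aligned until they diverge, at which point the first candidate accumulates misses and the second is confirmed.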

5. A method of using a user device to monitor an audio signal and identify audio items within the audio signal, the method including:
- responsive to the user device having sent initial audio fingerprints extracted from the audio signal, identifying an initial audio item in the initial audio fingerprints;
- responsive to the identification of the initial audio item, (i) updating a cache with one or more audio fingerprint sequences received from a server, the one or more audio fingerprint sequences being from one or more audio items predicted to follow the identified initial audio item, and (ii) updating the cache with respective audio item identifiers for the one or more audio items predicted to follow the identified initial audio item; and
- matching additional audio fingerprints extracted from the audio signal to the cached one or more audio fingerprint sequences from the one or more audio items predicted to follow the identified initial audio item, to identify an audio item within the audio signal as one of the one or more audio items predicted to follow the identified initial audio item.

6. The method of claim 5, where the one or more audio fingerprint sequences from the one or more audio items are stored in a local cache on the user device.

7. The method of claim 5, wherein the one or more audio fingerprint sequences from the one or more audio items are stored in a local cache on the server.

8. The method of claim 5 wherein the one or more audio items are predicted to follow the identified initial audio item based on an observed sequence using previously identified sequences of songs in a multiplicity of audio items.

9. The method of claim 8, wherein the identified initial audio item is a first song, and the one or more audio items predicted to follow the identified initial audio item includes one or more songs on a same album as the first song.

10. The method of claim 8, wherein the identified initial audio item is a first song, and the one or more audio items predicted to follow the identified initial audio item includes one or more songs on a known playlist including the first song.
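
One possible reading of the predictive caching in claims 5 and 8 is sketched below. The claims do not specify a prediction model, so the successor-count approach, the overlap-based matching, and every function name here are assumptions for illustration rather than the method the patent describes.

```python
from collections import Counter, defaultdict


def build_successor_counts(observed_sequences):
    """Count how often each item was observed directly after another."""
    counts = defaultdict(Counter)
    for seq in observed_sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_following(initial_id, counts, top_n=2):
    """Return up to `top_n` items most often seen after `initial_id`."""
    successors = counts.get(initial_id, Counter())
    return [item for item, _ in successors.most_common(top_n)]


def update_cache(cache, predicted_ids, fingerprint_db):
    """Store fingerprint sequences keyed by their audio item identifiers."""
    for audio_id in predicted_ids:
        cache[audio_id] = fingerprint_db[audio_id]


def identify(cache, additional_fps):
    """Return the cached item sharing the most fingerprints with the input."""
    best_id, best_hits = None, 0
    observed = set(additional_fps)
    for audio_id, sequence in cache.items():
        hits = len(observed & set(sequence))
        if hits > best_hits:
            best_id, best_hits = audio_id, hits
    return best_id
```

Under this sketch, an album track list (claim 9) or a known playlist (claim 10) could simply be supplied as additional entries in `observed_sequences`; `fingerprint_db` stands in for whatever the server would return.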

Current Assignee: Soundhound AI IP LLC
Original Assignee: SoundHound Incorporated
Inventors: Mont-Reynaud, Bernard; Master, Aaron; Stonehocker, Timothy; Mohajer, Keyvan
Primary Examiner(s): Tsang, Fan
Assistant Examiner(s): Siegel, David

Application Number: US15/182,300
Publication Number: US 20160292266A1
Time in Patent Office: 798 Days

CPC Class Codes
- G06F 16/433: using audio data
- G06F 16/639: using playlists
- G06F 16/68: Retrieval characterised by ...
- G06F 16/683: using metadata automaticall...
