Identifying and suppressing interfering audio content

  • US 10,325,591 B1
  • Filed: 09/05/2014
  • Issued: 06/18/2019
  • Est. Priority Date: 09/05/2014
  • Status: Active Grant
First Claim

1. A speech-based system, comprising:

  • one or more microphones configured to produce:

    a first input audio signal containing user speech and an interfering sound from a media content item played by a media player, the media player and the user in proximity to the speech-based system and the user speech including at least one spoken command for the speech-based system; and

    a second input audio signal containing the user speech and the interfering sound from the media content item played by the media player;

    one or more processors;

    non-transitory computer-readable storage media maintaining instructions executable by the one or more processors to perform operations comprising:

    selecting the first input audio signal as a first directional audio signal corresponding to a direction of a source of the user speech;

    selecting the second input audio signal as a second directional audio signal corresponding to a direction other than the direction of the source of the user speech based at least in part on a directional audio signal corresponding in direction to a known position of the media player;

    analyzing the second input audio signal to determine at least one characteristic of content of the second input audio signal;

    requesting an identity of a player content item being currently played by the media player and a temporal point within the player content item that is currently being output by the media player;

    generating an audio signature representative of the interfering sound based at least in part on the at least one characteristic of the content of the second input audio signal;

    identifying a plurality of media content items that are currently accessible to the media player;

    selecting a particular media content item of the plurality of media content items based at least in part on the audio signature, the identity of the player content item, the temporal point, and a reference audio signature that corresponds to the particular media content item;

    receiving at least a portion of the particular media content item that corresponds to the interfering sound from a reference content source; and

    processing the first input audio signal to suppress the interfering sound based at least in part on the at least the portion of the particular media content item by subtracting the portion of the particular media content item from the first input audio in order to obtain an interference-suppressed speech; and

    sending the interference-suppressed speech to a remote service for performing automatic speech recognition and natural language understanding on the interference-suppressed speech in order to determine an intent to perform or initiate functions or services expressed by the spoken command.
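
The subtraction step recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the reference portion of the media content item is already available as a list of samples, that the interfering sound reaches the microphone as a delayed, scaled copy of that reference, and that a single delay and gain are enough to model the path. The function names (`best_lag`, `suppress_interference`, `make_demo`) and the `max_lag` parameter are hypothetical and chosen only for this illustration. A real system would more likely use an adaptive filter, as acoustic echo cancellers do.

```python
import math
import random


def best_lag(mic, ref, max_lag):
    """Return the delay in samples (0..max_lag) at which ref best matches mic."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(ref), len(mic) - lag)
        # Plain cross-correlation between the reference and the shifted input.
        score = sum(mic[lag + i] * ref[i] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best


def suppress_interference(mic, ref, max_lag=32):
    """Subtract a delayed, scaled copy of ref from mic (illustrative only).

    Assumption: the interference is modeled as gain * ref shifted by lag samples.
    Returns (cleaned_signal, estimated_lag, estimated_gain).
    """
    lag = best_lag(mic, ref, max_lag)
    n = max(0, min(len(ref), len(mic) - lag))
    # Least-squares gain that best explains the aligned reference in the input.
    num = sum(mic[lag + i] * ref[i] for i in range(n))
    den = sum(ref[i] * ref[i] for i in range(n))
    gain = num / den if den > 0 else 0.0
    cleaned = list(mic)
    for i in range(n):
        cleaned[lag + i] -= gain * ref[i]
    return cleaned, lag, gain


def make_demo(delay=7, gain=0.5, n_ref=400, max_lag=32, seed=0):
    """Build synthetic data: a quiet sine as 'speech' plus delayed noise as 'media'."""
    rng = random.Random(seed)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(n_ref)]
    speech = [0.1 * math.sin(2 * math.pi * i / 50) for i in range(n_ref + max_lag)]
    mic = list(speech)
    for i, r in enumerate(ref):
        mic[delay + i] += gain * r
    return mic, ref, speech


if __name__ == "__main__":
    mic, ref, speech = make_demo()
    cleaned, lag, gain = suppress_interference(mic, ref)
    print("estimated lag:", lag, "estimated gain:", round(gain, 3))
```

In this sketch `mic` stands in for the first input audio signal and `ref` for the received portion of the particular media content item. The returned `cleaned` list corresponds to the interference-suppressed speech that the claim sends on for speech recognition.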
