Using audio characteristics to identify speakers and media items

  • US 10,140,991 B2
  • Filed: 06/16/2017
  • Issued: 11/27/2018
  • Est. Priority Date: 11/04/2013
  • Status: Active Grant
First Claim

1. A method performed by one or more computers, the method comprising:

  • receiving, by the one or more computers, a request from a client device for media content, the request including at least a portion of a first media item or a URL corresponding to the first media item, the first media item including speech of a person;

    based on the data indicating the first media item, selecting, by the one or more computers, one or more other media items based on one or more representations of acoustic characteristics of the one or more other media items, wherein the one or more representations of acoustic characteristics of the one or more other media items comprise, for each of the one or more other media items, a speaker representation that includes (i) an i-vector or d-vector generated from the other media item, or (ii) a hash of an i-vector or d-vector generated from the other media item;

    wherein each of the one or more other media items is selected based on a comparison of (i) an i-vector, d-vector or hash determined from speech in the first media item with (ii) the speaker representation for the other media item, wherein:

    each of the selected one or more other media items is different from the first media item;

    each of the selected one or more other media items includes speech of the same person whose speech is included in the first media item; and

    each of the selected one or more other media items is determined, based on the acoustic characteristics of the media item, to include speech demonstrating speaker characteristics that have at least a threshold level of similarity with speaker characteristics determined from speech in the first media item;

    generating, by the one or more computers, data indicating the selected one or more other media items that are each different from the first media item and that each include speech of the same person whose speech is included in the first media item; and

    providing, by the one or more computers and to the client device, a response to the request that includes the data indicating the selected one or more other media items that are each different from the first media item and that each include speech of the same person whose speech is included in the first media item.
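
The selection step recited above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes speaker representations are plain lists of floats, that the comparison is cosine similarity, and that the threshold is 0.8. The claim fixes none of these choices, and the names `cosine_similarity`, `select_media_items`, and the `video_*` identifiers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_media_items(query_id, query_vector, index, threshold=0.8):
    """Return ids of indexed items whose speaker vector meets the threshold.

    `index` maps a media item id to its stored speaker vector (standing in
    for an i-vector or d-vector). The query item itself is skipped, so every
    selected item is different from the first media item.
    """
    scored = []
    for item_id, speaker_vector in index.items():
        if item_id == query_id:
            continue  # selected items must differ from the first media item
        score = cosine_similarity(query_vector, speaker_vector)
        if score >= threshold:  # assumed threshold level of similarity
            scored.append((item_id, score))
    # Most similar items first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in scored]


# Toy index with made-up three-dimensional vectors.
index = {
    "video_a": [0.9, 0.1, 0.0],
    "video_b": [0.8, 0.2, 0.1],
    "video_c": [0.0, 0.1, 0.9],
}
result = select_media_items("video_a", index["video_a"], index)
```

In this toy index only `video_b` is close enough to `video_a` to be selected. A hash-based variant, which the claim also covers, would compare hashes of the vectors instead of the vectors themselves.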

  • 2 Assignments