SPEAKER RECOGNITION IN MULTIMEDIA SYSTEM

US 20170372706A1
Filed: 02/10/2016
Published: 12/28/2017
Est. Priority Date: 02/11/2015
Status: Active Grant



Abstract

A method for identifying a user among a plurality of users of a multimedia system, comprising extracting an i-vector for a speech utterance using total variability modeling, comparing the extracted i-vector with a collection of i-vector sets in order to identify a target set most similar to the extracted i-vector, and granting access to the multimedia system in accordance with an access profile associated with the identified target set. Further, source variation is minimized by, for each speech utterance acquired using a specific data source, re-centering first-order statistics of the speech utterance around the mean of an informative prior associated with the source, and using the co-variance of the informative prior associated with the source when extracting the i-vector for the speech utterance.
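
The comparison and access-granting steps summarized above can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and scores each enrolled set by its best-matching member; the abstract does not fix either choice. The names `identify` and `enrolled`, the user keys, and the profile strings are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(extracted, enrolled):
    """Return the key of the enrolled i-vector set most similar to `extracted`.

    `enrolled` maps a user key to (list_of_ivectors, access_profile).
    Assumption: a set's score is the highest cosine similarity of its members.
    """
    def set_score(user):
        ivectors, _profile = enrolled[user]
        return max(cosine_similarity(extracted, v) for v in ivectors)
    return max(enrolled, key=set_score)

# Hypothetical toy data: 3-dimensional vectors stand in for real i-vectors,
# which typically have hundreds of dimensions.
enrolled = {
    "alice": ([[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]], "adult"),
    "bob": ([[0.0, 1.0, 0.2]], "child"),
}
target = identify([0.8, 0.3, 0.0], enrolled)
granted_profile = enrolled[target][1]  # access follows the matched set's profile
```

Scoring a set by its best member is only one option; averaging the set into a single reference vector before comparison would be an equally plausible reading.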


15 Claims

1. A method for identifying a user among a plurality of users of a multimedia system including one or more devices for providing multimedia content from one or more sources of digital information, in order to provide individually adjusted access and control of multimedia content from the multimedia system, the method comprising the steps of:
- providing a collection of i-vector sets, each i-vector set including i-vectors based on one or more words spoken by a user of the multimedia system and being associated with an access profile of this user,
- acquiring a speech utterance from a current user, and extracting an i-vector for the speech utterance using total variability modeling,
- comparing the extracted i-vector with each i-vector set in the collection, in order to identify a target set most similar to the extracted i-vector,
- granting, to the current user, access to the multimedia system in accordance with the access profile associated with the identified target set,
- wherein the speech utterance is acquired using one of a plurality of sources, and wherein the method further comprises minimizing source variation in the total variability modeling by:
  - for each data source, estimating a source-specific informative prior, which is defined by a mean and a covariance, and
  - for each speech utterance acquired using a specific data source, re-centering first-order statistics of the speech utterance around the mean of the informative prior associated with the source, and using the co-variance of the informative prior associated with the source when extracting the i-vector for the speech utterance.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15):
  - 2. The method according to claim 1, wherein estimating a source-specific informative prior includes:
    - extracting a source specific set of i-vectors from data acquired from the data source, and
    - using the source specific set of i-vectors to estimate the source-specific informative prior.
  - 3. The method according to claim 2, wherein extracting a source specific set of i-vectors is done using a pre-trained total variability matrix and a non-informative prior.
  - 4. The method according to claim 2, wherein extracting a source specific set of i-vectors is done using an informative total variability matrix and a non-informative prior, and wherein the informative total variability matrix is computed by:
    - performing a plurality of training iterations, e.g. expectation maximization training iterations, each iteration including computing a preliminary source-specific informative prior and updating the informative total variability matrix using the preliminary source-specific informative prior.
  - 5. The method according to claim 1, further comprising storing the collection of i-vector sets and associated access profiles in a remote database and making them accessible to more than one multimedia system.
  - 6. The method according to claim 5, further comprising storing content consumption patterns of each user and providing the current user with recommendations based on choices of other users with similar choices as the current user.
  - 7. The method according to claim 1, further comprising:
    - providing a collection of i-vector classes, each i-vector class including a set of i-vectors based on speech from users having similar characteristics, and
    - comparing the extracted i-vector with each i-vector class to identify an i-vector class most similar to the extracted i-vector.
  - 8. The method according to claim 7, wherein the characteristics include at least one of age, gender, and mood.
  - 9. The method according to claim 1, further including identifying and registering a new user only if an i-vector extracted from a speech utterance of the new user is sufficiently different from all previously stored i-vectors according to a predefined condition.
  - 10. The method according to claim 9, wherein the condition is based on a cosine distance between the extracted i-vector and all previously stored i-vectors.
  - 11. The method according to claim 1, wherein the collection of i-vector sets includes a first i-vector set based on one or more words spoken by a first user and associated with a first access profile, and a second i-vector set based on one or more words spoken by a second user and associated with a second access profile, and further comprising:
    - allocating a first user identification to the first user;
    - allocating a second user identification to the second user;
    - identifying the first user as the current user;
    - receiving input from the first user indicating the second user identification; and
    - granting the first user access in accordance with the second access profile.
  - 12. The method according to claim 11, wherein each access profile defines user dependent access rights.
  - 13. The method according to claim 11, wherein each user identification is allocated to a function key, such as a button on a physical device or a graphical image/icon on a virtual device.
  - 15. The method according to claim 13, wherein said database is remote to said multimedia system, and shared by several multimedia systems.
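
The registration condition of claims 9 and 10 can be sketched as a distance test against every stored i-vector. This is a minimal illustration, not the patented implementation: the function name `should_register` and the threshold `min_distance=0.5` are hypothetical, and the claims do not state what threshold or exact form the predefined condition takes.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def should_register(new_ivector, stored_ivectors, min_distance=0.5):
    """True if `new_ivector` is at least `min_distance` away from every stored one.

    Assumption: "sufficiently different" is modeled as a fixed lower bound on
    cosine distance; `min_distance` is a hypothetical tuning parameter.
    """
    return all(cosine_distance(new_ivector, v) >= min_distance
               for v in stored_ivectors)

# Hypothetical 2-dimensional vectors stand in for real i-vectors.
stored = [[1.0, 0.0], [0.9, 0.1]]
known_voice = should_register([1.0, 0.05], stored)  # close to a stored vector
new_voice = should_register([0.0, 1.0], stored)     # far from all stored vectors
```

With an empty store, `all` over no elements returns `True`, so the first speaker is always registered under this sketch.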

14. A multimedia system comprising:
- one or more sources of digital information,
- one or more devices for providing multimedia content from the sources,
- a database storing a collection of i-vector sets, each i-vector set including i-vectors based on one or more words spoken by a user of the multimedia system and being associated with an access profile of this user,
- a plurality of speech recording data sources,
- processing circuitry configured to:
  - extract an i-vector for a speech utterance acquired from one of said data sources using total variability modeling, while minimizing source variation by:
    - for each data source, estimating a source-specific informative prior, which is defined by a mean and a covariance, and
    - for each speech utterance acquired using a specific data source, re-centering first-order statistics of the speech utterance around the mean of the informative prior associated with the source, and using the co-variance of the informative prior associated with the source when extracting the i-vector for the speech utterance,
  - compare the extracted i-vector with each i-vector set in the collection, in order to identify a target set most similar to the extracted i-vector, and
  - grant, to the current user, access to the multimedia system in accordance with the access profile associated with the identified target set.
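
The source-normalization steps recited in claims 1, 2 and 14 can be sketched numerically. The sketch below is an illustration under strong simplifying assumptions, not the patented implementation: the i-vector is one-dimensional (so the total variability matrix reduces to a single column `T` and the prior covariance to a single variance), the background-model covariances are diagonal, and every name (`estimate_prior`, `extract_ivector`, `N`, `F`, `m`, `sigma`) is hypothetical. It applies the standard Gaussian posterior-mean formula with the source prior substituted for the usual standard-normal prior.

```python
def estimate_prior(source_ivectors):
    """Mean and variance of scalar i-vectors extracted from one data source."""
    n = len(source_ivectors)
    mean = sum(source_ivectors) / n
    var = sum((w - mean) ** 2 for w in source_ivectors) / n
    return mean, var

def extract_ivector(N, F, m, T, sigma, prior_mean=0.0, prior_var=1.0):
    """Posterior-mean i-vector (scalar) for one utterance.

    N[d]: zero-order (occupancy) statistic for supervector dimension d
    F[d]: first-order statistic (occupancy-weighted sum of frames)
    m[d], sigma[d]: background-model mean and variance
    T[d]: entry d of the single total variability column
    The defaults correspond to a non-informative (standard-normal) prior.
    """
    dims = range(len(F))
    # Re-center the first-order statistics around the mean of the source prior.
    F_centered = [F[d] - N[d] * (m[d] + T[d] * prior_mean) for d in dims]
    # Use the variance of the source prior in the posterior precision.
    precision = 1.0 / prior_var + sum(N[d] * T[d] ** 2 / sigma[d] for d in dims)
    projection = sum(T[d] * F_centered[d] / sigma[d] for d in dims)
    return projection / precision

# Hypothetical toy model with two supervector dimensions.
m, T, sigma, N = [0.0, 0.0], [1.0, 2.0], [1.0, 1.0], [10.0, 10.0]
# Statistics of an utterance shifted only by a source offset of 0.5.
F = [N[d] * (m[d] + T[d] * 0.5) for d in range(2)]

w_plain = extract_ivector(N, F, m, T, sigma)            # non-informative prior
prior_mean, prior_var = estimate_prior([0.4, 0.6])      # source-specific prior
w_normalized = extract_ivector(N, F, m, T, sigma, prior_mean, prior_var)
```

In this toy case the non-informative extraction carries the source offset into the i-vector, while the source-specific prior removes it, which is the effect the re-centering step is meant to achieve.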


Current Assignee
Bang & Olufsen a/s
Original Assignee
Bang & Olufsen a/s
Inventors
Shepstone, Sven Ewan; Borup Jensen, Søren

Granted Patent

US 10,354,657 B2
CPC Class Codes

G10L 17/00   Speaker identification or v...

G10L 17/02   Preprocessing operations, e...

G10L 17/04   Training, enrolment or mode...

G10L 17/06   Decision making techniques;...

G10L 17/22   Interactive procedures; Man...
