Detection of target and non-target users using multi-session information

  • US 9,837,080 B2
  • Filed: 08/21/2014
  • Issued: 12/05/2017
  • Est. Priority Date: 08/21/2014
  • Status: Active Grant
First Claim

1. A method for maintaining speaker recognition performance, comprising:

    training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions;

    receiving a voice signal of a speaker seeking access to an environment via at least one network;

    extracting one or more speech statistics of the voice signal for determining a speaker recognition score of the speaker seeking access;

    using the plurality of models to conclude whether the speaker seeking access is a non-ideal target speaker that is authorized to access the environment, but provides a voice signal which yields a speaker recognition score that results in a failure to recognize the non-ideal target speaker as being authorized to access the environment, and prevents access to the environment, or a non-ideal non-target speaker that is not authorized to access the environment, but provides a voice signal which yields a speaker recognition score that results in a misidentification of the non-ideal non-target speaker as being authorized to access the environment, and allows access to the environment, wherein using the plurality of models to conclude comprises:

    calculating a first probability that the speaker seeking access is the non-ideal target speaker;

    calculating a second probability that the speaker seeking access is the non-ideal non-target speaker; and

    determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold; and

    restricting the speaker seeking access from accessing the environment upon determining that the first probability, second probability or the sum of the first probability and the second probability is above the probability threshold;

    wherein the plurality of speakers comprise known non-ideal target speakers and known non-ideal non-target speakers;

    wherein the known non-ideal target speakers comprise authorized speakers each having a right to access the environment and yielding respective first speaker recognition scores within a predetermined value below a speaker recognition threshold that prevent access to the environment;

    wherein the known non-ideal non-target speakers comprise unauthorized speakers each not having a right to access the environment and yielding respective second speaker recognition scores within a predetermined value above the speaker recognition threshold that allow access to the environment;

    wherein the plurality of speakers further comprise ideal target speakers and ideal non-target speakers;

    wherein the ideal target speakers comprise authorized speakers each having a right to access the environment and yielding respective third speaker recognition scores greater than the predetermined value above the speaker recognition threshold that allow access to the environment;

    wherein the ideal non-target speakers comprise unauthorized speakers each not having a right to access the environment and yielding respective fourth speaker recognition scores less than the predetermined value below the speaker recognition threshold that prevent access to the environment; and

    wherein the training, receiving, extracting, using and determining steps are performed by a computer system comprising a memory and at least one processor coupled to the memory.
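
The four speaker categories and the restriction test recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `SpeakerClass`, `label_training_speaker`, `should_restrict`, `threshold` and `margin` are hypothetical. The sketch assumes that a score equal to the speaker recognition threshold allows access, that "within a predetermined value" includes the boundary, and that the restriction applies when any one of the three quantities exceeds the probability threshold. How the two probabilities are computed from the trained models is left outside the sketch.

```python
from enum import Enum
from typing import Optional


class SpeakerClass(Enum):
    """The four training categories named in the claim (hypothetical type)."""
    IDEAL_TARGET = "ideal target"
    NON_IDEAL_TARGET = "non-ideal target"
    IDEAL_NON_TARGET = "ideal non-target"
    NON_IDEAL_NON_TARGET = "non-ideal non-target"


def label_training_speaker(score: float, authorized: bool,
                           threshold: float, margin: float) -> Optional[SpeakerClass]:
    """Assign a known training speaker to one of the claimed categories.

    `threshold` stands for the speaker recognition threshold and `margin`
    for the "predetermined value" around it. Combinations the claim does
    not define (for example an authorized speaker scoring just above the
    threshold) return None.
    """
    if authorized:
        if score > threshold + margin:
            return SpeakerClass.IDEAL_TARGET          # comfortably accepted
        if threshold - margin <= score < threshold:
            return SpeakerClass.NON_IDEAL_TARGET      # narrowly rejected
    else:
        if score < threshold - margin:
            return SpeakerClass.IDEAL_NON_TARGET      # comfortably rejected
        if threshold <= score <= threshold + margin:
            return SpeakerClass.NON_IDEAL_NON_TARGET  # narrowly accepted
    return None


def should_restrict(p_non_ideal_target: float,
                    p_non_ideal_non_target: float,
                    probability_threshold: float) -> bool:
    """Return True if the first probability, the second probability, or
    their sum is above the probability threshold (assumed reading: any one
    of the three suffices)."""
    total = p_non_ideal_target + p_non_ideal_non_target
    candidates = (p_non_ideal_target, p_non_ideal_non_target, total)
    return any(value > probability_threshold for value in candidates)
```

For example, with a hypothetical threshold of 50 and margin of 10, an authorized speaker scoring 45 is labeled a non-ideal target, and probabilities of 0.3 and 0.3 against a probability threshold of 0.5 lead to restriction because their sum exceeds it.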
