
Securely executing voice actions with speaker identification and authentication input types

  • US 10,127,926 B2
  • Filed: 06/10/2016
  • Issued: 11/13/2018
  • Est. Priority Date: 06/10/2016
  • Status: Active Grant
First Claim

1. A method performed by a voice action server, the method comprising:

    receiving, by the voice action server, (i) audio data representing a voice command spoken by a speaker and (ii) contextual data from a client device of the speaker, the contextual data indicating a status of the client device and comprising data values representing contextual signals that can authenticate the speaker without requiring the speaker to provide explicit authentication information;

    identifying, by the voice action server, the speaker based on the audio data representing the voice command;

    selecting, by the voice action server, a voice action based at least on a transcription of the audio data;

    selecting, by the voice action server, a third-party service provider from among a plurality of different third-party service providers, wherein the third-party service provider is selected by obtaining a mapping of voice actions to the plurality of third-party service providers, the mapping indicating that the selected third-party service provider can perform the selected voice action, wherein the selected third-party service provider is configured to perform multiple voice actions, and wherein the selected third-party service provider requires different combinations of input data to perform authentication for at least some of the multiple voice actions;

    identifying, by the voice action server, one or more input authentication data types that the selected third-party service provider uses to perform authentication for the selected voice action, wherein the identified one or more input authentication data types for the selected action are different from one or more input authentication data types that the selected third-party service provider uses to perform authentication for at least one other voice action;

    obtaining, by the voice action server without requiring the speaker to provide explicit authentication information, one or more authentication data values representing contextual signals from the received contextual data that correspond to the identified one or more input authentication data types; and

    providing, to the third-party service provider by the voice action server over a network, (i) a request to perform the selected voice action and (ii) a speaker identification result determined based on the audio data representing the voice command, and (iii) the obtained one or more authentication data values from the received contextual data, wherein the speaker identification result and the one or more obtained authentication data values enable the selected third-party service provider to authenticate the speaker and perform the selected voice action.
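
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the provider names, action names, contextual signal names, the keyword-based action selection, and the function `build_provider_request` are all hypothetical and chosen only for illustration. Transcription and speaker identification are assumed to have been performed already on the audio data, and the network call to the provider is omitted.

```python
from dataclasses import dataclass

# Hypothetical mapping of voice actions to the provider that can perform them.
ACTION_TO_PROVIDER = {
    "check_balance": "example_bank",
    "transfer_funds": "example_bank",
    "play_music": "example_music",
}

# Hypothetical authentication input data types per (provider, action).
# The same provider uses different combinations for different actions.
REQUIRED_AUTH_TYPES = {
    ("example_bank", "check_balance"): ["device_locked"],
    ("example_bank", "transfer_funds"): ["device_locked", "location", "paired_wearable"],
    ("example_music", "play_music"): [],
}


@dataclass
class ProviderRequest:
    """What the server would send to the selected provider."""
    provider: str
    action: str
    speaker_id_result: dict
    auth_values: dict


def select_action(transcription: str) -> str:
    """Toy keyword match standing in for real voice-action selection."""
    text = transcription.lower()
    if "transfer" in text:
        return "transfer_funds"
    if "balance" in text:
        return "check_balance"
    if "play" in text:
        return "play_music"
    raise ValueError("no voice action matches the transcription")


def build_provider_request(transcription: str,
                           speaker_id_result: dict,
                           contextual_data: dict) -> ProviderRequest:
    # Select the voice action from the transcription.
    action = select_action(transcription)
    # Select the provider via the action-to-provider mapping.
    provider = ACTION_TO_PROVIDER[action]
    # Identify the input authentication data types for this action.
    needed = REQUIRED_AUTH_TYPES[(provider, action)]
    # Pull matching values from the contextual data; the speaker is never
    # prompted. Skipping absent signals is an assumption of this sketch;
    # the claim does not say how missing signals are handled.
    auth_values = {t: contextual_data[t] for t in needed if t in contextual_data}
    return ProviderRequest(provider, action, speaker_id_result, auth_values)
```

For example, calling `build_provider_request("Transfer fifty dollars to Bob", {"speaker": "alice", "confidence": 0.93}, contextual_data)` with contextual data that contains `device_locked`, `location`, `paired_wearable`, and unrelated signals would forward only the three signals that the hypothetical bank requires for transfers, while a balance query to the same provider would forward only `device_locked`.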

  • 2 Assignments