Developer voice actions system

US 9,922,648 B2
Filed: 03/01/2016
Issued: 03/20/2018
Est. Priority Date: 03/01/2016
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Methods, systems, and apparatus for receiving, by a voice action system, data specifying a new voice action for an application different from the voice action system. A voice action intent for the application is generated based at least on the data, wherein the voice action intent comprises data that, when received by the application, requests that the application perform one or more operations specified for the new voice action. The voice action intent is associated with trigger terms specified for the new voice action. The voice action system is configured to receive an indication of a user utterance obtained by a device having the application installed, and determines that a transcription of the user utterance corresponds to the trigger terms associated with the voice action intent. In response to the determination, the voice action system provides the voice action intent to the device.
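
The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `VoiceActionIntent`, `VoiceActionSystem`, `register`, `handle_transcription`, and `normalize` are hypothetical, and "corresponds to the trigger terms" is assumed here to mean an exact match after lowercasing and whitespace normalization.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trigger comparison is forgiving."""
    return " ".join(text.lower().split())


@dataclass(frozen=True)
class VoiceActionIntent:
    """Passive data delivered to the application (hypothetical shape)."""
    app_id: str
    operations: Tuple[str, ...]


@dataclass
class VoiceActionSystem:
    # Maps each normalized trigger phrase to the intent it should deliver.
    _by_trigger: Dict[str, VoiceActionIntent] = field(default_factory=dict)

    def register(self, app_id: str, operations: Iterable[str],
                 trigger_terms: Iterable[str]) -> VoiceActionIntent:
        """Generate an intent from developer data and associate its trigger terms."""
        intent = VoiceActionIntent(app_id, tuple(operations))
        for term in trigger_terms:
            self._by_trigger[normalize(term)] = intent
        return intent

    def handle_transcription(self, transcription: str) -> Optional[VoiceActionIntent]:
        """Return the intent to send to the device, or None if nothing matches.

        Assumption: exact match after normalization stands in for "corresponds to".
        """
        return self._by_trigger.get(normalize(transcription))
```

A developer would call `register` once per new voice action; the system would then call `handle_transcription` for each utterance and forward any returned intent to the device that has the application installed.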

82 Citations

21 Claims

1. A computer-implemented method comprising:
- receiving, by a voice action system, data defining a new voice action that does not currently exist for a software application installed on one or more devices, the software application being different from said voice action system, the data indicating one or more operations for the software application to perform the new voice action and one or more trigger terms for triggering the new voice action, wherein the data defining the new voice action specifies a context, the context specifying a status of a user device or of the software application installed on the user device;
  
  generating, by the voice action system, a voice action passive data structure based at least on the data defining the new voice action, wherein the voice action passive data structure comprises data that, when received by the software application, causes the software application to perform the one or more operations to perform the new voice action;
  
  associating, by the voice action system, the voice action passive data structure with the context and with the one or more trigger terms for triggering the new voice action, wherein multiple voice action passive data structures are defined in the voice action system;
  
  receiving, by the voice action system, (i) user command utterance obtained by the user device, the user device having the software application installed, and (ii) current context information regarding the user device;
  
  identifying, using the current context information and not a transcription of the user command utterance, a set of candidate voice action passive data structures from the multiple voice action passive data structures of the voice action system, the set of candidate voice action passive data structures including the voice action passive data structure defined by the data and being identified based on respective contexts associated with the set of candidate voice action passive data structures;
  
  narrowing the identified set of candidate voice action passive data structures by comparing the transcription of the user command utterance with trigger terms of respective ones of the set of candidate voice action passive data structures;
  
  determining, by the voice action system, that the transcription of the user command utterance corresponds to the one or more trigger terms associated with the voice action passive data structure; and
  
  in response to the determination, providing, by the voice action system, the voice action passive data structure to the user device which is remote from the voice action system, thereby causing the software application installed on the user device to perform the one or more operations to perform the new voice action.
- Dependent claims (2 through 19):
  - 2. The computer-implemented method of claim 1, wherein the new voice action is a voice-enabled command that the software application is not programmed to support.
  - 3. The computer-implemented method of claim 1, wherein receiving the data defining the new voice action comprises receiving the data from a developer who published the software application.
  - 4. The computer-implemented method of claim 1, wherein the voice action system does not receive the data defining the new voice action from the software application installed on the user device.
  - 5. The computer-implemented method of claim 1, wherein the context specifies that a specific activity that the software application is performing is in a particular activity state.
  - 6. The computer-implemented method of claim 1, comprising:
    - determining, by the voice action system, that the context information satisfies the context; and
      
      wherein in response to determining that the transcription of the user command utterance corresponds to the one or more trigger terms associated with the voice action passive data structure and that the context information satisfies the context, the voice action system provides the voice action passive data structure to the user device.
  - 7. The computer-implemented method of claim 6, wherein receiving the current context information indicating the status of the user device or of the software application installed on the user device comprises:
    - providing, by the voice action system to the user device, a request for particular context information; and
      
      receiving, by the voice action system, the particular current context information in response to the request.
  - 8. The computer-implemented method of claim 6, comprising:
    - determining, by the voice action system, that the current context information satisfies a context for a second voice action, and that the transcription of the user command utterance corresponds to one or more trigger terms associated with a voice action passive data structure for the second voice action, wherein the voice action passive data structure for the second voice action comprises data that, when received by a software application associated with the second voice action, causes the software application associated with the second voice action to perform one or more operations to perform the second voice action;
      
      in response to the determination, selecting, by the voice action system, a voice action from among the new voice action and the second voice action; and
      
      providing, by the voice action system, the voice action passive data structure associated with the selected voice action to the user device, thereby causing the software application installed on the user device to perform the one or more operations to perform the selected voice action.
  - 9. The computer-implemented method of claim 8, wherein selecting the selected voice action from among the new voice action and the second voice action comprises selecting the selected voice action in response to receiving data indicating a user selection of one of the new voice action or the second voice action.
  - 10. The computer-implemented method of claim 8, wherein selecting the selected voice action from among the new voice action and the second voice action comprises:
    - assigning a score to each of the new voice action and the second voice action; and
      
      selecting the selected voice action based at least on the score assigned to each of the new voice action and the second voice action.
  - 11. The computer-implemented method of claim 8, wherein selecting the selected voice action from among the new voice action and the second voice action comprises selecting the selected voice action in response to determining that the software application associated with the selected voice action is operating in the foreground.
  - 12. The computer-implemented method of claim 1, wherein generating the voice action passive data structure comprises determining that the one or more operations to perform the new voice action are capable of being performed by the software application.
  - 13. The computer-implemented method of claim 1, comprising:
    - determining, by the voice action system, that the transcription of the user command utterance is similar to the one or more trigger terms associated with the voice action passive data structure;
      
      in response to the determination, providing, by the voice action system to the user device, data indicating a request for user input that confirms whether the user command utterance corresponds to the one or more trigger terms associated with the voice action passive data structure or was intended to cause the software application to perform the new voice action;
      
      in response to the request, receiving, by the voice action system and from the user device, data indicating a confirmation; and
      
      in response to receiving the data indicating the confirmation, providing, by the voice action system, the voice action passive data structure to the user device, thereby causing the software application installed on the user device to perform the one or more operations to perform the new voice action.
  - 14. The computer-implemented method of claim 1, comprising:
    - receiving, by the voice action system, a request to deploy the new voice action; and
      
      deploying, by the voice action system, the new voice action in response to the request, wherein deploying the new voice action enables triggering of the new voice action.
  - 15. The computer-implemented method of claim 1, comprising:
    - receiving, by the voice action system, a request to rescind deployment of the new voice action; and
      
      rescinding, by the voice action system, deployment of the new voice action in response to the request, wherein rescinding deployment of the new voice action disables triggering of the new voice action.
  - 16. The computer-implemented method of claim 1, comprising:
    - receiving, by the voice action system, a request to enable testing of the new voice action, wherein the request specifies one or more devices for which the new voice action should be enabled; and
      
      enabling, by the voice action system, triggering of the new voice action for the one or more specified devices in response to the request, wherein triggering of the new voice action is disabled for devices that are not included in the specified devices.
  - 17. The method of claim 1, further comprising determining, based on the data, whether the new voice action is valid for the software application, and based on the determination that the new voice action is valid for the software application, inducting the new voice action to generate the voice action passive data structure.
  - 18. The method of claim 1, wherein the multiple voice action passive data structures include built-in voice actions that were submitted by at least a first application developer when the software application was built, and other voice actions that were submitted by at least a second application developer after the software application was built.
  - 19. The method of claim 18, wherein the multiple voice action passive data structures further include application-specific voice actions that are supported by default by an operating system on one or more of the devices.
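
The two-stage selection recited in claim 1 (candidates identified from context alone, then narrowed by trigger terms), together with the selection step of claims 8, 10, and 11, might be sketched in Python as follows. All names, the dictionary-based context, and the keys `foreground_app` and `media_state` are assumptions made for illustration. The scoring rule is also an assumption: a candidate whose application is in the foreground scores 1, any other candidate scores 0.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RegisteredAction:
    name: str
    app_id: str
    trigger_terms: Tuple[str, ...]    # assumed stored lowercase, single-spaced
    required_context: Dict[str, str]  # hypothetical keys such as "foreground_app"


def context_matches(required: Dict[str, str], current: Dict[str, str]) -> bool:
    """True when every key the developer specified has the same value right now."""
    return all(current.get(key) == value for key, value in required.items())


def candidates_by_context(actions: List[RegisteredAction],
                          current_context: Dict[str, str]) -> List[RegisteredAction]:
    # Stage 1: context only. The transcription is deliberately not a parameter.
    return [a for a in actions if context_matches(a.required_context, current_context)]


def narrow_by_trigger(candidates: List[RegisteredAction],
                      transcription: str) -> List[RegisteredAction]:
    # Stage 2: compare the transcription with each candidate's trigger terms.
    spoken = " ".join(transcription.lower().split())
    return [a for a in candidates if spoken in a.trigger_terms]


def select_action(actions: List[RegisteredAction],
                  current_context: Dict[str, str],
                  transcription: str) -> Optional[RegisteredAction]:
    matches = narrow_by_trigger(candidates_by_context(actions, current_context),
                                transcription)
    if not matches:
        return None
    foreground = current_context.get("foreground_app")
    # Assumed score: 1 if the action's app is in the foreground, else 0.
    # max() returns the first of several equally scored candidates.
    return max(matches, key=lambda a: int(a.app_id == foreground))
```

Filtering by context first keeps the trigger comparison limited to actions that could apply in the device's current state, which is the ordering the claim requires.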

20. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by a voice action system, data defining a new voice action that does not currently exist for a software application installed on one or more devices, the software application being different from said voice action system, the data indicating one or more operations for the software application to perform the new voice action and one or more trigger terms for triggering the new voice action, wherein the data defining the new voice action specifies a context, the context specifying a status of a user device or of the software application installed on the user device;
  
  generating, by the voice action system, a voice action passive data structure based at least on the data defining the new voice action, wherein the voice action passive data structure comprises data that, when received by the software application, causes the software application to perform the one or more operations to perform the new voice action;
  
  associating, by the voice action system, the voice action passive data structure with the context and with the one or more trigger terms for triggering the new voice action, wherein multiple voice action passive data structures are defined in the voice action system;
  
  receiving, by the voice action system, (i) user command utterance obtained by the user device, the user device having the software application installed, and (ii) current context information regarding the user device;
  
  identifying, using the current context information and not a transcription of the user command utterance, a set of candidate voice action passive data structures from the multiple voice action passive data structures of the voice action system, the set of candidate voice action passive data structures including the voice action passive data structure defined by the data and being identified based on respective contexts associated with the set of candidate voice action passive data structures;
  
  narrowing the identified set of candidate voice action passive data structures by comparing the transcription of the user command utterance with trigger terms of respective ones of the set of candidate voice action passive data structures;
  
  determining, by the voice action system, that the transcription of the user command utterance corresponds to the one or more trigger terms associated with the voice action passive data structure; and
  
  in response to the determination, providing, by the voice action system, the voice action passive data structure to the user device which is remote from the voice action system, thereby causing the software application installed on the user device to perform the one or more operations to perform the new voice action.

21. A non-transitory computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a voice action system, data defining a new voice action that does not currently exist for a software application installed on one or more devices, the software application being different from said voice action system, the data indicating one or more operations for the software application to perform the new voice action and one or more trigger terms for triggering the new voice action, wherein the data defining the new voice action specifies a context, the context specifying a status of a user device or of the software application installed on the user device;
  
  generating, by the voice action system, a voice action passive data structure based at least on the data defining the new voice action, wherein the voice action passive data structure comprises data that, when received by the software application, causes the software application to perform the one or more operations to perform the new voice action;
  
  associating, by the voice action system, the voice action passive data structure with the context and with the one or more trigger terms for triggering the new voice action, wherein multiple voice action passive data structures are defined in the voice action system;
  
  receiving, by the voice action system, (i) user command utterance obtained by the user device, the user device having the software application installed, and (ii) current context information regarding the user device;
  
  identifying, using the current context information and not a transcription of the user command utterance, a set of candidate voice action passive data structures from the multiple voice action passive data structures of the voice action system, the set of candidate voice action passive data structures including the voice action passive data structure defined by the data and being identified based on respective contexts associated with the set of candidate voice action passive data structures;
  
  narrowing the identified set of candidate voice action passive data structures by comparing the transcription of the user command utterance with trigger terms of respective ones of the set of candidate voice action passive data structures;
  
  determining, by the voice action system, that the transcription of the user command utterance corresponds to the one or more trigger terms associated with the voice action passive data structure; and
  
  in response to the determination, providing, by the voice action system, the voice action passive data structure to the user device which is remote from the voice action system, thereby causing the software application installed on the user device to perform the one or more operations to perform the new voice action.
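
Claims 14 through 16 above describe deploying a voice action, rescinding its deployment, and enabling it for testing on specified devices only. One way to model that gating is sketched below in Python. The class `DeploymentRegistry`, the `DeploymentState` values, and the string device identifiers are hypothetical and are not taken from the patent.

```python
from enum import Enum
from typing import Dict, Iterable, Set


class DeploymentState(Enum):
    NOT_DEPLOYED = "not_deployed"
    TESTING = "testing"
    DEPLOYED = "deployed"


class DeploymentRegistry:
    """Tracks whether each voice action may be triggered on a given device."""

    def __init__(self) -> None:
        self._state: Dict[str, DeploymentState] = {}
        self._test_devices: Dict[str, Set[str]] = {}

    def deploy(self, action: str) -> None:
        # Claim 14: deployment enables triggering.
        self._state[action] = DeploymentState.DEPLOYED
        self._test_devices.pop(action, None)

    def rescind(self, action: str) -> None:
        # Claim 15: rescinding deployment disables triggering.
        self._state[action] = DeploymentState.NOT_DEPLOYED
        self._test_devices.pop(action, None)

    def enable_testing(self, action: str, device_ids: Iterable[str]) -> None:
        # Claim 16: triggering is enabled only for the specified devices.
        self._state[action] = DeploymentState.TESTING
        self._test_devices[action] = set(device_ids)

    def is_triggerable(self, action: str, device_id: str) -> bool:
        state = self._state.get(action, DeploymentState.NOT_DEPLOYED)
        if state is DeploymentState.DEPLOYED:
            return True
        if state is DeploymentState.TESTING:
            return device_id in self._test_devices.get(action, set())
        return False
```

In this sketch a voice action system would consult `is_triggerable` before providing the passive data structure to a device, so an action under test never reaches devices outside the developer's list.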

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Wang, Bo; Vemuri, Sunil; James, Barnaby John; Huffman, Scott B.; Gupta, Pravir Kumar
Primary Examiner(s): Ortiz Sanchez, Michael
Application Number: US15/057,453
Publication Number: US 20170256256A1
Time in Patent Office: 749 Days
CPC Class Codes

G10L 15/1822   Parsing for meaning underst...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 15/30   Distributed recognition, e....

G10L 2015/223   Execution procedure of a sp...

G10L 2015/227   of the speaker; Human-fact...

G10L 2015/228   of application context
