Training an at least partial voice command system
First Claim
1. A computer-implemented method for training a digital assistant, performed at an electronic device including one or more processors and memory storing instructions for execution by the one or more processors, the method comprising:
detecting an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user, wherein the at least one speech input includes a plurality of words;
in response to detecting the impasse, establishing a learning session associated with the at least one speech input;
during the learning session:
receiving one or more subsequent clarification inputs from the user;
based at least in part on the one or more subsequent clarification inputs, adjusting at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
associating the satisfactory response with the entirety of the at least one speech input for processing future occurrences of the at least one speech input, wherein the associating comprises replacing a respective initial response shared by a set of users in a community of users with the satisfactory response to the at least one speech input from the user.
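The steps recited in the claim can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the patented implementation: the `Assistant` class, the `handle` function, the `accepted` callback used to signal an impasse, and the rule that the last clarification becomes the task are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Assistant:
    """Toy digital assistant; all names are hypothetical, not from the patent."""

    # Responses shared by a community of users, keyed by the entire utterance.
    shared_responses: Dict[str, str] = field(default_factory=dict)

    def respond(self, speech_input: str) -> Optional[str]:
        """Return the currently shared response, or None if there is none."""
        return self.shared_responses.get(speech_input)

    def learn(self, speech_input: str, clarifications: List[str]) -> str:
        """Run a learning session for an utterance that caused an impasse."""
        # Stand-in for adjusting intent inference or task execution: the last
        # clarification is treated as the literal task the user wanted.
        satisfactory = "run:" + clarifications[-1]
        # Associate the result with the whole utterance, replacing the initial
        # shared response so future occurrences use it directly.
        self.shared_responses[speech_input] = satisfactory
        return satisfactory


def handle(
    assistant: Assistant,
    speech_input: str,
    accepted: Callable[[str], bool],
    clarifications: List[str],
) -> str:
    """Answer an utterance, opening a learning session on an impasse."""
    response = assistant.respond(speech_input)
    # Assumption: an impasse is a missing response or one the user rejects.
    if response is None or not accepted(response):
        response = assistant.learn(speech_input, clarifications)
    return response


assistant = Assistant({"call my better half": "search:better half"})
result = handle(
    assistant,
    "call my better half",
    accepted=lambda r: r.startswith("run:"),
    clarifications=["I mean my wife", "call Anna"],
)
```

In this sketch the learned response is keyed by the full utterance rather than by individual words, which mirrors the claim language about associating the response with "the entirety" of the speech input.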
Abstract
An electronic device with one or more processors and memory includes a procedure for training a digital assistant. In some embodiments, the device detects an impasse in a dialog between the digital assistant and a user including a speech input. During a learning session, the device utilizes a subsequent clarification input from the user to adjust intent inference or task execution associated with the speech input to produce a satisfactory response. In some embodiments, the device identifies a pattern of success or failure associated with an aspect previously used to complete a task and generates a hypothesis regarding a parameter used in speech recognition, intent inference or task execution as a cause for the pattern. Then, the device tests the hypothesis by altering the parameter for a subsequent completion of the task and adopts or rejects the hypothesis based on feedback information collected from the subsequent completion.
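The hypothesis-testing loop described in the abstract might look roughly like the following. It is a minimal sketch under assumptions: the history format (pairs of aspect name and success flag), the `min_failures` threshold, and the function names `failing_aspects` and `try_hypothesis` are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def failing_aspects(
    history: List[Tuple[str, bool]], min_failures: int = 2
) -> List[str]:
    """Return aspects whose past task completions failed repeatedly."""
    failures = Counter(aspect for aspect, succeeded in history if not succeeded)
    return [aspect for aspect, count in failures.items() if count >= min_failures]


def try_hypothesis(
    params: Dict[str, float],
    name: str,
    new_value: float,
    run_task: Callable[[Dict[str, float]], bool],
) -> Dict[str, float]:
    """Test the hypothesis that parameter `name` causes the failure pattern.

    The task is completed once with the parameter altered. The change is
    adopted if the feedback is positive and rejected otherwise.
    """
    trial = {**params, name: new_value}
    return trial if run_task(trial) else params


# Hypothetical history: contact lookup failed twice, calendar succeeded once.
history = [("contact_lookup", False), ("contact_lookup", False), ("calendar", True)]
suspects = failing_aspects(history)

# Hypothetical feedback rule: the task succeeds when the threshold is below 0.8.
params = {"asr_threshold": 0.9}
adopted = try_hypothesis(params, "asr_threshold", 0.7, lambda p: p["asr_threshold"] < 0.8)
rejected = try_hypothesis(params, "asr_threshold", 0.85, lambda p: p["asr_threshold"] < 0.8)
```

A single trial run is used here for brevity. A real system would presumably collect feedback over several completions before adopting or rejecting a change.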
22 Claims
1. A computer-implemented method for training a digital assistant, performed at an electronic device including one or more processors and memory storing instructions for execution by the one or more processors, the method comprising:
detecting an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user, wherein the at least one speech input includes a plurality of words;
in response to detecting the impasse, establishing a learning session associated with the at least one speech input;
during the learning session:
receiving one or more subsequent clarification inputs from the user;
based at least in part on the one or more subsequent clarification inputs, adjusting at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
associating the satisfactory response with the entirety of the at least one speech input for processing future occurrences of the at least one speech input, wherein the associating comprises replacing a respective initial response shared by a set of users in a community of users with the satisfactory response to the at least one speech input from the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. An electronic device, comprising:
a sound receiving unit configured to receive sound input;
a speaker unit configured to output sound; and
a processing unit coupled to the sound receiving unit and the speaker unit, the processing unit configured to:
detect an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user, wherein the at least one speech input includes a plurality of words;
in response to detecting the impasse, establish a learning session associated with the at least one speech input;
during the learning session:
receive one or more subsequent clarification inputs from the user;
based at least in part on the one or more subsequent clarification inputs, adjust at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
associate the satisfactory response with the entirety of the at least one speech input for processing future occurrences of the at least one speech input, wherein the associating comprises replacing a respective initial response shared by a set of users in a community of users with the satisfactory response to the at least one speech input from the user.
Dependent claims: 11, 12, 13, 14, 15
16. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which, when executed by an electronic device with one or more processors, cause the device to:
detect an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user, wherein the at least one speech input includes a plurality of words;
in response to detecting the impasse, establish a learning session associated with the at least one speech input;
during the learning session:
receive one or more subsequent clarification inputs from the user;
based at least in part on the one or more subsequent clarification inputs, adjust at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
associate the satisfactory response with the entirety of the at least one speech input for processing future occurrences of the at least one speech input, wherein the associating comprises replacing a respective initial response shared by a set of users in a community of users with the satisfactory response to the at least one speech input from the user.
Dependent claims: 17, 18, 19, 20, 21, 22
Specification