TRAINING AN AT LEAST PARTIAL VOICE COMMAND SYSTEM

US 20140278413A1
Filed: 03/14/2014
Published: 09/18/2014
Est. Priority Date: 03/15/2013
Status: Active Grant

Abstract

An electronic device with one or more processors and memory includes a procedure for training a digital assistant. In some embodiments, the device detects an impasse in a dialogue between the digital assistant and a user including a speech input. During a learning session, the device utilizes a subsequent clarification input from the user to adjust intent inference or task execution associated with the speech input to produce a satisfactory response. In some embodiments, the device identifies a pattern of success or failure associated with an aspect previously used to complete a task and generates a hypothesis regarding a parameter used in speech recognition, intent inference or task execution as a cause for the pattern. Then, the device tests the hypothesis by altering the parameter for a subsequent completion of the task and adopts or rejects the hypothesis based on feedback information collected from the subsequent completion.

24 Claims

1. A computer-implemented method for training a digital assistant, performed at an electronic device including one or more processors and memory storing instructions for execution by the one or more processors, the method comprising:
- detecting an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user;
  
  in response to detecting the impasse, establishing a learning session associated with the at least one speech input;
  
  during the learning session:
  
  receiving one or more subsequent clarification inputs from the user;
  
  based at least in part on the one or more subsequent clarification inputs, adjusting at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
  
  associating the satisfactory response with the at least one speech input for processing future occurrences of the at least one speech input.
  - 2. The method of claim 1, wherein detecting the impasse comprises:
    - during the dialogue between the digital assistant and the user:
      
      receiving at least one speech input from the user;
      
      inferring an initial intent based on the at least one speech input;
      
      providing an initial response to fulfill the initial intent that has been inferred; and
      
      receiving a follow-up speech input from the user rejecting the initial response.
  - 3. The method of claim 2, wherein the initial intent is a best guess;
    - and during the learning session, the method further comprising:
      
      prior to receiving the one or more subsequent clarification inputs from the user, inferring a second intent based on the at least one speech input, wherein the second intent is a second best guess and the second intent is distinct from the initial intent; and
      
      providing a second response to fulfill the second intent that has been inferred.
  - 4. The method of claim 1, wherein the impasse comprises one of a set consisting of:
    - a single user rejection of an initial response distinct from the satisfactory response;
      
      two or more user rejections of the initial response; and
      
      a user command ending the dialogue.
  - 5. The method of claim 1, further comprising:
    - during the learning session:
      
      prior to receiving the one or more subsequent clarification inputs from the user, providing two or more alternative responses to the at least one speech input from the user.
  - 6. The method of claim 1, further comprising:
    - during the learning session:
      
      reducing a respective intent inference or speech recognition threshold so as to generate the two or more alternative responses to the at least one speech input from the user.
  - 7. The method of claim 1, further comprising:
    - during the learning session:
      
      prior to receiving the one or more subsequent clarification inputs from the user, rephrasing at least a portion of the at least one speech input from the user to elicit one or more subsequent clarification inputs from the user.
  - 8. The method of claim 1, wherein associating the satisfactory response with the speech input comprises replacing a respective initial response to the at least one speech input with the satisfactory response for a set of users in a community of users.
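
The flow recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the patented implementation: the `Assistant` and `LearningSession` classes, the dictionary lookup standing in for intent inference, and the example utterances are all hypothetical. The sketch only shows a clarification input being associated with the original speech input so that future occurrences receive the satisfactory response.

```python
from dataclasses import dataclass, field


@dataclass
class LearningSession:
    """Hypothetical record of one learning session tied to a speech input."""
    speech_input: str
    clarifications: list = field(default_factory=list)


class Assistant:
    """Toy assistant; a dict lookup stands in for real intent inference."""

    def __init__(self, default_responses):
        self.default_responses = dict(default_responses)
        # Learned associations: speech input -> satisfactory response.
        self.learned = {}

    def respond(self, speech_input):
        # Learned associations take priority for future occurrences.
        if speech_input in self.learned:
            return self.learned[speech_input]
        return self.default_responses.get(speech_input, "unknown")

    def learn(self, session, clarification):
        # Assumption: the clarification directly names the satisfactory response.
        session.clarifications.append(clarification)
        self.learned[session.speech_input] = clarification
        return clarification


assistant = Assistant({"call mom": "call:Mom Smith"})
first = assistant.respond("call mom")   # initial response, rejected by the user
session = LearningSession("call mom")   # learning session opened on the impasse
assistant.learn(session, "call:Mom Jones")
later = assistant.respond("call mom")   # future occurrence uses the association
```

A real system would adjust the intent-inference or task-execution models rather than store a literal override; the override is used here only to keep the example small.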

9. A computer-implemented method for training a digital assistant, performed at an electronic device including one or more processors and memory storing instructions for execution by the one or more processors, the method comprising:
- obtaining feedback information associated with one or more previous completions of a task;
  
  identifying a pattern of success or failure associated with an aspect of speech recognition, intent inference or task execution previously used to complete the task;
  
  generating a hypothesis regarding a parameter used in at least one of speech recognition, intent inference and task execution as a cause for the pattern of success or failure;
  
  identifying one or more subsequent requests for completion of the task;
  
  testing the hypothesis by altering the parameter used in the at least one of speech recognition, intent inference and task execution for subsequent completions of the task; and
  
  adopting or rejecting the hypothesis based on feedback information collected from the subsequent completions of the task.
  - 10. The method of claim 9, wherein feedback information comprises a user rejection of a response associated with a completion of a task.
  - 11. The method of claim 9, wherein feedback information comprises a user location subsequent to a completion of a task.
  - 12. The method of claim 9, wherein feedback information comprises a length of time spent at a location nearby a completed task.
  - 13. The method of claim 9, wherein feedback information comprises a user action subsequent to a completion of a task.
  - 14. The method of claim 9, wherein testing the hypothesis occurs when a hypothesis confidence value associated with the hypothesis exceeds a predetermined confidence threshold.
  - 15. The method of claim 9, wherein adopting or rejecting the hypothesis based on feedback information collected from the subsequent completions of the task comprises determining whether a task completion metric associated with the task has improved.
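
The adopt-or-reject step of claims 9, 14 and 15 can be sketched as follows. This is an illustration under stated assumptions, not the method of the patent: the function name `evaluate_hypothesis`, the use of a mean success rate as the task completion metric, and the default threshold of 0.5 are all hypothetical choices. Feedback is modeled as 1.0 for a successful completion and 0.0 for a failed one.

```python
from statistics import mean


def evaluate_hypothesis(baseline, trial, confidence, threshold=0.5):
    """Return 'untested', 'adopted', or 'rejected'.

    baseline: feedback scores from completions before the parameter changed.
    trial: feedback scores from completions with the altered parameter.
    confidence: hypothesis confidence value (claim 14 gate).
    """
    # Only test when the confidence value exceeds the threshold.
    if confidence <= threshold:
        return "untested"
    # Adopt only if the (assumed) task completion metric has improved.
    if mean(trial) > mean(baseline):
        return "adopted"
    return "rejected"


improved = evaluate_hypothesis([0.0, 0.0, 1.0], [1.0, 1.0, 0.0], confidence=0.8)
worse = evaluate_hypothesis([1.0, 1.0], [0.0, 1.0], confidence=0.8)
skipped = evaluate_hypothesis([0.0], [1.0], confidence=0.3)
```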

16. An electronic device, comprising:
- a sound receiving unit configured to receive sound input;
  
  a speaker unit configured to output sound; and
  
  a processing unit coupled to the sound receiving unit and the speaker unit, the processing unit configured to:
  
  detect an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user;
  
  in response to detecting the impasse, establish a learning session associated with the at least one speech input;
  
  during the learning session:
  
  receive one or more subsequent clarification inputs from the user;
  
  based at least in part on the one or more subsequent clarification inputs, adjust at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
  
  associate the satisfactory response with the at least one speech input for processing future occurrences of the at least one speech input.
  - 17. The electronic device of claim 16, wherein detecting the impasse comprises:
    - during the dialogue between the digital assistant and the user, the processing unit is further configured to:
      
      receive at least one speech input from the user;
      
      infer an initial intent based on the at least one speech input;
      
      provide an initial response to fulfill the initial intent that has been inferred; and
      
      receive a follow-up speech input from the user rejecting the initial response.
  - 18. The electronic device of claim 16, wherein the initial intent is a best guess;
    - and during the learning session, the processing unit is further configured to:
      
      prior to receiving the one or more subsequent clarification inputs from the user, infer a second intent based on the at least one speech input, wherein the second intent is a second best guess and the second intent is distinct from the initial intent; and
      
      provide a second response to fulfill the second intent that has been inferred.
  - 19. The electronic device of claim 16, wherein the impasse comprises one of a set consisting of:
    - a single user rejection of an initial response distinct from the satisfactory response;
      
      two or more user rejections of the initial response; and
      
      a user command ending the dialogue.
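
The three impasse types listed in claims 4, 19 and 22 can be told apart with a small classifier. The sketch below is hypothetical: the event labels `"reject"` and `"end"`, the returned names, and the rule that an end command takes precedence over rejections are assumptions made for illustration and are not stated in the claims.

```python
def classify_impasse(events):
    """Classify a dialogue's events into an impasse type, or None.

    events: list of hypothetical event labels such as "ask", "reject", "end".
    """
    # Assumption: a user command ending the dialogue takes precedence.
    if "end" in events:
        return "dialogue_ended"
    rejections = events.count("reject")
    if rejections >= 2:
        return "repeated_rejection"
    if rejections == 1:
        return "single_rejection"
    return None


single = classify_impasse(["ask", "reject"])
repeated = classify_impasse(["ask", "reject", "ask", "reject"])
ended = classify_impasse(["ask", "end"])
none_found = classify_impasse(["ask"])
```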

20. A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which, when executed by an electronic device with one or more processors, cause the device to:
- detect an impasse during a dialogue between the digital assistant and a user, wherein the dialogue includes at least one speech input from the user;
  
  in response to detecting the impasse, establish a learning session associated with the at least one speech input;
  
  during the learning session:
  
  receive one or more subsequent clarification inputs from the user;
  
  based at least in part on the one or more subsequent clarification inputs, adjust at least one of intent inference and task execution associated with the at least one speech input to produce a satisfactory response to the at least one speech input; and
  
  associate the satisfactory response with the at least one speech input for processing future occurrences of the at least one speech input.
  - 21. The computer-readable storage medium of claim 20, wherein detecting the impasse comprises:
    - during the dialogue between the digital assistant and the user:
      
      receive at least one speech input from the user;
      
      infer an initial intent based on the at least one speech input;
      
      provide an initial response to fulfill the initial intent that has been inferred; and
      
      receive a follow-up speech input from the user rejecting the initial response.
  - 22. The computer-readable storage medium of claim 20, wherein the impasse comprises one of a set consisting of:
    - a single user rejection of an initial response distinct from the satisfactory response;
      
      two or more user rejections of the initial response; and
      
      a user command ending the dialogue.
  - 23. The computer-readable storage medium of claim 20, further comprising instructions operable to:
    - during the learning session:
      
      prior to receiving the one or more subsequent clarification inputs from the user, provide two or more alternative responses to the at least one speech input from the user.
  - 24. The computer-readable storage medium of claim 20, further comprising instructions operable to:
    - during the learning session:
      
      reduce a respective intent inference or speech recognition threshold so as to generate the two or more alternative responses to the at least one speech input from the user.
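
Claims 6 and 24 describe reducing a threshold so that two or more alternative responses are generated. One possible reading is sketched below; the scoring dictionary, the step size, and the function names `candidates_above` and `alternatives` are hypothetical and do not come from the patent.

```python
def candidates_above(scores, threshold):
    """Return intents scoring at or above the threshold, best first."""
    kept = [intent for intent, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda intent: -scores[intent])


def alternatives(scores, threshold, step=0.1, wanted=2):
    """Lower the threshold step by step until `wanted` candidates qualify.

    Stops at a threshold of 0.0 if too few candidates exist at all.
    Rounding keeps repeated subtraction from accumulating float drift.
    """
    while threshold > 0 and len(candidates_above(scores, threshold)) < wanted:
        threshold = max(0.0, round(threshold - step, 6))
    return candidates_above(scores, threshold), threshold


# Hypothetical intent-inference confidence scores for one speech input.
scores = {"play_music": 0.9, "play_movie": 0.55, "call_contact": 0.1}
options, final_threshold = alternatives(scores, threshold=0.8)
```

With these example scores, only one intent clears the starting threshold, so the threshold is lowered until a second intent qualifies.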

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: BRIGHAM, Christopher D.; GRUBER, Thomas R.; PITSCHEL, Donald W.; CHEYER, Adam J.
Granted Patent: US 9,922,642 B2
US Class Current: 704/243
CPC Class Codes:
G10L 15/063   Training
G10L 15/22   Procedures used during a sp...
G10L 2015/223   Execution procedure of a sp...
G10L 2015/227   of the speaker; Human-fact...
