MULTI-COMMAND SINGLE UTTERANCE INPUT METHOD

US 20200043482A1
Filed: 10/14/2019
Published: 02/06/2020
Est. Priority Date: 05/30/2014
Status: Active Grant

Abstract

Systems and processes are disclosed for handling a multi-part voice command for a virtual assistant. Speech input can be received from a user that includes multiple actionable commands within a single utterance. A text string can be generated from the speech input using a speech transcription process. The text string can be parsed into multiple candidate substrings based on domain keywords, imperative verbs, predetermined substring lengths, or the like. For each candidate substring, a probability can be determined indicating whether the candidate substring corresponds to an actionable command. Such probabilities can be determined based on semantic coherence, similarity to user request templates, querying services to determine manageability, or the like. If the probabilities exceed a threshold, the user intent of each substring can be determined, processes associated with the user intents can be executed, and an acknowledgment can be provided to the user.
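The threshold check described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the template list, the `THRESHOLD` value, and the use of a string-similarity ratio from Python's `difflib` as a stand-in for the probability. The abstract does not say how the probability is computed beyond listing semantic coherence, template similarity, and service queries as possible inputs.

```python
from difflib import SequenceMatcher

# Hypothetical user request templates; the abstract does not list any.
TEMPLATES = [
    "send a message to <contact>",
    "play <song>",
    "set a timer for <duration>",
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the patent


def actionable_probability(candidate: str) -> float:
    """Score a candidate by its best similarity to any template (0.0 to 1.0).

    A similarity ratio is only a stand-in for a real probability model.
    """
    return max(
        SequenceMatcher(None, candidate.lower(), template).ratio()
        for template in TEMPLATES
    )


def all_actionable(candidates: list[str]) -> bool:
    """True only when every candidate's score exceeds the threshold."""
    return all(actionable_probability(c) > THRESHOLD for c in candidates)
```

In this sketch a candidate that matches no template closely blocks the whole utterance from being treated as a multi-part command, which mirrors the abstract's condition that the probabilities exceed a threshold before intents are determined.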

70 Citations

17 Claims

1. A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors of an electronic device, cause the electronic device to:
- receive speech input, wherein the speech input comprises a single utterance having two or more actionable commands;
  
  generate a text string based on the speech input using a speech transcription process, wherein the speech transcription process is performed using one or more speech recognition models;
  
  identify a first keyword in the text string;
  
  identify a second keyword in the text string;
  
  parse the text string into at least a first candidate substring and a second candidate substring based at least in part on a position of a conjunction word between the first keyword and the second keyword;
  
  determine a first intent associated with the first candidate substring and a second intent associated with the second candidate substring, wherein the first intent corresponds to a first actionable command in the speech input and the second intent corresponds to a second actionable command in the speech input, wherein the first intent and the second intent are determined based on one or more nodes of an ontology; and
  
  execute a first process identified by the first intent and a second process identified by the second intent.
- Dependent claims (2 through 15):
  - 2. The computer readable storage medium of claim 1, wherein the first keyword corresponds to a first domain and the second keyword corresponds to a second domain.
  - 3. The computer readable storage medium of claim 1, wherein the first keyword is a first imperative verb and the second keyword is a second imperative verb.
  - 4. The computer readable storage medium of claim 1, wherein determining the first intent associated with the first candidate substring and the second intent associated with the second candidate substring comprises:
    - determining the second intent based on at least one word in the first candidate substring.
  - 5. The computer readable storage medium of claim 1, wherein determining the first intent associated with the first candidate substring and the second intent associated with the second candidate substring comprises:
    - determining the first intent or the second intent based on information displayed on a display associated with the electronic device.
  - 6. The computer readable storage medium of claim 5, wherein the information comprises a list; and
    - wherein determining the first intent associated with the first candidate substring and the second intent associated with the second candidate substring comprises:
      
      determining the first intent or the second intent based on an ordinal descriptor in the first candidate substring or the second candidate substring, wherein the ordinal descriptor is associated with one or more items in the list.
  - 7. The computer readable storage medium of claim 5, wherein the information comprises one or more notifications.
  - 8. The computer readable storage medium of claim 5, wherein the information comprises one or more emails.
  - 9. The computer readable storage medium of claim 5, wherein determining the first intent associated with the first candidate substring and the second intent associated with the second candidate substring comprises:
    - determining one or more potential user requests based on the information displayed on the display; and
      
      determining the first intent or the second intent based on the one or more potential user requests.
  - 10. The computer readable storage medium of claim 1, wherein the instructions further cause the electronic device to:
    - provide an acknowledgment that the first process and the second process have at least begun execution.
  - 11. The computer readable storage medium of claim 10, wherein providing the acknowledgment associated with the first intent and the second intent comprises:
    - providing a first task associated with the first intent and a second task associated with the second intent.
  - 12. The computer readable storage medium of claim 11, wherein the instructions further cause the electronic device to:
    - in response to completing the first process, provide a first indicator associated with the first task; and
      
      in response to completing the second process, provide a second indicator associated with the second task.
  - 13. The computer readable storage medium of claim 11, wherein the instructions further cause the electronic device to:
    - before completing the first process, provide a first processing status indicator associated with the first task; and
      
      before completing the second process, provide a second processing status indicator associated with the second task.
  - 14. The computer readable storage medium of claim 10, wherein providing the acknowledgment associated with the first intent and the second intent comprises:
    - displaying the first candidate substring using a first emphasis and displaying the second candidate substring using a second emphasis that is different than the first emphasis.
  - 15. The computer readable storage medium of claim 14, wherein each of the first emphasis and the second emphasis comprise one or more of bold text, italic text, underlined text, circled text, outlined text, colored text, and clustered text.
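The parsing step of claim 1, read together with the imperative-verb keywords of claim 3, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the word lists `IMPERATIVE_VERBS` and `CONJUNCTIONS` and the function `split_on_conjunction` are hypothetical, and the choice to cut at the conjunction nearest the second keyword is a simplification.

```python
# Hypothetical keyword and conjunction lists; the claims name no specific words.
IMPERATIVE_VERBS = {"play", "send", "call", "set", "open", "text"}
CONJUNCTIONS = {"and", "then"}


def split_on_conjunction(text: str) -> list[str]:
    """Split a transcript at each conjunction that sits between two keywords."""
    words = text.lower().split()
    keyword_positions = [i for i, w in enumerate(words) if w in IMPERATIVE_VERBS]

    cut_points = []
    for left, right in zip(keyword_positions, keyword_positions[1:]):
        # Look for a conjunction strictly between two consecutive keywords.
        between = [i for i in range(left + 1, right) if words[i] in CONJUNCTIONS]
        if between:
            # Assumed rule: cut at the conjunction nearest the second keyword.
            cut_points.append(between[-1])

    substrings, start = [], 0
    for cut in cut_points:
        substrings.append(" ".join(words[start:cut]))
        start = cut + 1  # drop the conjunction itself
    substrings.append(" ".join(words[start:]))
    return substrings
```

Requiring a keyword on both sides of the conjunction keeps a phrase such as "play rock and roll" in one piece, because only one keyword is present and no cut is made.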

16. An electronic device, comprising:
- one or more processors; and
  
  memory storing one or more programs configured to be executed by the one or more processors, the one or more programs including instructions for:
  
  receiving speech input, wherein the speech input comprises a single utterance having two or more actionable commands;
  
  generating a text string based on the speech input using a speech transcription process, wherein the speech transcription process is performed using one or more speech recognition models;
  
  identifying a first keyword in the text string;
  
  identifying a second keyword in the text string;
  
  parsing the text string into at least a first candidate substring and a second candidate substring based at least in part on a position of a conjunction word between the first keyword and the second keyword;
  
  determining a first intent associated with the first candidate substring and a second intent associated with the second candidate substring, wherein the first intent corresponds to a first actionable command in the speech input and the second intent corresponds to a second actionable command in the speech input, wherein the first intent and the second intent are determined based on one or more nodes of an ontology; and
  
  executing a first process identified by the first intent and a second process identified by the second intent.

17. A computer-implemented method, comprising:
- at an electronic device with one or more processors and memory:
  
  receiving speech input, wherein the speech input comprises a single utterance having two or more actionable commands;
  
  generating a text string based on the speech input using a speech transcription process, wherein the speech transcription process is performed using one or more speech recognition models;
  
  identifying a first keyword in the text string;
  
  identifying a second keyword in the text string;
  
  parsing the text string into at least a first candidate substring and a second candidate substring based at least in part on a position of a conjunction word between the first keyword and the second keyword;
  
  determining a first intent associated with the first candidate substring and a second intent associated with the second candidate substring, wherein the first intent corresponds to a first actionable command in the speech input and the second intent corresponds to a second actionable command in the speech input, wherein the first intent and the second intent are determined based on one or more nodes of an ontology; and
  
  executing a first process identified by the first intent and a second process identified by the second intent.
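
The last two steps of the method, determining an intent from ontology nodes and executing the process each intent identifies, can be sketched in the same spirit. The `ONTOLOGY` and `PROCESSES` tables, the word-overlap scoring, and both function names are assumptions made for illustration; the claims do not specify how ontology nodes are matched or what the processes do.

```python
from typing import Callable, Optional

# Hypothetical ontology: each node names an intent and the words that evoke it.
ONTOLOGY = {
    "play_media": {"play", "music", "song"},
    "send_message": {"text", "message", "send"},
}

# Hypothetical processes keyed by intent; real ones would call device services.
PROCESSES: dict[str, Callable[[str], str]] = {
    "play_media": lambda s: f"playing: {s}",
    "send_message": lambda s: f"sending: {s}",
}


def determine_intent(substring: str) -> Optional[str]:
    """Pick the ontology node sharing the most words with the substring."""
    words = set(substring.lower().split())
    best = max(ONTOLOGY, key=lambda node: len(words & ONTOLOGY[node]))
    # No overlap with any node means no intent could be determined.
    return best if words & ONTOLOGY[best] else None


def execute_all(substrings: list[str]) -> list[str]:
    """Run the process identified by each substring's intent, in order."""
    results = []
    for substring in substrings:
        intent = determine_intent(substring)
        if intent is not None:
            results.append(PROCESSES[intent](substring))
    return results
```

Here substrings with no matching node are skipped silently; a fuller design would more likely ask the user for clarification.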

Current Assignee
Apple Inc.
Original Assignee
Apple Inc.
Inventors
GRUBER, Thomas R., SADDLER, Harry J., BELLEGARDA, Jerome Rene, NYEGGEN, Bryce H., SABATELLI, Alessandro

Granted Patent

US 10,878,809 B2
CPC Class Codes

G06F 40/205   Parsing

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/1822   Parsing for meaning underst...

G10L 15/26   Speech to text systems G10L...

G10L 15/28   Constructional details of s...

G10L 2015/088   Word spotting

G10L 2015/221   Announcement of recognition...

G10L 2015/223   Execution procedure of a sp...

G10L 2015/225   Feedback of the input speech

G10L 2015/228   of application context

H04M 2203/355   Interactive dialogue design...

H04M 3/4936   Speech interaction details ...
