STRUCTURED DICTATION USING INTELLIGENT AUTOMATED ASSISTANTS

US 20160260433A1
Filed: 08/28/2015
Published: 09/08/2016
Est. Priority Date: 03/06/2015
Status: Active Grant

1 Assignment

0 Petitions

Abstract

Systems and processes for structured dictation using intelligent automated assistants are provided. In one example process, a speech input representing a user request can be received. In addition, metadata associated with the speech input can be received. A text string corresponding to the speech input can be determined. The process can determine whether to perform natural language processing on the text string and whether the metadata identifies one or more domains corresponding to the user request. In response to the determination that natural language processing is to be performed on the text string and that the metadata identifies one or more domains corresponding to the user request, natural language processing of the text string can be constrained to the one or more domains. A result can be obtained based on the one or more domains and the result can be outputted from the electronic device.
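
The domain-constraining step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `ONTOLOGY` contents, the `domains` metadata key, and the keyword-matching stand-in for natural language processing are all assumptions made for the example.

```python
# Hypothetical ontology: domain name -> trigger keywords (illustrative only).
ONTOLOGY = {
    "contacts": {"mom", "boss", "brother"},
    "locations": {"coffee", "airport", "park"},
    "weather": {"rain", "forecast"},
}


def active_domains(metadata: dict) -> dict:
    """Return only the ontology domains named in the metadata.

    Assumption: if the metadata names no known domain, every domain
    in the ontology stays active.
    """
    named = [d for d in metadata.get("domains", []) if d in ONTOLOGY]
    if not named:
        return dict(ONTOLOGY)
    return {d: ONTOLOGY[d] for d in named}


def match_domains(text: str, domains: dict) -> list:
    """List the active domains whose keywords appear in the text.

    Keyword overlap is a toy stand-in for real natural language processing.
    """
    words = set(text.lower().split())
    return [name for name, keywords in domains.items() if words & keywords]


constrained = match_domains("coffee with mom", active_domains({"domains": ["contacts"]}))
unconstrained = match_domains("coffee with mom", active_domains({}))
print(constrained)    # ['contacts']
print(unconstrained)  # ['contacts', 'locations']
```

With the metadata naming only the contacts domain, the word "coffee" no longer pulls in the locations domain, which is the effect the abstract attributes to constraining processing to the identified domains.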

224 Citations

25 Claims

1. A method for operating a digital assistant, the method comprising:
- at an electronic device:
  
  receiving a speech input representing a user request;
  
  receiving metadata associated with the speech input;
  
  determining a text string corresponding to the speech input;
  
  determining, based on the metadata, whether to perform natural language processing on the text string;
  
  in response to determining that natural language processing is to be performed on the text string:
  
  determining whether the metadata identifies one or more domains corresponding to the user request;
  
  in response to determining that the metadata identifies one or more domains corresponding to the user request:
  
  generating, using the text string and based on the one or more domains, a structured query representing an actionable intent associated with the one or more domains;
  
  executing a task flow associated with the structured query;
  
  determining whether a result satisfying the user request is obtained from executing the task flow; and
  
  in response to determining that a result satisfying the user request is obtained from executing the task flow, outputting data content containing the result.
- Dependent claims 2-19:
  - 2. The method of claim 1, wherein the one or more domains are domains of an ontology, and wherein the ontology includes a plurality of other domains different from the one or more domains.
  - 3. The method of claim 2, wherein in response to determining that the metadata identifies one or more domains associated with the user request, the plurality of other domains are disabled such that natural language processing is not performed on the text string using the plurality of other domains.
  - 4. The method of claim 2, wherein in response to determining that the metadata identifies one or more domains associated with the user request, the structured query is not generated using the plurality of other domains.
  - 5. The method of claim 2, wherein in response to determining that the metadata identifies one or more domains associated with the user request, the one or more domains are the only domains of the ontology used to generate the structured query.
  - 6. The method of claim 1, wherein the speech input is associated with an input field of an application, wherein the one or more domains are identified by one or more attributes in the metadata, and wherein the one or more attributes are based on the input field and the application.
  - 7. The method of claim 6, wherein whether natural language processing is to be performed on the text string is determined based on the one or more attributes.
  - 8. The method of claim 6, wherein the one or more attributes define one or more topics corresponding to the input field of the application, and wherein the one or more domains associated with the user request are identified according to the one or more topics.
  - 9. The method of claim 6, wherein:
    - the input field is a recipient field;
      
      the application is an electronic text-based communication application;
      
      the one or more attributes identify one or more domains associated with contact information;
      
      executing the task flow includes searching a contact database in accordance with search constraints defined in the structured query; and
      
      the result identifies contact information to be populated into the recipient field.
  - 10. The method of claim 9, wherein the speech input is received from a user, and wherein the speech input defines a recipient based on a relationship of the recipient to the user.
  - 11. The method of claim 6, wherein:
    - the input field is a location search field;
      
      the application is a maps application;
      
      the one or more attributes identify one or more domains associated with location information;
      
      executing the task flow includes searching a location database in accordance with search constraints defined in the structured query; and
      
      the result identifies location information to be presented using the maps application.
  - 12. The method of claim 11, wherein the metadata defines a geographic area in which the searching of the location database is to be confined.
  - 13. The method of claim 1, further comprising:
    - in response to determining that natural language processing is not to be performed on the text string, outputting the text string.
  - 14. The method of claim 1, further comprising:
    - in response to determining that the metadata does not identify one or more domains corresponding to the speech input:
      
      determining, using the text string, a relevant domain associated with the speech input;
      
      generating, using the text string and based on the relevant domain, a second structured query representing an actionable intent associated with the relevant domain;
      
      executing a second task flow associated with the second structured query to obtain a second result in furtherance of satisfying the user request; and
      
      outputting second data content containing the second result.
  - 15. The method of claim 1, further comprising:
    - in response to determining that a result satisfying the user request is not obtained from executing the task flow, outputting the text string.
  - 16. The method of claim 1, wherein generating the structured query further comprises:
    - parsing the text string based on the one or more domains to identify relevant information required for the actionable intent;
      
      populating a parameter of the structured query with the relevant information; and
      
      deriving a programmatic representation of the user request based on the parameter, wherein the task flow is based on the programmatic representation of the user request.
  - 17. The method of claim 1, wherein the metadata defines a second parameter of the structured query.
  - 18. The method of claim 1, wherein the speech input includes one or more ambiguous terms, and wherein the result at least partially disambiguates the one or more ambiguous terms.
  - 19. The method of claim 1, wherein the data content includes instructions for performing a task using the result.
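
The control flow of claim 1, together with the fallbacks in claims 13 to 15 and the recipient-field scenario of claims 9 and 10, might be sketched like this. Everything named here is hypothetical: the `CONTACTS` records, the `nlp_enabled` and `domains` metadata keys, and the helper functions are assumptions for illustration, and speech recognition is taken as already done (the function receives the text string).

```python
from typing import Optional

# Hypothetical contact database (illustrative only).
CONTACTS = [
    {"name": "Ann Lee", "relationship": "mom", "email": "ann@example.com"},
    {"name": "Raj Patel", "relationship": "boss", "email": "raj@example.com"},
]
RELATIONSHIPS = {c["relationship"] for c in CONTACTS}


def infer_domains(text: str) -> list:
    """Claim 14 fallback: guess a relevant domain from the text alone."""
    words = set(text.lower().split())
    return ["contacts"] if words & RELATIONSHIPS else []


def build_structured_query(text: str, domains: list) -> Optional[dict]:
    """Parse the text within the given domains and fill a query parameter."""
    if "contacts" in domains:
        for word in text.lower().split():
            if word in RELATIONSHIPS:
                return {"intent": "find_contact", "relationship": word}
    return None


def run_task_flow(query: dict) -> Optional[str]:
    """Search the contact database using the query's constraints."""
    for contact in CONTACTS:
        if contact["relationship"] == query["relationship"]:
            return contact["email"]
    return None


def handle_dictation(text: str, metadata: dict) -> str:
    """Return a resolved result, or the plain text string as a fallback."""
    if not metadata.get("nlp_enabled", False):
        return text  # claim 13: no NLP requested, output the text string
    domains = metadata.get("domains") or infer_domains(text)  # claim 14
    query = build_structured_query(text, domains)
    result = run_task_flow(query) if query else None
    return result if result is not None else text  # claim 15 fallback


print(handle_dictation("my mom", {"nlp_enabled": True, "domains": ["contacts"]}))
print(handle_dictation("my dentist", {"nlp_enabled": True, "domains": ["contacts"]}))
```

The first call resolves the relationship to `ann@example.com`; the second finds no matching contact and falls back to the dictated text `my dentist`, mirroring the behavior claim 15 describes when no satisfying result is obtained.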

20. A method for operating a digital assistant, the method comprising:
- at an electronic device with a display system and a microphone:
  
  displaying, on the display system, an application comprising one or more text input fields;
  
  receiving, via the microphone, a speech input;
  
  determining a text string corresponding to the speech input;
  
  determining whether a focus of the application is within the one or more text input fields;
  
  in accordance with a determination that the focus of the application is within the one or more text input fields:
  
  constraining natural language processing of the text string to a domain of two or more domains; and
  
  outputting a result based on the domain; and
  
  in accordance with a determination that the focus of the application is outside the one or more text input fields:
  
  performing natural language processing of the text string across the two or more domains; and
  
  outputting a result based on the two or more domains.
- Dependent claims 21-23:
  - 21. The method of claim 20, wherein the application is an email application, and wherein the text input field is a recipient field of the email application.
  - 22. The method of claim 21, further comprising:
    - constraining the domain to contacts stored on the electronic device.
  - 23. The method of claim 20, wherein the application is a maps application.
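
The focus-dependent branching in claim 20 could look something like the sketch below. The field identifiers, the `FIELD_DOMAIN` mapping, and the lookup tables are invented for the example, and word lookup again stands in for real natural language processing.

```python
# Hypothetical per-domain lookup tables (illustrative only).
DOMAINS = {
    "contacts": {"mom": "ann@example.com"},
    "locations": {"airport": "SFO, San Francisco, CA"},
}

# Hypothetical mapping from a focused text input field to a single domain.
FIELD_DOMAIN = {
    "email.recipient": "contacts",
    "maps.search": "locations",
}


def resolve(text: str, focused_field: str = None) -> list:
    """Return (domain, result) pairs for the dictated text.

    If focus is inside a known text input field, processing is constrained
    to that field's domain; otherwise all domains are searched.
    """
    if focused_field in FIELD_DOMAIN:
        names = [FIELD_DOMAIN[focused_field]]
    else:
        names = list(DOMAINS)

    hits = []
    for name in names:
        for word in text.lower().split():
            if word in DOMAINS[name]:
                hits.append((name, DOMAINS[name][word]))
    return hits


print(resolve("mom airport", "email.recipient"))
print(resolve("mom airport"))
```

With focus in the recipient field only the contacts match is returned; with focus outside any text input field both the contacts and locations matches come back.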

24. A non-transitory computer-readable storage medium comprising computer-executable instructions, which when executed by one or more processors, causes the one or more processors to:
- receive a speech input representing a user request;
  
  receive metadata associated with the speech input;
  
  determine a text string corresponding to the speech input;
  
  determine, based on the metadata, whether to perform natural language processing on the text string;
  
  in response to determining that natural language processing is to be performed on the text string:
  
  determine whether the metadata identifies one or more domains corresponding to the user request;
  
  in response to determining that the metadata identifies one or more domains corresponding to the user request:
  
  generate, using the text string and based on the one or more domains, a structured query representing an actionable intent associated with the one or more domains;
  
  execute a task flow associated with the structured query;
  
  determine whether a result satisfying the user request is obtained from executing the task flow; and
  
  in response to determining that a result satisfying the user request is obtained from executing the task flow, output data content containing the result.

25. A system comprising:
- one or more processors;
  
  memory storing computer-readable instructions, which when executed by the one or more processors, causes the one or more processors to:
  
  receive a speech input representing a user request;
  
  receive metadata associated with the speech input;
  
  determine a text string corresponding to the speech input;
  
  determine, based on the metadata, whether to perform natural language processing on the text string;
  
  in response to determining that natural language processing is to be performed on the text string:
  
  determine whether the metadata identifies one or more domains corresponding to the user request;
  
  in response to determining that the metadata identifies one or more domains corresponding to the user request:
  
  generate, using the text string and based on the one or more domains, a structured query representing an actionable intent associated with the one or more domains;
  
  execute a task flow associated with the structured query;
  
  determine whether a result satisfying the user request is obtained from executing the task flow; and
  
  in response to determining that a result satisfying the user request is obtained from executing the task flow, output data content containing the result.

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: SUMNER, Michael R.; NEWENDORP, Brandon J.; ORR, Ryan M.
Granted Patent: US 9,865,280 B2
US Class Current: 1/1
CPC Class Codes:
- G06F 16/29   Geographical information da...
- G06F 16/3344   using natural language anal...
- G06F 16/90332   Natural language query form...
- G06F 40/30   Semantic analysis
- G10L 15/26   Speech to text systems G10L...
- G10L 25/48   specially adapted for parti...
