Voice interaction application selection

US 9,741,343 B1
Filed: 12/19/2013
Issued: 08/22/2017
Est. Priority Date: 12/19/2013
Status: Active Grant

Abstract

An open framework for computing devices to dispatch voice-based interactions to supporting applications. Applications are selected on a trial-and-error basis to find an application able to handle the voice interaction. Dispatching to the applications may be performed without a determination of meaning conveyed in the interaction, with meaning determined by the individual applications. Once an application acts upon a voice interaction, that application may be given first-right-of-refusal for subsequent voice interactions.
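
The trial-and-error dispatch described in the abstract can be illustrated with a minimal sketch. The names `App`, `dispatch`, and the keyword-based predicates are hypothetical and are not taken from the patent; the point is only that the dispatcher does not interpret the text itself and leaves that decision to each application.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical application handle: (name, predicate that says "I can handle this").
App = Tuple[str, Callable[[str], bool]]


def dispatch(apps: Sequence[App], text: str) -> Optional[str]:
    """Query applications in list order; return the name of the first that accepts."""
    for name, can_handle in apps:
        # The dispatcher attaches no meaning to `text`; each app decides for itself.
        if can_handle(text):
            return name
    return None  # no application claimed the interaction


# Toy applications (assumed for illustration only).
apps = [
    ("music", lambda t: "play" in t),
    ("weather", lambda t: "weather" in t),
]
```

With these toy predicates, `dispatch(apps, "what is the weather")` skips the music application and selects the weather application.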

210 Citations

20 Claims

1. A computer-implemented method comprising:
- obtaining a first list identifying a plurality of software applications configured to process voice interactions;
  
  receiving first audio data corresponding to a first voice interaction;
  
  performing speech recognition processing on the first audio data to obtain first text;
  
  sequentially querying the plurality of software applications in an order of the first list to determine whether a queried software application can process the first text, wherein the sequential querying continues until a first queried software application responds that it is able to process the first text;
  
  selecting the first queried software application;
  
  determining a first time corresponding to the first voice interaction;
  
  processing, using the first queried software application, the first text to generate one or more results;
  
  causing output of audio corresponding to the one or more results;
  
  receiving second audio data corresponding to a second voice interaction after causing the output of the audio;
  
  determining a second time corresponding to the second voice interaction;
  
  determining that a certain amount of time has not elapsed between the first time and the second time;
  
  performing speech recognition processing on the second audio data to obtain second text; and
  
  querying the first queried software application to determine whether the first queried software application can process the second text, prior to querying any other software application of the plurality of software applications.
- - 2. The computer-implemented method of claim 1, further comprising:
    - receiving third audio data corresponding to a third voice interaction, wherein the second audio data is received at the second time, the third audio data is received at a third time, and a time difference between the second time and the third time is greater than a specific amount of time;
      
      performing speech recognition processing on the third audio data to obtain third text;
      
      sequentially querying the plurality of software applications to determine whether a queried software application can process the third text, wherein the further sequential querying queries the plurality of software applications in the order of the first list and continues until a queried application responds that it is able to process the third text, and
      
      selecting a second application that responds that it is able to process the third text.
  - 3. The computer-implemented method of claim 1, further comprising:
    - sorting the first list identifying the plurality of software applications to produce a second list in response to the first application processing the first text, wherein the second list is ordered based on a likelihood that each software application will be used after the first application,
      
      wherein after the first application is unable to process the second text, querying the plurality of software applications based on their order in the second list.
  - 4. The computer-implemented method of claim 1, wherein the first list is sorted based on default device settings.
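
The time-window behavior recited in claims 1 and 2 might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: `VoiceApp`, `Dispatcher`, the 30-second `window_s` default, the caller-supplied `now` timestamp, and the choice to refresh the stored time after every handled interaction are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class VoiceApp:
    name: str
    can_process: Callable[[str], bool]  # hypothetical capability query


@dataclass
class Dispatcher:
    apps: List[VoiceApp]                    # the "first list"
    window_s: float = 30.0                  # assumed "certain amount of time"
    last_app: Optional[VoiceApp] = None     # app that handled the previous interaction
    last_time: Optional[float] = None       # time of that interaction
    query_log: List[str] = field(default_factory=list)  # order in which apps were asked

    def _ask(self, app: VoiceApp, text: str) -> bool:
        self.query_log.append(app.name)
        return app.can_process(text)

    def select(self, text: str, now: float) -> Optional[VoiceApp]:
        order = list(self.apps)
        recent = (
            self.last_app is not None
            and self.last_time is not None
            and now - self.last_time < self.window_s
        )
        if recent:
            # Window has not elapsed: the previous app is queried before any other.
            order = [self.last_app] + [a for a in order if a is not self.last_app]
        # next() stops at the first app that answers yes, so later apps are not asked.
        chosen = next((a for a in order if self._ask(a, text)), None)
        if chosen is not None:
            self.last_app, self.last_time = chosen, now
        return chosen
```

For example, with the list `[radio, music]` where both applications accept "next", a "next" request arriving 10 seconds after music handled "play jazz" is offered to music first, while the same request arriving after the window has elapsed goes to radio, the first entry in the list.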

5. A computing device comprising:
- a communication interface;
  
  at least one processor; and
  
  a memory including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the computing device to:
  
  receive first audio data corresponding to a first voice interaction;
  
  sequentially query a plurality of software applications in an order of a first list of software applications until a queried application responds that it is able to process the first voice interaction;
  
  select, from the plurality of software applications, a first application that responds that it is able to process the first voice interaction;
  
  determine a first time corresponding to the first voice interaction;
  
  receive second audio data corresponding to a second voice interaction after the first application processed the first voice interaction;
  
  determine a second time corresponding to the second voice interaction;
  
  determine that a certain amount of time has not elapsed between the first time and the second time; and
  
  query the first application to determine whether the first application can process the second voice interaction, prior to querying any of the other plurality of software applications to determine if any of the other plurality of software applications can process the second voice interaction.
- - 6. The computing device of claim 5, wherein the instructions further configure the computing device to:
    - receive third audio data corresponding to a third voice interaction after the first application has processed the first voice interaction and the second voice interaction;
      
      determine a third time corresponding to the third voice interaction;
      
      determine that the certain amount of time has elapsed between the first time and the third time; and
      
      query a second application to determine whether the second application can process the third voice interaction, prior to querying any other software application of the plurality of software applications, wherein the second application has a higher priority than the first application within the first list.
  - 7. The computing device of claim 5, wherein the instructions further configure the computing device to:
    - sort the first list to produce a second list in response to the first application processing the first voice interaction;
      
      determine that the first application is unable to process the second voice interaction; and
      
      query the plurality of software applications based on an order of the second list.
  - 8. The computing device of claim 7, wherein the second list is ordered based on a likelihood that each software application will be used after the first application.
  - 9. The computing device of claim 7, wherein the instructions further configure the computing device to:
    - perform speech recognition processing on the first audio data to obtain text corresponding to the first voice interaction,
      
      wherein the first list is sorted to produce the second list based at least in part on the text.
  - 10. The computing device of claim 7, wherein the instructions further configure the computing device to sort the first list based at least in part on how frequently each software application is used.
  - 11. The computing device of claim 7, wherein the instructions further configure the computing device to:
    - select a second application that responds that it is able to process the second voice interaction; and
      
      move the second application to the top of the second list to create a third list.
  - 12. The computing device of claim 5, wherein the instructions further configure the computing device to:
    - perform processing on the first audio data to obtain at least one of speech recognition results or natural language understanding (NLU) results; and
      
      wherein the query to the first application further comprises:
      
      providing the first application at least one of the first audio data, the speech recognition results, or the NLU results.
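
The list re-ordering in claims 7, 8, and 11 could look roughly like the sketch below. The function names, the nested-dictionary likelihood table, and its numeric values are assumptions made for illustration; the claims do not say how the likelihood is estimated or stored.

```python
from typing import Dict, List


def sort_by_followup(
    first_list: List[str],
    last_app: str,
    followup_prob: Dict[str, Dict[str, float]],
) -> List[str]:
    """Produce a 'second list' ordered by how likely each app is to follow last_app."""
    probs = followup_prob.get(last_app, {})
    # sorted() is stable, so apps with equal likelihood keep their first-list order.
    return sorted(first_list, key=lambda app: -probs.get(app, 0.0))


def move_to_front(second_list: List[str], app: str) -> List[str]:
    """Produce a 'third list' with `app` first and the remaining order unchanged."""
    return [app] + [a for a in second_list if a != app]


# Hypothetical data: after "music", "shopping" is assumed likelier than "timer".
first_list = ["music", "weather", "timer", "shopping"]
followup_prob = {"music": {"timer": 0.1, "shopping": 0.4}}

second_list = sort_by_followup(first_list, "music", followup_prob)
third_list = move_to_front(second_list, "weather")
```

Both helpers return new lists rather than mutating their input, which keeps the first list available if the device later falls back to its default order.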

13. A non-transitory computer-readable storage medium storing processor-executable instructions for controlling a computing device, comprising program code to configure the computing device to:
- receive first audio data corresponding to a first voice interaction;
  
  sequentially query a plurality of software applications in an order of a first list of software applications until a queried application responds that it is able to process the first voice interaction;
  
  select, from the plurality of software applications, a first application that responds that it is able to process the first voice interaction;
  
  determine a first time corresponding to the first voice interaction;
  
  receive second audio data corresponding to a second voice interaction after the first application processed the first voice interaction;
  
  determine a second time corresponding to the second voice interaction;
  
  determine that a certain amount of time has not elapsed between the first time and the second time; and
  
  query the first application to determine whether the first application can process the second voice interaction, prior to querying any of the other plurality of software applications to determine if any of the other plurality of software applications can process the second voice interaction.
- - 14. The non-transitory computer-readable storage medium of claim 13, the program code further configuring the computing device to:
    - receive third audio data corresponding to a third voice interaction after the first application has processed the first voice interaction and the second voice interaction;
      
      determine a third time corresponding to the third voice interaction;
      
      determine that the certain amount of time has elapsed between the first time and the third time; and
      
      query a second application to determine whether the second application can process the third voice interaction, prior to querying any other software application of the plurality of software applications, wherein the second application has a higher priority than the first application within the first list.
  - 15. The non-transitory computer-readable storage medium of claim 13, the program code further configuring the computing device to:
    - sort the first list to produce a second list in response to the first application processing the first voice interaction;
      
      determine that the first application is unable to process the second voice interaction; and
      
      query the plurality of software applications based on an order of the second list.
  - 16. The non-transitory computer-readable storage medium of claim 15, wherein the second list is ordered based on a likelihood that each software application will be used after the first application.
  - 17. The non-transitory computer-readable storage medium of claim 15, the program code further configuring the computing device to:
    - perform speech recognition processing on the first audio data to obtain text corresponding to the first voice interaction,
      
      wherein the first list is sorted to produce the second list based at least in part on the text.
  - 18. The non-transitory computer-readable storage medium of claim 15, the program code further configuring the computing device to sort the first list based at least in part on how frequently each software application is used.
  - 19. The non-transitory computer-readable storage medium of claim 15, the program code further configuring the computing device to:
    - select a second application that responds that it is able to process the second voice interaction; and
      
      move the second application to the top of the second list to create a third list.
  - 20. The non-transitory computer-readable storage medium of claim 13, the program code further configuring the computing device to:
    - perform processing on the first audio data to obtain at least one of speech recognition results or natural language understanding (NLU) results; and
      
      wherein the query to the first application further comprises:
      
      providing the first application at least one of the first audio data, the speech recognition results, or the NLU results.
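
Claims 12 and 20 require that the query carry at least one of the audio data, the speech recognition results, or the NLU results. One hedged way to model such a payload is shown below; the `AppQuery` class and its field names are hypothetical, and the `nlu` dictionary layout is an assumption rather than anything the claims specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppQuery:
    """Hypothetical payload sent to an application when asking if it can process."""
    audio: Optional[bytes] = None    # raw audio data
    asr_text: Optional[str] = None   # speech recognition results
    nlu: Optional[dict] = None       # NLU results (layout assumed)

    def __post_init__(self) -> None:
        # Enforce "at least one of" the three payload kinds.
        if self.audio is None and self.asr_text is None and self.nlu is None:
            raise ValueError("query needs audio, ASR results, or NLU results")


# A query carrying only speech recognition text.
text_only = AppQuery(asr_text="set a timer for five minutes")
```

Leaving all three fields optional lets one application work from raw audio while another consumes only recognized text, which matches the "at least one of" wording.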

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Miles, Andrew Christopher; Deramat, Frédéric Johan Georges; Manthei, Cory Curtis; Gundeti, Vikram Kumar; Jain, Vikas
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Kovacek, David
Application Number: US14/134,941
Time in Patent Office: 1,342 Days
Field of Search: None
US Class Current
CPC Class Codes

G10L 15/00   Speech recognition G10L17/0...

G10L 15/01   Assessment or evaluation of...

G10L 15/22   Procedures used during a sp...

G10L 15/26   Speech to text systems G10L...

G10L 2015/223   Execution procedure of a sp...

G10L 2015/228   of application context
