Proactive assistance based on dialog communication between devices

US 10,942,703 B2
Filed: 01/16/2019
Issued: 03/09/2021
Est. Priority Date: 12/23/2015
Status: Active Grant

Abstract

Systems and processes for proactive assistance based on dialog communication between devices are provided. In one example process, while voice communication between an electronic device and a second electronic device is established, a stream of audio data associated with the second electronic device can be received. In response to detecting a user input, a text representation of speech contained in a portion of the stream of audio data can be generated. The process can determine whether the text representation contains information corresponding to one of a plurality of types of information. In response to determining that the text representation contains information corresponding to one of a plurality of types of information, one or more tasks based on the information can be performed.
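As a rough illustration of the last two steps in the abstract (checking a text representation for a known type of information, then choosing a task), the following minimal Python sketch may help. The regular expressions, the type names `email` and `phone`, and the functions `find_information` and `perform_task` are assumptions made for illustration; the patent does not specify how information types are recognized.

```python
import re

# Hypothetical recognizers for two of the "types of information".
# These patterns are illustrative assumptions, not taken from the patent.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{6,}\d"),
}


def find_information(text):
    """Return (type, value) for the first pattern that matches, else None."""
    for info_type, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            return info_type, match.group()
    return None


def perform_task(info):
    """Describe a follow-up task for a detected item (stand-in for a real UI action)."""
    info_type, value = info
    if info_type == "email":
        return f"offer to compose email to {value}"
    return f"offer to call or save {value}"


if __name__ == "__main__":
    found = find_information("call me at 415 555 0123 tomorrow")
    if found:
        print(perform_task(found))
```

A real system would likely rely on a natural-language component rather than regular expressions; the sketch only shows the detect-then-act shape of the process.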

2573 Citations

16 Claims

1. A non-transitory computer-readable medium storing instructions for providing proactive assistance based on dialog communication between devices, the instructions, when executed by one or more processors, cause the one or more processors to:
- while voice communication is established between an electronic device and a second electronic device:
  
  receive a stream of audio data associated with the second electronic device;
  
  identify, based on at least one sentence boundary, a plurality of portions of the stream of audio data;
  
  store the plurality of portions of the stream of audio data;
  
  detect a user input;
  
  in response to detecting the user input, generate a text representation of speech contained in a first portion of the plurality of portions of the stored audio data;
  
  determine whether the text representation contains information corresponding to one of a plurality of types of information;
  
  in response to determining that the text representation contains information corresponding to one of a plurality of types of information, determine whether the information is complete;
  
  in response to determining that the information is not complete:
  
  generate a text representation of speech contained in a second portion of the plurality of portions of the stored audio data; and
  
  obtain second information from the second portion of the plurality of portions of the stored audio data;
  
  perform one or more tasks based on at least the information and the second information.
  - 2. The computer-readable medium of claim 1, wherein the second portion of stored audio data corresponds to audio that is less recent than the first portion of the stored audio data.
  - 3. The computer-readable medium of claim 1, wherein performing one or more tasks based on at least the information and the second information comprises performing one or more tasks based on both the information and the second information.
  - 4. The computer-readable medium of claim 1, wherein determining whether the information is complete comprises:
    - determining whether the information is missing at least one parameter.
  - 5. The computer-readable medium of claim 1, wherein the information includes a telephone number, wherein the one or more tasks include displaying the telephone number, and wherein the instructions further cause the one or more processors to:
    - in response to detecting a user selection of the displayed telephone number, initiate a voice call based on the telephone number.
  - 6. The computer-readable medium of claim 1, wherein the information includes a telephone number, wherein the one or more tasks include displaying the telephone number, and wherein the instructions further cause the one or more processors to:
    - in response to detecting a user selection of the displayed telephone number, store the telephone number in association with an address book of the electronic device.
  - 7. The computer-readable medium of claim 1, wherein the information includes an email address, wherein the one or more tasks include displaying the email address, and wherein the instructions further cause the one or more processors to:
    - in response to detecting a user selection of the displayed email address, initiate a composition of an email message, wherein a recipient of the email message is based on the email address.
  - 8. The computer-readable medium of claim 1, wherein the information includes a location, and wherein the one or more tasks include displaying a map indicating the location.
  - 9. The computer-readable medium of claim 1, wherein the one or more tasks are performed after dialog communication has ended.
  - 10. The computer-readable medium of claim 1, wherein the one or more tasks includes a plurality of tasks, and performing one or more tasks based on at least the information and the second information comprises:
    - performing at least a first task of the plurality of tasks while dialog communication is established; and
      
      performing at least a second task of the plurality of tasks after dialog communication has ended.
  - 11. The computer-readable medium of claim 1, wherein performing one or more tasks based on at least the information and the second information comprises providing at least one sound output.
  - 12. The computer-readable medium of claim 1, wherein performing one or more tasks based on at least the information and the second information comprises providing at least one haptic output.
  - 13. The computer-readable medium of claim 1, wherein identifying, based on at least one sentence boundary, a plurality of portions of the stream of audio data comprises:
    - detecting an audio amplitude of the stream of audio data; and
      
      identifying a first sentence boundary based on the audio amplitude decreasing below a first threshold level.
  - 14. The computer-readable medium of claim 13, wherein identifying a first sentence boundary based on the audio amplitude decreasing below a first threshold level comprises:
    - within a predetermined time interval:
      
      detecting the audio amplitude decrease from above the first threshold level; and
      
      detecting the audio amplitude decrease to below a second threshold level.
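The amplitude-based boundary detection of claims 13 and 14 can be sketched as follows. This is one possible reading, assuming the audio is already reduced to one amplitude value per frame, that the second threshold is lower than the first, and that the predetermined time interval is counted in frames. The function names and parameters are hypothetical.

```python
def find_sentence_boundaries(amplitudes, first_threshold, second_threshold, max_frames):
    """Return frame indices treated as sentence boundaries.

    A boundary is recorded when the amplitude falls from at or above
    first_threshold to below second_threshold within max_frames frames.
    """
    boundaries = []
    n = len(amplitudes)
    i = 1
    while i < n:
        # Downward crossing of the first threshold between frames i-1 and i.
        if amplitudes[i - 1] >= first_threshold and amplitudes[i] < first_threshold:
            window_end = min(i + max_frames, n)
            for j in range(i, window_end):
                if amplitudes[j] < second_threshold:
                    boundaries.append(j)
                    i = j  # resume scanning after the boundary frame
                    break
        i += 1
    return boundaries


def split_portions(amplitudes, boundaries):
    """Split the frame sequence into portions at the given boundary indices."""
    portions, start = [], 0
    for b in boundaries:
        portions.append(amplitudes[start:b])
        start = b
    portions.append(amplitudes[start:])
    return [p for p in portions if p]


if __name__ == "__main__":
    frames = [0.8, 0.9, 0.7, 0.3, 0.05, 0.04, 0.6, 0.9, 0.8, 0.45, 0.4, 0.45, 0.7]
    cuts = find_sentence_boundaries(frames, 0.5, 0.1, max_frames=3)
    print(cuts, [len(p) for p in split_portions(frames, cuts)])
```

Requiring the second, lower threshold to be reached within the interval means a brief dip in volume (frames 9 to 11 in the example) is not treated as the end of a sentence.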

15. An electronic device, comprising:
- one or more processors;
  
  a memory; and
  
  one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for:
  
  while voice communication is established between the electronic device and a second electronic device:
  
  receiving a stream of audio data associated with the second electronic device;
  
  identifying, based on at least one sentence boundary, a plurality of portions of the stream of audio data;
  
  storing the plurality of portions of the stream of audio data;
  
  detecting a user input;
  
  in response to detecting the user input, generating a text representation of speech contained in a first portion of the plurality of portions of the stored audio data;
  
  determining whether the text representation contains information corresponding to one of a plurality of types of information;
  
  in response to determining that the text representation contains information corresponding to one of a plurality of types of information, determining whether the information is complete;
  
  in response to determining that the information is not complete:
  
  generating a text representation of speech contained in a second portion of the plurality of portions of the stored audio data; and
  
  obtaining second information from the second portion of the plurality of portions of the stored audio data;
  
  performing one or more tasks based on at least the information and the second information.
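The completeness step recited in the independent claims (and narrowed in claims 2 and 4) can be illustrated with a telephone number split across two sentences. The sketch below is a simplified assumption: completeness means "has at least 10 digits", the stored portions are passed oldest first, and `transcribe` is a hypothetical callable standing in for a speech recognizer. Unlike the claims, which recite only a second portion, the loop keeps walking back through older portions until the number is complete.

```python
import re


def extract_digits(text):
    """Collect all digits in a text representation, in order."""
    return "".join(re.findall(r"\d", text))


def is_complete(digits, required=10):
    """Assumed completeness rule: no parameter (e.g. area code) is missing."""
    return len(digits) >= required


def resolve_phone_number(portions, transcribe, required=10):
    """Return a complete digit string, or None.

    portions: stored audio portions, oldest first.
    transcribe: hypothetical callable mapping a portion to text.
    """
    if not portions:
        return None
    ordered = list(reversed(portions))  # most recent portion first
    digits = extract_digits(transcribe(ordered[0]))
    if not digits:
        return None  # first portion holds no phone-like information
    for older in ordered[1:]:
        if is_complete(digits, required):
            break
        # Older speech came earlier, so its digits are prepended.
        digits = extract_digits(transcribe(older)) + digits
    return digits if is_complete(digits, required) else None


if __name__ == "__main__":
    stored = ["my area code is 415", "and the number is 555 0123"]
    print(resolve_phone_number(stored, transcribe=lambda p: p))
```

Here the strings stand in for audio and the identity function stands in for transcription. Older portions are transcribed only when the most recent one turns out to be incomplete.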

16. A method, comprising:
- at an electronic device with one or more processors and memory:
  
  while voice communication is established between the electronic device and a second electronic device:
  
  receiving a stream of audio data associated with the second electronic device;
  
  identifying, based on at least one sentence boundary, a plurality of portions of the stream of audio data;
  
  storing the plurality of portions of the stream of audio data;
  
  detecting a user input;
  
  in response to detecting the user input, generating a text representation of speech contained in a first portion of the plurality of portions of the stored audio data;
  
  determining whether the text representation contains information corresponding to one of a plurality of types of information;
  
  in response to determining that the text representation contains information corresponding to one of a plurality of types of information, determining whether the information is complete;
  
  in response to determining that the information is not complete:
  
  generating a text representation of speech contained in a second portion of the plurality of portions of the stored audio data; and
  
  obtaining second information from the second portion of the plurality of portions of the stored audio data;
  
  performing one or more tasks based on at least the information and the second information.
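The storing and user-input steps shared by claims 1, 15 and 16 suggest that portions are buffered as they arrive and transcribed only on demand. A minimal sketch of that arrangement follows. The class `PortionStore`, its size limit and the `transcribe` callable are assumptions for illustration; the claims do not state a buffer size or a storage structure.

```python
from collections import deque


class PortionStore:
    """Keeps the most recent audio portions; transcription is deferred."""

    def __init__(self, max_portions=5):
        # A bounded deque silently drops the oldest portion when full.
        self._portions = deque(maxlen=max_portions)

    def add(self, portion):
        """Store one sentence-delimited portion of the incoming stream."""
        self._portions.append(portion)

    def __len__(self):
        return len(self._portions)

    def on_user_input(self, transcribe):
        """On user input, transcribe only the most recent stored portion."""
        if not self._portions:
            return None
        return transcribe(self._portions[-1])


if __name__ == "__main__":
    store = PortionStore(max_portions=2)
    for portion in ["first sentence", "second sentence", "third sentence"]:
        store.add(portion)
    print(len(store), store.on_user_input(str.upper))
```

Deferring transcription until the user asks for it keeps speech recognition off the path of the live call. Whether the patented system makes this trade-off for that reason is not stated in the claims.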

Current Assignee: Apple Inc.
Original Assignee: Apple Inc.
Inventors: Martel, Mathieu Jean; Deniau, Thomas
Primary Examiner(s): Lerner, Martin
Application Number: US16/249,301
Publication Number: US 20190220245A1
Time in Patent Office: 783 Days
Field of Search: 704231, 704251, 704254, 704270, 704275, 704253
CPC Class Codes:
G06F 16/3329   Natural language query form...
G06F 3/165   Management of the audio str...
G06F 3/167   Audio in a user interface, ...
G06F 40/279   Recognition of textual enti...
G06F 40/30   Semantic analysis
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 15/30   Distributed recognition, e....
