Audibly indicating secondary content with spoken text

US 9,865,250 B1
Filed: 09/29/2014
Issued: 01/09/2018
Est. Priority Date: 09/29/2014
Status: Active Grant

Abstract

A system and method for navigating secondary content. The system may monitor for gestures input to the system by an input device and may detect an arc gesture. The arc gesture may travel along both a horizontal axis and a vertical axis from a first point to a second point and may be delineated from a horizontal or a vertical motion. The system may identify secondary content corresponding to the arc gesture in response to the arc gesture and output data corresponding to the secondary content. The system may identify supplemental text associated with the secondary content and synthesize supplemental speech corresponding to the supplemental text. The output data may include audio including the synthesized supplemental speech.
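
The delineation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent's specification: the `classify_stroke` function, the `(x, y)` tuple representation, and the pixel threshold values are all assumptions chosen for illustration. It applies the endpoint and midpoint tests recited in the claims to separate an arc from a mostly horizontal or mostly vertical motion.

```python
from typing import List, Tuple

# Hypothetical representation: (x, y) touch samples in screen pixels.
Point = Tuple[float, float]


def classify_stroke(
    points: List[Point],
    h_threshold: float = 100.0,  # assumed value, not from the patent
    v_threshold: float = 40.0,   # assumed value, not from the patent
) -> str:
    """Label a stroke as 'arc', 'horizontal', 'vertical', or 'none'."""
    if len(points) < 3:
        return "none"

    (x0, y0), (x1, y1) = points[0], points[-1]
    _, y_mid = points[len(points) // 2]  # sample at the middle of the contact

    dx = abs(x1 - x0)        # horizontal travel from first point to second point
    dy_end = abs(y1 - y0)    # vertical travel from first point to second point
    bulge = abs(y_mid - y0)  # vertical offset of the midpoint from the first point

    # Arc: travels far enough horizontally AND the midpoint rises or dips enough.
    if dx > h_threshold and bulge > v_threshold:
        return "arc"
    if dx > h_threshold:
        return "horizontal"
    if dy_end > v_threshold:
        return "vertical"
    return "none"
```

Under these assumptions a straight diagonal stroke would also pass the midpoint test; a fuller implementation might compare the midpoint against the chord between the two endpoints instead.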

22 Claims

1. A computer-implemented method for navigating secondary content during a text-to-speech process, the method comprising:
- outputting first audio including an audio tone preceded by first synthesized speech and followed by second synthesized speech, the audio tone corresponding to an indicator of a first footnote located in a string of text, the first synthesized speech associated with a portion of the string of text prior to the indicator and the second synthesized speech associated with a portion of the string of text following the indicator;
  
  detecting first contact on a touch-screen of a computing device within a first period of time following output of the audio tone;
  
  determining that the first contact corresponds to a predefined first arc gesture, the first contact extending along both a horizontal axis and a vertical axis from a first point to a second point, a difference between a first horizontal coordinate associated with the first point and a second horizontal coordinate associated with the second point exceeding a horizontal threshold in a first direction relative to the first point, and a difference between a first vertical coordinate associated with the first point and a second vertical coordinate associated with a midpoint of the contact exceeding a vertical threshold;
  
  selecting the first footnote in response to the first arc gesture;
  
  identifying supplemental text associated with the first footnote; and
  
  outputting third synthesized speech corresponding to the supplemental text associated with the first footnote.
- - 2. The method of claim 1, further comprising:
    - detecting second contact on a touch-screen of a computing device within a second period of time following the first arc gesture;
      
      determining that the second contact corresponds to a predefined second arc gesture, the second contact extending along both a horizontal axis and a vertical axis from a third point to a fourth point, a difference between a third horizontal coordinate associated with the third point and a fourth horizontal coordinate associated with the fourth point exceeding a horizontal threshold in a second direction relative to the third point, and a difference between a third vertical coordinate associated with the third point and a fourth vertical coordinate associated with a midpoint of the contact exceeding a vertical threshold;
      
      identifying a second footnote in the string of text prior to the first footnote, in response to the second arc gesture;
      
      identifying supplemental text associated with the second footnote in response to the second arc gesture; and
      
      outputting fourth synthesized speech corresponding to the supplemental text associated with the second footnote in response to the second arc gesture.
  - 3. The method of claim 1, further comprising:
    - generating the third synthesized speech corresponding to the supplemental text, the third synthesized speech having different voice parameters than at least one of the first synthesized speech and the second synthesized speech.
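
The timing relationships in claims 1 and 2 can be sketched as a small state holder. This is a hedged illustration only: the `FootnoteNavigator` class, its method names, and the three-second window defaults are hypothetical, and timestamps are passed in by the caller rather than read from a clock. The first arc selects the footnote whose tone was just played, provided it arrives within the first period; a second arc in the opposite direction, arriving within the second period, steps back to the footnote before it.

```python
from typing import List, Optional


class FootnoteNavigator:
    """Tracks which footnote an arc gesture should select (illustrative only)."""

    def __init__(
        self,
        footnotes: List[str],             # supplemental text, in document order
        tone_window_s: float = 3.0,       # assumed "first period of time"
        followup_window_s: float = 3.0,   # assumed "second period of time"
    ) -> None:
        self.footnotes = list(footnotes)
        self.tone_window_s = tone_window_s
        self.followup_window_s = followup_window_s
        self._tone_index: Optional[int] = None
        self._tone_time: Optional[float] = None
        self._selected: Optional[int] = None
        self._gesture_time: Optional[float] = None

    def on_tone(self, index: int, now: float) -> None:
        """Record that the tone for footnote `index` was output at time `now`."""
        self._tone_index, self._tone_time = index, now

    def on_first_arc(self, now: float) -> Optional[str]:
        """First-direction arc: select the footnote whose tone just played."""
        if self._tone_time is None or now - self._tone_time > self.tone_window_s:
            return None  # too late, or no tone has been played yet
        self._selected = self._tone_index
        self._gesture_time = now
        return self.footnotes[self._selected]

    def on_second_arc(self, now: float) -> Optional[str]:
        """Second-direction arc: step back to the footnote before the selection."""
        if self._selected is None or self._gesture_time is None:
            return None
        if now - self._gesture_time > self.followup_window_s:
            return None
        if self._selected == 0:
            return None  # no earlier footnote in the string of text
        self._selected -= 1
        self._gesture_time = now
        return self.footnotes[self._selected]
```

The returned string stands in for the supplemental text that would then be passed to the speech synthesizer, possibly with different voice parameters as in claim 3.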

4. A computer-implemented method comprising:
- outputting first audio including an audio tone preceded by first synthesized speech and followed by second synthesized speech, the audio tone associated with an indicator of first secondary content located in a string of text and based on a type of the first secondary content, the first synthesized speech associated with a portion of the string of text prior to the indicator and the second synthesized speech associated with a portion of the string of text following the indicator;
  
  detecting first contact on a touch-screen of a computing device;
  
  determining that the first contact corresponds to a first arc gesture, the first contact extending along both a horizontal axis and a vertical axis from a first point to a second point, a difference between a first horizontal coordinate associated with the first point and a second horizontal coordinate associated with the second point exceeding a horizontal threshold in a first direction relative to the first point, and a difference between a first vertical coordinate associated with the first point and a second vertical coordinate associated with a midpoint of the first contact exceeding a vertical threshold;
  
  selecting the first secondary content that corresponds to the first arc gesture;
  
  identifying supplemental text associated with the first secondary content; and
  
  outputting second audio corresponding to the supplemental text in response to the first arc gesture.
- - 5. The computer-implemented method of claim 4, further comprising:
    - outputting third audio including a second audio tone preceded by third synthesized speech and followed by fourth synthesized speech, the second audio tone associated with a second indicator of second secondary content located in a second string of text and based on a type of the second secondary content, the third synthesized speech associated with a portion of the second string of text prior to the second indicator and the fourth synthesized speech associated with a portion of the second string of text following the second indicator;
      
      determining that a duration of time has elapsed without detecting the first arc gesture; and
      
      outputting fourth audio associated with a third string of text following the second string of text.
  - 6. The computer-implemented method of claim 4, wherein the identifying further comprises identifying a most recent secondary content item transmitted by a text-to-speech process.
  - 7. The computer-implemented method of claim 4, further comprising:
    - generating supplemental synthesized speech corresponding to the supplemental text, the supplemental synthesized speech having different voice parameters than at least one of the first synthesized speech and the second synthesized speech, wherein the second audio includes the supplemental synthesized speech.
  - 8. The computer-implemented method of claim 4, wherein the method further comprises:
    - detecting second contact on the touch-screen;
      
      determining that the second contact corresponds to a second arc gesture, the second contact extending along both a horizontal axis and a vertical axis from a third point to a fourth point, a difference between a third horizontal coordinate associated with the third point and a fourth horizontal coordinate associated with the fourth point exceeding a horizontal threshold, and a difference between a third vertical coordinate associated with the third point and a fourth vertical coordinate associated with a midpoint of the second input exceeding a vertical threshold.
  - 9. The computer-implemented method of claim 8, further comprising:
    - identifying second secondary content in response to the horizontal difference between the third point and the fourth point being in a second direction relative to the third point; and
      
      outputting third audio corresponding to the second secondary content in response to the second arc gesture.
  - 10. The computer-implemented method of claim 8, further comprising:
    - identifying third secondary content in response to the horizontal difference between the third point and the fourth point being in a third direction relative to the third point; and
      
      outputting third audio corresponding to the third secondary content in response to the second arc gesture.
  - 11. The computer-implemented method of claim 8, further comprising:
    - outputting third audio including third synthesized speech associated with a portion of the string of text following the second synthesized speech.
  - 12. The computer-implemented method of claim 5, wherein the secondary content includes at least one of a footnote, an endnote, a definition, a synonym, or a translation.
  - 13. The computer-implemented method of claim 5, further comprising:
    - determining a first direction between a first horizontal coordinate associated with the first point and a second horizontal coordinate associated with the second point, determining a second direction between a first vertical coordinate associated with the first point and a second vertical coordinate associated with a midpoint of the coordinates; and
      
      determining a configuration of the first arc gesture based on the first direction and the second direction.
  - 14. The computer-implemented method of claim 5, further comprising:
    - determining a predefined spiraling arc gesture based on coordinates of a second input, the coordinates extending in a circular manner from a first point through a second point to a third point near the first point and along both a horizontal axis and a vertical axis from the third point to a fourth point, a difference between a first horizontal coordinate associated with the third point and a second horizontal coordinate associated with the fourth point exceeding a horizontal threshold, and a difference between a first vertical coordinate associated with a vertical maximum of the second input and a second vertical coordinate associated with a vertical minimum of the second input exceeding a vertical threshold; and
      
      navigating from a current content item to a different content item in response to detecting the spiraling arc gesture.
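
Claim 4 ties the audio tone to the type of secondary content and places it between two stretches of synthesized speech. The following Python sketch shows one way such an output sequence could be planned. Everything named here is an assumption for illustration: the `SecondaryContent` class, the `build_playback_plan` function, the pre-split `items` list, and the tone frequencies, none of which come from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class SecondaryContent:
    kind: str               # e.g. "footnote", "endnote", "definition"
    supplemental_text: str  # text spoken only if the user asks for it


# Hypothetical mapping from content type to tone pitch in Hz.
TONE_HZ = {
    "footnote": 880,
    "endnote": 660,
    "definition": 523,
    "synonym": 440,
    "translation": 392,
}
DEFAULT_TONE_HZ = 1000  # fallback for types not listed above

PlanStep = Tuple[str, Union[str, int]]


def build_playback_plan(items: List[Union[str, SecondaryContent]]) -> List[PlanStep]:
    """Turn text runs and indicators into ordered 'speech' and 'tone' steps."""
    plan: List[PlanStep] = []
    for item in items:
        if isinstance(item, SecondaryContent):
            # The indicator itself is not read aloud; a type-specific tone marks it.
            plan.append(("tone", TONE_HZ.get(item.kind, DEFAULT_TONE_HZ)))
        elif item.strip():
            plan.append(("speech", item))
    return plan
```

With a footnote indicator between two text runs, the plan comes out as speech, tone, speech, which mirrors the "preceded by first synthesized speech and followed by second synthesized speech" ordering. If no arc gesture arrives before the assumed timeout, playback would simply continue with the next step, as in claim 5.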

15. A computing device comprising:
- one or more processors; and
  
  a memory including instructions operable to be executed by the one or more processors to perform a set of actions to configure the device to:
  
  output first audio including an audio tone preceded by first synthesized speech and followed by second synthesized speech, the audio tone associated with an indicator of first secondary content located in a string of text and based on a type of the first secondary content, the first synthesized speech associated with a portion of the string of text prior to the indicator and the second synthesized speech associated with a portion of the string of text following the indicator;
  
  detect first contact on a touch-screen of a computing device;
  
  determine that the first contact corresponds to a first arc gesture, the first contact extending along both a horizontal axis and a vertical axis from a first point to a second point, a difference between a first horizontal coordinate associated with the first point and a second horizontal coordinate associated with the second point exceeding a horizontal threshold in a first direction relative to the first point, and a difference between a first vertical coordinate associated with the first point and a second vertical coordinate associated with a midpoint of the first contact exceeding a vertical threshold;
  
  select the first secondary content that corresponds to the first arc gesture;
  
  identify supplemental text associated with the first secondary content; and
  
  output second audio corresponding to the supplemental text in response to the first arc gesture.
- - 16. The computing device of claim 15, wherein the instructions further configure the device to identify a most recent secondary content item transmitted by a text-to-speech process.
  - 17. The computing device of claim 15, wherein the instructions further configure the device to:
    - generate supplemental synthesized speech corresponding to the supplemental text, the supplemental synthesized speech having different voice parameters than at least one of the first synthesized speech and the second synthesized speech, wherein the second audio includes the supplemental synthesized speech.
  - 18. The computing device of claim 15, wherein the instructions further configure the system to:
    - detect second contact on the touch-screen;
      
      determine that the second contact corresponds to a second arc gesture, the second contact extending along both a horizontal axis and a vertical axis from a third point to a fourth point, a difference between a third horizontal coordinate associated with the third point and a fourth horizontal coordinate associated with the fourth point exceeding a horizontal threshold in a second direction relative to the third point, and a difference between a third vertical coordinate associated with the third point and a fourth vertical coordinate associated with a midpoint of the second input exceeding a vertical threshold.
  - 19. The computing device of claim 18, wherein the instructions further configure the device to:
    - identify second secondary content in response to the horizontal difference between the third point and the fourth point being in a second direction relative to the third point; and
      
      output third audio corresponding to the second secondary content in response to the second arc gesture.
  - 20. The computing device of claim 18, wherein the instructions further configure the device to:
    - identify third secondary content in response to the horizontal difference between the third point and the fourth point being in a third direction relative to the third point; and
      
      output third audio corresponding to the third secondary content in response to the second arc gesture.
  - 21. The computing device of claim 15, wherein the secondary content includes at least one of a footnote, an endnote, a definition, a synonym, or a translation.
  - 22. The computing device of claim 15, wherein the instructions further configure the device to:
    - determine a first direction between a first horizontal coordinate associated with the first point and a second horizontal coordinate associated with the second point, determine a second direction between a first vertical coordinate associated with the first point and a second vertical coordinate associated with a midpoint of the coordinates; and
      
      determine a configuration of the first arc gesture based on the first direction and the second direction.
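
Claims 13 and 22 derive a gesture "configuration" from two directions: the horizontal direction from the first point to the second point, and the vertical direction from the first point to the midpoint. A minimal Python sketch of that step follows. The `arc_configuration` function, the label strings, and the `ACTIONS` mapping are hypothetical, and the code assumes screen coordinates in which y grows downward and a contact that has already passed both thresholds.

```python
from typing import Tuple

Point = Tuple[float, float]  # hypothetical (x, y) in screen pixels, y grows downward


def arc_configuration(first: Point, midpoint: Point, second: Point) -> str:
    """Combine the horizontal and vertical directions into one label."""
    # First direction: horizontal, from the first point to the second point.
    horizontal = "right" if second[0] > first[0] else "left"
    # Second direction: vertical, from the first point to the midpoint.
    vertical = "up" if midpoint[1] < first[1] else "down"
    return f"{horizontal}-{vertical}"


# Hypothetical assignment of configurations to navigation actions.
ACTIONS = {
    "right-up": "select most recent secondary content",
    "left-up": "select previous secondary content",
    "right-down": "select next secondary content",
    "left-down": "resume main text",
}
```

For example, a contact starting at `(0, 100)`, peaking at `(75, 40)`, and ending at `(150, 100)` is labeled `right-up`. Which action each configuration triggers is a design choice; the mapping above is only one possibility.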

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Korn, Peter Alex
Primary Examiner(s): Vo, Huyen
Assistant Examiner(s): Chavez, Rodrigo
Application Number: US14/499,489
Time in Patent Office: 1,198 Days
Field of Search: 704/260, 704/235, 704/E15.001, 704/2, 704/202, 704/207, 704/209, 704/219, 704/231, 704/244, 704/257, 704/270.1, 704/275, 704/9, 386/249, 386/250, 386/282
US Class Current
CPC Class Codes

G06F 3/0488   using a touch-screen or dig...

G06F 3/04883   for inputting data by handw...

G06F 3/167   Audio in a user interface, ...

G10L 13/00   Speech synthesis; Text to s...

G10L 13/08   Text analysis or generation...
