Ebook interaction using speech recognition

US 9,548,052 B2
Filed: 12/17/2013
Issued: 01/17/2017
Est. Priority Date: 12/17/2013
Status: Active Grant

Abstract

A user device receives audio data of a user reading aloud a displayed portion of an ebook, and converts a portion of the audio data to spoken-text data. The user device determines one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook, and ranks the one or more similarity scores to determine a reading location in the text data that corresponds to the spoken-text data. The user device determines a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location. The user device performs an action based in part on the reading location and the pronunciation score.
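
The matching step in the abstract (compare the spoken-text data against the displayed text, rank the similarity scores, take the best match as the reading location) can be illustrated with a minimal sketch. The patent does not specify a similarity measure; the sliding word window, the use of `difflib.SequenceMatcher`, and the function names `similarity_scores` and `reading_location` are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity_scores(spoken_words, page_words):
    """Score every window of the displayed text against the spoken words.

    Returns a list of (start_index, score) pairs. The character-level
    ratio from difflib is an assumed stand-in for the unspecified
    similarity measure.
    """
    if not spoken_words:
        raise ValueError("spoken_words must not be empty")
    n = len(spoken_words)
    spoken = " ".join(spoken_words).lower()
    scores = []
    # Slide a window of n words across the displayed portion.
    for start in range(0, max(1, len(page_words) - n + 1)):
        window = " ".join(page_words[start:start + n]).lower()
        scores.append((start, SequenceMatcher(None, spoken, window).ratio()))
    return scores


def reading_location(spoken_words, page_words):
    """Rank the scores and return the index of the last matched word."""
    ranked = sorted(
        similarity_scores(spoken_words, page_words),
        key=lambda pair: pair[1],
        reverse=True,
    )
    best_start, _ = ranked[0]
    return best_start + len(spoken_words) - 1


page = "the quick brown fox jumps over the lazy dog".split()
location = reading_location(["brown", "fox", "jumps"], page)
print(page[location])
```

Here the reading location is taken to be the last word of the best-matching window, which is one plausible reading of "a reading location in the text data that corresponds to the spoken-text data".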

14 Claims

1. A computer-implemented method of interacting with an ebook using speech recognition, comprising:
- presenting, on a display of a user device, a displayed portion of an ebook;
  
  receiving, at the user device, audio data of a user reading aloud the displayed portion of an ebook;
  
  converting a portion of the audio data to spoken-text data;
  
  determining one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook;
  
  weighting the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized;
  
  ranking the one or more weighted similarity scores;
  
  determining a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores;
  
  determining a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location;
  
  performing one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
  
  responsive to the pronunciation score meeting a threshold value, selecting an emphasis type based on the pronunciation score, and emphasizing the text data at the reading location in accordance with the selected emphasis type; and
  
  outputting the performed one or more actions on the display of the user device.
- Dependent claims (2, 3, 4, 5):
  - 2. The computer-implemented method of claim 1, wherein performing one or more actions based in part on the reading location and the pronunciation score comprises:
    - emphasizing text data that is adjacent to the reading location, in a manner that is different from the emphasized text data that corresponds to the reading location.
  - 3. The computer-implemented method of claim 1, wherein performing one or more actions in part on the reading location and the pronunciation score comprises:
    - responsive, to the reading location being at the end of the displayed portion of the ebook, automatically turning the page of the ebook.
  - 4. The computer-implemented method of claim 1, wherein performing one or more actions based in part on the reading location and the pronunciation score comprises:
    - responsive to the reading location being at the end of a reading region within the displayed portion of the ebook, automatically scrolling the text data such that the reading location is positioned at the beginning of the reading region.
  - 5. The computer-implemented method of claim 1, further comprising:
    - weighting the similarity scores by one or more additional factors, and wherein the additional factors are selected from a group consisting of:
      
      a location on the displayed portion of the ebook, elapsed time since a page was turned or text data scrolled, or some combination thereof.
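
The weighting step of claim 1 (adjusting each similarity score using the location of the previously emphasized word and the time elapsed since it was emphasized) could look roughly like the following sketch. The claims do not give a formula. The assumed reading rate `words_per_s`, the `spread` constant, the inverse-distance weighting, and the names `weight_scores` and `pick_location` are all hypothetical choices made for illustration.

```python
def weight_scores(scores, prev_index, elapsed_s, words_per_s=2.5, spread=5.0):
    """Down-weight candidates far from where the reader is expected to be.

    scores      -- list of (word_index, similarity_score) pairs
    prev_index  -- index of the previously emphasized word
    elapsed_s   -- seconds since that word was emphasized
    words_per_s -- assumed reading rate (hypothetical default)
    spread      -- assumed distance scale in words (hypothetical default)
    """
    # Expected position: previous location advanced by the assumed rate.
    expected = prev_index + words_per_s * elapsed_s
    weighted = []
    for index, score in scores:
        distance = abs(index - expected)
        weighted.append((index, score / (1.0 + distance / spread)))
    return weighted


def pick_location(scores, prev_index, elapsed_s):
    """Rank the weighted scores and return the best candidate index."""
    ranked = sorted(
        weight_scores(scores, prev_index, elapsed_s),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[0][0]


# The same phrase appears at word 3 and word 40 with equal raw scores.
candidates = [(3, 0.9), (40, 0.9)]
print(pick_location(candidates, prev_index=1, elapsed_s=1.0))
print(pick_location(candidates, prev_index=38, elapsed_s=1.0))
```

The usage lines show why weighting matters: when a phrase repeats on the page, raw similarity alone cannot separate the candidates, while the previous location and elapsed time can. The additional factors of claim 5 (location on the displayed portion, time since a page turn or scroll) could be folded in as further multiplicative terms under the same assumptions.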

6. A non-transitory computer-readable storage medium storing executable computer program instructions for interacting with an ebook using speech recognition, the instructions executable to perform steps comprising:
- presenting, on a display of a user device, a displayed portion of an ebook;
  
  receiving, at the user device, audio data of a user reading aloud the displayed portion of an ebook;
  
  converting a portion of the audio data to spoken-text data;
  
  determining one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook;
  
  weighting the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized;
  
  ranking the one or more weighted similarity scores;
  
  determining a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores;
  
  determining a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location;
  
  performing one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
  
  responsive to the pronunciation score meeting a threshold value, selecting an emphasis type based on the pronunciation score, and emphasizing the text data at the reading location in accordance with the selected emphasis type; and
  
  outputting the performed one or more actions on the display of the user device.
- Dependent claims (7, 8, 9, 10):
  - 7. The computer-readable medium of claim 6, wherein performing one or more actions based in part on the reading location and the pronunciation score comprises:
    - emphasizing text data that is adjacent to the reading location, in a manner that is different from the emphasized text data that corresponds to the reading location.
  - 8. The computer-readable medium of claim 6, wherein performing one or more actions based in part on the reading location and the pronunciation score comprises:
    - responsive, to the reading location being at the end of the displayed portion of the ebook, automatically turning the page of the ebook.
  - 9. The computer-readable medium of claim 6, wherein performing one or more actions based in part on the reading location and the pronunciation score comprises:
    - responsive to the reading location being at the end of a reading region within the displayed portion of the ebook, automatically scrolling the text data such that the reading location is positioned at the beginning of the reading region.
  - 10. The computer-readable medium of claim 6, further comprising:
    - weighting the similarity scores by one or more additional factors, and wherein the additional factors are selected from a group consisting of:
      
      a location on the displayed portion of the ebook, elapsed time since a page was turned or text data scrolled, or some combination thereof.
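
The emphasis step recited in claims 1 and 6 (when the pronunciation score meets a threshold, select an emphasis type based on the score, then emphasize the text at the reading location) can be sketched as below. The threshold value, the score bands, the emphasis names, and the tag-style markup are assumptions; the claims do not define them. "Meeting" the threshold is taken here to mean greater than or equal to it.

```python
def select_emphasis(pronunciation_score, threshold=0.5):
    """Return an emphasis type, or None if the score misses the threshold.

    All numeric cut-offs and emphasis names are hypothetical.
    """
    if pronunciation_score < threshold:
        return None
    if pronunciation_score >= 0.85:
        return "bold"
    if pronunciation_score >= 0.7:
        return "highlight"
    return "underline"


def emphasize(words, reading_index, pronunciation_score):
    """Return a copy of words with the word at reading_index marked up."""
    emphasis = select_emphasis(pronunciation_score)
    marked = list(words)
    if emphasis is not None:
        word = words[reading_index]
        marked[reading_index] = f"<{emphasis}>{word}</{emphasis}>"
    return marked


print(emphasize(["the", "cat", "sat"], 1, 0.9))
print(emphasize(["the", "cat", "sat"], 1, 0.3))
```

The differently styled emphasis for adjacent text in claims 2 and 7 could be added by marking neighboring indices with a separate, weaker emphasis type.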

11. A system for interacting with an ebook using speech recognition, comprising:
- a display;
  
  a processor configured to execute modules; and
  
  a memory storing the modules, the modules comprising:
  
  an ebook reader configured to present on the display a displayed portion of an ebook;
  
  a speech-to-text module configured to:
  
  receive audio data of a user reading aloud the displayed portion of an ebook, and convert a portion of the audio data to spoken-text data;
  
  a correlation module configured to:
  
  determine one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook, and weight the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized; and
  
  a ranking module configured to:
  
  rank the one or more weighted similarity scores, and determine a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores; and
  
  a quality module configured to determine a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location; and
  
  an action module configured to:
  
  perform one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
  
  responsive to the pronunciation score meeting a threshold value,
  
  selecting an emphasis type based on the pronunciation score, and
  
  emphasizing the text data at the reading location in accordance with the selected emphasis type; and
  
  output the performed one or more actions on the display of the user device.
- Dependent claims (12, 13, 14):
  - 12. The system of claim 11, wherein the action module is configured to emphasize text data that is adjacent to the reading location, in a manner that is different from the emphasized text data that corresponds to the reading location.
  - 13. The system of claim 11, wherein the action module is configured to, responsive, to the reading location being at the end of the displayed portion of the ebook, automatically turn the page of the ebook.
  - 14. The system of claim 11, wherein the action module is configured to, responsive to the reading location being at the end of a reading region within the displayed portion of the ebook, automatically scroll the text data such that the reading location is positioned at the beginning of the reading region.
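
The navigation actions in claims 13 and 14 (turn the page when the reading location reaches the end of the displayed portion; scroll when it reaches the end of a reading region so that the reading location lands at the beginning of that region) can be sketched as a small decision function. The `View` structure, its word-index fields, and the choice to handle both behaviors in one function are assumptions made for illustration, not the patent's design.

```python
from dataclasses import dataclass


@dataclass
class View:
    """Hypothetical description of what is currently on screen.

    All fields are word indices into the ebook text.
    """
    first_word: int    # first word of the displayed portion
    last_word: int     # last word of the displayed portion
    region_start: int  # first word of the reading region
    region_end: int    # last word of the reading region


def navigation_action(view, reading_location):
    """Return (action, scroll_by_words) for the current reading location."""
    if reading_location >= view.last_word:
        # End of the displayed portion: turn the page.
        return ("turn_page", 0)
    if reading_location >= view.region_end:
        # End of the reading region: scroll so the reading location
        # moves to the beginning of the region.
        return ("scroll", reading_location - view.region_start)
    return ("none", 0)


view = View(first_word=0, last_word=99, region_start=20, region_end=60)
print(navigation_action(view, 30))
print(navigation_action(view, 60))
print(navigation_action(view, 99))
```

In practice a reader application would likely use either page turning or scrolling depending on its layout mode; combining them here only keeps the sketch short.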

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Grady, Julian Paul; King, Virgil Scott
Primary Examiner(s): PATEL, SHREYANS A
Application Number: US14/108,764
Publication Number: US 20150170648A1
Time in Patent Office: 1,127 Days
Field of Search: 704/235
US Class Current: 1/1
CPC Class Codes:
G06F 15/0291   for reading, e.g. e-books c...
G06F 3/0483   Interaction with page-struc...
G06F 3/04842   Selection of displayed obje...
G06F 3/0485   Scrolling or panning
G06F 3/167   Audio in a user interface, ...
G06F 40/10   Text processing natural lan...
G09G 2380/14   Electronic books and readers
G10L 15/22   Procedures used during a sp...
G10L 2015/223   Execution procedure of a sp...
