Ebook interaction using speech recognition
First Claim
1. A computer-implemented method of interacting with an ebook using speech recognition, comprising:
presenting, on a display of a user device, a displayed portion of an ebook;
receiving, at the user device, audio data of a user reading aloud the displayed portion of an ebook;
converting a portion of the audio data to spoken-text data;
determining one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook;
weighting the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized;
ranking the one or more weighted similarity scores;
determining a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores;
determining a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location;
performing one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
responsive to the pronunciation score meeting a threshold value, selecting an emphasis type based on the pronunciation score, and emphasizing the text data at the reading location in accordance with the selected emphasis type; and
outputting the performed one or more actions on the display of the user device.
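The weighting and ranking steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how similarity is measured or how location and elapsed time enter the weight. The sketch assumes `difflib.SequenceMatcher` as a stand-in similarity metric and an exponential decay for both distance and time; the function names and the `distance_scale` and `time_scale` parameters are hypothetical.

```python
import math
from difflib import SequenceMatcher


def similarity(spoken: str, candidate: str) -> float:
    """Similarity in [0, 1] between two strings (stand-in metric, an assumption)."""
    return SequenceMatcher(None, spoken.lower(), candidate.lower()).ratio()


def locate(spoken_words, page_words, prev_index, seconds_since_prev,
           distance_scale=5.0, time_scale=10.0):
    """Return the start index in page_words that best matches spoken_words.

    prev_index is the index of the previously emphasized word (-1 if none);
    seconds_since_prev is the time elapsed since it was emphasized.
    """
    n = len(spoken_words)
    spoken = " ".join(spoken_words)
    # Assumption: trust in the previous location fades exponentially with time.
    trust = math.exp(-seconds_since_prev / time_scale)
    scored = []
    for start in range(len(page_words) - n + 1):
        raw = similarity(spoken, " ".join(page_words[start:start + n]))
        # Candidates far from the word right after the previous one are penalized.
        distance = abs(start - (prev_index + 1))
        proximity = math.exp(-distance / distance_scale)
        # Blend: with full trust the weight is the proximity, with none it is 1.
        weight = trust * proximity + (1.0 - trust)
        scored.append((raw * weight, start))
    # Rank the weighted scores, best first, and take the top candidate.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1] if scored else None


page = "the cat sat on the mat and the cat ran".split()
near_end = locate(["the", "cat"], page, prev_index=6, seconds_since_prev=1.0)
near_start = locate(["the", "cat"], page, prev_index=-1, seconds_since_prev=1.0)
```

In this sketch the phrase "the cat" appears twice on the page, and the weight is what separates the two otherwise identical matches: the candidate closest to the previous reading location wins while that location is still recent.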
Abstract
A user device receives audio data of a user reading aloud a displayed portion of an ebook, and converts a portion of the audio data to spoken-text data. The user device determines one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook, and ranks the one or more similarity scores to determine a reading location in the text data that corresponds to the spoken-text data. The user device determines a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location. The user device performs an action based in part on the reading location and the pronunciation score.
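As a rough illustration of the pronunciation score and the action that depends on it, the sketch below compares a recognized phoneme sequence with a reference sequence and maps the result to an emphasis type. The source does not specify the scoring method, the threshold values, or the available emphasis types; the phoneme representation, the `threshold` and `strong` values, and the labels `"bold-highlight"` and `"plain-highlight"` are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def pronunciation_score(spoken_phonemes, reference_phonemes):
    """Agreement in [0, 1] between two phoneme sequences (assumed metric)."""
    return SequenceMatcher(None, spoken_phonemes, reference_phonemes).ratio()


def select_emphasis(score, threshold=0.5, strong=0.9):
    """Return an emphasis label, or None when the score misses the threshold.

    The threshold values and label names are hypothetical.
    """
    if score < threshold:
        return None
    return "bold-highlight" if score >= strong else "plain-highlight"


reference = ["K", "AE", "T"]  # reference pronunciation for "cat"
exact = select_emphasis(pronunciation_score(["K", "AE", "T"], reference))
close = select_emphasis(pronunciation_score(["K", "AH", "T"], reference))
wrong = select_emphasis(pronunciation_score(["D", "AO", "G"], reference))
```

Here a score below the threshold yields no emphasis at all, while scores above it choose between two emphasis types, which mirrors the "meeting a threshold value, selecting an emphasis type" wording of the claims.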
14 Citations
14 Claims
1. A computer-implemented method of interacting with an ebook using speech recognition, comprising:
presenting, on a display of a user device, a displayed portion of an ebook;
receiving, at the user device, audio data of a user reading aloud the displayed portion of an ebook;
converting a portion of the audio data to spoken-text data;
determining one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook;
weighting the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized;
ranking the one or more weighted similarity scores;
determining a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores;
determining a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location;
performing one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
responsive to the pronunciation score meeting a threshold value, selecting an emphasis type based on the pronunciation score, and emphasizing the text data at the reading location in accordance with the selected emphasis type; and
outputting the performed one or more actions on the display of the user device.
Dependent claims: 2, 3, 4, 5
6. A non-transitory computer-readable storage medium storing executable computer program instructions for interacting with an ebook using speech recognition, the instructions executable to perform steps comprising:
presenting, on a display of a user device, a displayed portion of an ebook;
receiving, at the user device, audio data of a user reading aloud the displayed portion of an ebook;
converting a portion of the audio data to spoken-text data;
determining one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook;
weighting the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized;
ranking the one or more weighted similarity scores;
determining a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores;
determining a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location;
performing one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
responsive to the pronunciation score meeting a threshold value, selecting an emphasis type based on the pronunciation score, and emphasizing the text data at the reading location in accordance with the selected emphasis type; and
outputting the performed one or more actions on the display of the user device.
Dependent claims: 7, 8, 9, 10
11. A system for interacting with an ebook using speech recognition, comprising:
a display;
a processor configured to execute modules; and
a memory storing the modules, the modules comprising:
an ebook reader configured to present on the display a displayed portion of an ebook;
a speech-to-text module configured to:
receive audio data of a user reading aloud the displayed portion of an ebook, and
convert a portion of the audio data to spoken-text data;
a correlation module configured to:
determine one or more similarity scores based on a comparison between the spoken-text data and text data associated with the displayed portion of the ebook, and
weight the one or more similarity scores based in part on a location of a previously emphasized word that corresponds to a previous reading location of the user in the text data and a time elapsed since the previously emphasized word was emphasized; and
a ranking module configured to:
rank the one or more weighted similarity scores, and
determine a reading location in the text data that corresponds to the spoken-text data based on the ranking of the one or more weighted similarity scores; and
a quality module configured to determine a pronunciation score using the spoken-text data and pronunciation data associated with the text data at the reading location; and
an action module configured to:
perform one or more actions based in part on the reading location and the pronunciation score, the one or more actions including:
responsive to the pronunciation score meeting a threshold value,
selecting an emphasis type based on the pronunciation score, and
emphasizing the text data at the reading location in accordance with the selected emphasis type; and
output the performed one or more actions on the display of the user device.
Dependent claims: 12, 13, 14
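The module arrangement recited in claim 11 can be pictured as a small pipeline in which each module hands its result to the next. The sketch below only shows that wiring; the class name `ReaderPipeline`, the callable signatures, and the stub behaviours in the usage lines are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Scores = List[Tuple[float, int]]  # (weighted similarity score, word index)


@dataclass
class ReaderPipeline:
    """Hypothetical wiring of the modules named in claim 11."""
    speech_to_text: Callable[[bytes], str]          # speech-to-text module
    correlate: Callable[[str, List[str]], Scores]   # correlation module
    rank: Callable[[Scores], Optional[int]]         # ranking module
    quality: Callable[[str, int], float]            # quality module
    act: Callable[[int, float], Optional[str]]      # action module

    def handle_audio(self, audio: bytes, page_words: List[str]) -> Optional[str]:
        spoken = self.speech_to_text(audio)
        scores = self.correlate(spoken, page_words)
        location = self.rank(scores)
        if location is None:
            return None  # nothing on the page matched
        return self.act(location, self.quality(spoken, location))


# Usage with trivial stand-in modules (all stubs are assumptions).
pipeline = ReaderPipeline(
    speech_to_text=lambda audio: audio.decode(),
    correlate=lambda spoken, words: [
        (1.0 if word == spoken else 0.0, index) for index, word in enumerate(words)
    ],
    rank=lambda scores: max(scores)[1] if scores else None,
    quality=lambda spoken, location: 1.0,
    act=lambda location, score: f"emphasize:{location}" if score >= 0.5 else None,
)
result = pipeline.handle_audio(b"cat", ["the", "cat", "sat"])
```

Passing the modules in as callables keeps each one replaceable, which matches the claim's description of separate modules stored in memory and executed by the processor.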
Specification