Interactive eReader interface generation based on synchronization of textual and audial descriptors
First Claim
1. A system for transforming eBooks and providing an improved eReader interface, comprising:
a processor coupled with memory and at least one database, wherein the processor is configured to:
convert a digital book into image files and extract words, characters, and punctuation marks from the image files;
generate textual descriptors, including at least a page number, a word or character length, and a language, for each of the extracted words, characters, and punctuation marks;
store the extracted words, characters, and punctuation marks with the textual descriptors associated with the extracted words, characters, and punctuation marks in the database;
retrieve an audio file for the corresponding digital book and identify timestamps of the audio file that correspond to specific words or characters;
in accordance with the identified timestamps, apply keyframes at a beginning and an end of each word and segment the audio file into audio segments based on the keyframes;
generate audial descriptors, including at least the keyframes, a corresponding word, audial runtime of the corresponding word, and a file size, for each audio segment;
store the audio segments with their associated audial descriptors in the database;
use a synchronization engine to pair the extracted words or characters with the audio segments, wherein pairing the extracted words or characters with the audio segments includes:
matching a sequence of the extracted words or characters with the associated textual descriptors stored in the database with a sequence of the audial descriptors for the audio segments stored in the database;
aggregating a sequence of the matched textual descriptors and audial descriptors into synchronization data; and
inserting the extracted punctuation marks into the aggregated sequence to be part of the synchronization data and outputting the synchronization data to a HyperText Markup Language (HTML) Generator;
use the HTML Generator to transform the output of the synchronization engine into eReader-displayable content by embedding the output into tags;
wherein embedding the output into tags includes outputting electronic markup, stylesheet, and/or semi-structured data for the extracted words, characters, punctuation marks, and the corresponding audio segments based on the synchronized textual descriptors and audial descriptors from the output of the synchronization engine; and
a graphical user interface (GUI), wherein the GUI is configured to:
display the electronic markup, stylesheet, and/or semi-structured data on a human-machine interface (HMI);
highlight each of the words or the characters for a time based on the audial descriptors of the corresponding audio segments, including the keyframes at the beginning and the end of each word;
adjust playback speed of an audio file based on a selection input; and
receive a word or character selection input, modify a word or character highlight based on the word or character selection input, and initiate playback of the corresponding audio segments according to the word selection input and the synchronized textual descriptors and audial descriptors for the word or character from the output of the synchronization engine.
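As an illustration of the claimed descriptor-generation step, the sketch below splits OCR'd page text into word and punctuation tokens and attaches a textual descriptor (page number, length, language) to each. The class name, field names beyond those recited in the claim (such as `is_punct`), and the tokenizing regular expression are assumptions, not part of the claim:

```python
import re
from dataclasses import dataclass

@dataclass
class TextualDescriptor:
    token: str        # extracted word, character, or punctuation mark
    page: int         # page number the token was extracted from
    length: int       # word or character length
    language: str     # language of the token
    is_punct: bool    # True for punctuation marks (assumed helper field)

def extract_descriptors(page_text: str, page: int, language: str = "en"):
    """Split a page of OCR text into word and punctuation tokens and
    attach a textual descriptor to each token."""
    descriptors = []
    # \w+ captures words; [^\w\s] captures each punctuation mark separately
    for token in re.findall(r"\w+|[^\w\s]", page_text):
        descriptors.append(TextualDescriptor(
            token=token,
            page=page,
            length=len(token),
            language=language,
            is_punct=not token[0].isalnum() and token[0] != "_",
        ))
    return descriptors
```

For example, `extract_descriptors("Hello, world!", page=3)` yields four descriptors: the words `Hello` and `world` plus the punctuation marks `,` and `!`, each tagged with page 3.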
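The keyframing and segmentation step might be sketched as follows: given per-word timestamps identified in the audio file, keyframes are applied at each word's start and end, and the audial descriptors (keyframes, word, runtime, file size) are derived per segment. The byte-rate constant is an assumed encoding (16-bit mono PCM at 16 kHz), used only to illustrate the file-size descriptor:

```python
from dataclasses import dataclass

@dataclass
class AudioSegment:
    word: str         # the corresponding word
    start: float      # keyframe at the beginning of the word (seconds)
    end: float        # keyframe at the end of the word (seconds)
    runtime: float    # audial runtime of the word (seconds)
    file_size: int    # segment size in bytes (derived from assumed encoding)

BYTES_PER_SECOND = 32_000  # assumption: 16-bit mono PCM at 16 kHz

def segment_audio(word_timestamps):
    """Apply keyframes at each word's start/end timestamp and derive
    per-segment audial descriptors."""
    segments = []
    for word, start, end in word_timestamps:
        runtime = end - start
        segments.append(AudioSegment(
            word=word, start=start, end=end,
            runtime=round(runtime, 3),
            file_size=int(runtime * BYTES_PER_SECOND),
        ))
    return segments
```

In practice the timestamps would come from a forced-alignment step over the retrieved audio file; here they are passed in directly to keep the sketch self-contained.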
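The synchronization engine's pairing step, as recited, matches the sequence of extracted words against the sequence of audio segments and then re-inserts punctuation marks into the aggregated synchronization data. A minimal sketch, assuming dictionary-shaped descriptors with illustrative keys (`token`, `is_punct`, `word`):

```python
def synchronize(textual, audial):
    """Pair word tokens with audio segments in sequence order, then
    re-insert punctuation marks into the synchronization data."""
    sync_data = []
    audio_iter = iter(audial)
    for t in textual:
        if t["is_punct"]:
            # punctuation has no audio segment; it is carried in the
            # aggregated sequence with an empty audio slot
            sync_data.append({"token": t["token"], "audio": None})
        else:
            a = next(audio_iter)
            if a["word"].lower() != t["token"].lower():
                raise ValueError(f"sequence mismatch: {t['token']!r} vs {a['word']!r}")
            sync_data.append({"token": t["token"], "audio": a})
    return sync_data
```

The case-insensitive comparison and the mismatch exception are assumptions; the claim only requires matching the two stored sequences and aggregating the result.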
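The HTML Generator step could embed the synchronization data into tags as sketched below, with each word wrapped in a `<span>` whose `data-*` attributes carry the keyframes the GUI needs for timed highlighting. The class names, `id` scheme, and attribute names are assumptions for illustration:

```python
from html import escape

def to_html(sync_data):
    """Embed synchronized tokens into <span> tags whose data attributes
    carry the keyframes used for timed highlighting."""
    spans = []
    for i, entry in enumerate(sync_data):
        if entry["audio"] is None:
            # punctuation marks are rendered without timing attributes
            spans.append(f'<span class="punct">{escape(entry["token"])}</span>')
        else:
            a = entry["audio"]
            spans.append(
                f'<span class="word" id="w{i}" '
                f'data-start="{a["start"]}" data-end="{a["end"]}">'
                f'{escape(entry["token"])}</span>'
            )
    return " ".join(spans)
```

A stylesheet rule such as `.word.active { background: yellow; }` could then drive the claimed highlighting by toggling a class between each word's `data-start` and `data-end` keyframes.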
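Finally, the claimed highlight timing and playback-speed adjustment can be sketched together: each word's highlight onset and duration come from its keyframes, scaled inversely by the selected playback speed. The schedule format is an assumption; only the keyframe-based timing and speed scaling are recited:

```python
def highlight_schedule(sync_data, speed=1.0):
    """Compute when and for how long each word should be highlighted,
    scaling the audial keyframes by the selected playback speed."""
    schedule = []
    for entry in sync_data:
        a = entry["audio"]
        if a is None:
            continue  # punctuation marks are never highlighted
        schedule.append({
            "token": entry["token"],
            "at": a["start"] / speed,          # highlight onset (seconds)
            "for": (a["end"] - a["start"]) / speed,  # highlight duration
        })
    return schedule
```

At `speed=2.0`, a word spanning keyframes 1.0 s to 1.5 s is highlighted at 0.5 s for 0.25 s, keeping the highlight aligned with the sped-up audio segment.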
Abstract
The present invention is directed to systems and methods for providing an improved interactive and educational eBook platform through an improved eReader. The system provides a platform through which a book is transformed into an interactive, multi-language, assisted reading, read-aloud eBook and is displayed in an eReader with an improved graphical user interface that provides features which enhance the effectiveness of eBook learning.
Specification