
Interactive eReader interface generation based on synchronization of textual and audial descriptors

  • US 10,671,251 B2
  • Filed: 12/22/2017
  • Issued: 06/02/2020
  • Est. Priority Date: 12/22/2017
  • Status: Active Grant
First Claim

1. A system for transforming eBooks and providing an improved eReader interface, comprising:

  • a processor coupled with memory and at least one database, wherein the processor is configured to:

    convert a digital book into image files and extract words, characters, and punctuation marks from the image files;

    generate textual descriptors, including at least a page number, a word or character length, and a language, for each of the extracted words, characters, and punctuation marks;

    store the extracted words, characters, and punctuation marks with the textual descriptors associated with the extracted words, characters, and punctuation marks in the database;

    retrieve an audio file for the corresponding digital book and identify timestamps of the audio file that correspond to specific words or characters;

    in accordance with the identified timestamps, apply keyframes at a beginning and an end of each word and segment the audio file into audio segments based on the keyframes;

    generate audial descriptors, including at least the keyframes, a corresponding word, audial runtime of the corresponding word, and a file size, for each audio segment;

    store the audio segments with their associated audial descriptors in the database;

    use a synchronization engine to pair the extracted words or characters with the audio segments, wherein pairing the extracted words or characters with the audio segments includes:

    matching a sequence of the extracted words or characters with the associated textual descriptors stored in the database with a sequence of the audial descriptors for the audio segments stored in the database;

    aggregating a sequence of the matched textual descriptors and audial descriptors into synchronization data; and

    inserting the extracted punctuation marks into the aggregated sequence to be part of the synchronization data and outputting the synchronization data to a HyperText Markup Language (HTML) Generator;

    use the HTML Generator to transform the output of the synchronization engine into eReader-displayable content by embedding the output into tags;

    wherein embedding the output into tags includes outputting electronic markup, stylesheet, and/or semi-structured data for the extracted words, characters, punctuation marks, and the corresponding audio segments based on the synchronized textual descriptors and audial descriptors from the output of the synchronization engine; and

  • a graphical user interface (GUI), wherein the GUI is configured to:

    display the electronic markup, stylesheet, and/or semi-structured data on a human-machine interface (HMI);

    highlight each of the words or the characters for a time based on the audial descriptors of the corresponding audio segments, including the keyframes at the beginning and the end of each word;

    adjust playback speed of an audio file based on a selection input; and

    receive a word or character selection input, modify a word or character highlight based on the word or character selection input, and initiate playback of the corresponding audio segments according to the word selection input and the synchronized textual descriptors and audial descriptors for the word or character from the output of the synchronization engine.
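
To make the claimed pipeline easier to follow, the sketches below walk through one plausible implementation of each stage in TypeScript. Every interface, field, and function name in them is an illustrative assumption, not language from the patent. This first sketch covers the textual-descriptor stage: each word, character, or punctuation mark extracted from the page images receives a descriptor recording at least its page number, its word or character length, and its language.

```typescript
// Hypothetical shapes; names are illustrative, not the patent's schema.
interface ExtractedToken {
  text: string;          // word, character, or punctuation mark from OCR
  page: number;          // page image the token was read from
  isPunctuation: boolean;
}

interface TextualDescriptor {
  token: string;
  pageNumber: number;
  length: number;        // word or character length
  language: string;      // e.g. "en"
  isPunctuation: boolean;
}

// Build one textual descriptor per extracted token, ready to be stored
// alongside the token in the database.
function buildTextualDescriptors(
  tokens: ExtractedToken[],
  language: string,
): TextualDescriptor[] {
  return tokens.map((t) => ({
    token: t.text,
    pageNumber: t.page,
    length: t.text.length,
    language,
    isPunctuation: t.isPunctuation,
  }));
}

// Example input, as it might come back from an OCR pass over the page images.
console.log(
  buildTextualDescriptors(
    [
      { text: "Call", page: 1, isPunctuation: false },
      { text: "me", page: 1, isPunctuation: false },
      { text: "Ishmael", page: 1, isPunctuation: false },
      { text: ".", page: 1, isPunctuation: true },
    ],
    "en",
  ),
);
```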

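The audio side can be sketched the same way: given word-level timestamps identified in the narration file, keyframes are applied at the beginning and end of each word, and each resulting segment gets an audial descriptor carrying its keyframes, the corresponding word, its runtime, and a file size. The bitrate constant below is an assumption used only to make the size computable.

```typescript
// Illustrative only; field names are assumptions, not the patent's schema.
interface WordTimestamp {
  word: string;
  startSec: number;   // timestamp where the spoken word begins
  endSec: number;     // timestamp where the spoken word ends
}

interface AudialDescriptor {
  keyframeStart: number;  // keyframe at the beginning of the word
  keyframeEnd: number;    // keyframe at the end of the word
  word: string;
  runtimeSec: number;     // audial runtime of the word
  fileSizeBytes: number;  // size of the cut audio segment
}

// Apply keyframes at each word boundary and describe the resulting segment.
// `bytesPerSecond` stands in for the bitrate of the narration file.
function segmentAudio(
  timestamps: WordTimestamp[],
  bytesPerSecond: number,
): AudialDescriptor[] {
  return timestamps.map((t) => {
    const runtimeSec = t.endSec - t.startSec;
    return {
      keyframeStart: t.startSec,
      keyframeEnd: t.endSec,
      word: t.word,
      runtimeSec,
      fileSizeBytes: Math.round(runtimeSec * bytesPerSecond),
    };
  });
}

console.log(
  segmentAudio(
    [
      { word: "Call", startSec: 0.0, endSec: 0.32 },
      { word: "me", startSec: 0.32, endSec: 0.5 },
      { word: "Ishmael", startSec: 0.5, endSec: 1.1 },
    ],
    16_000, // roughly 128 kbps
  ),
);
```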
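
The synchronization engine then reduces to a sequence match: walk the stored textual descriptors in order, pair each word or character with the next audial descriptor, and carry punctuation marks into the aggregated synchronization data without a paired segment. The divergence check and the case-insensitive comparison are assumptions, not requirements of the claim.

```typescript
// Minimal stand-ins for the stored descriptors; not the patent's schema.
interface TextualDescriptor { token: string; pageNumber: number; isPunctuation: boolean; }
interface AudialDescriptor { word: string; keyframeStart: number; keyframeEnd: number; }

interface SyncEntry {
  token: string;
  pageNumber: number;
  keyframeStart?: number;  // punctuation has no paired audio segment
  keyframeEnd?: number;
}

// Pair the word sequence with the audio-segment sequence, then re-insert
// punctuation marks into the aggregated synchronization data.
function synchronize(
  textual: TextualDescriptor[],
  audial: AudialDescriptor[],
): SyncEntry[] {
  const result: SyncEntry[] = [];
  let a = 0;
  for (const t of textual) {
    if (t.isPunctuation) {
      // Punctuation is carried through without a paired segment.
      result.push({ token: t.token, pageNumber: t.pageNumber });
      continue;
    }
    const seg = audial[a++];
    if (!seg || seg.word.toLowerCase() !== t.token.toLowerCase()) {
      throw new Error(`Sequences diverged at "${t.token}"`);
    }
    result.push({
      token: t.token,
      pageNumber: t.pageNumber,
      keyframeStart: seg.keyframeStart,
      keyframeEnd: seg.keyframeEnd,
    });
  }
  return result;
}
```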
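
One way an HTML Generator could embed the synchronization engine's output into tags is to wrap each synchronized token in a span whose data attributes carry the segment's keyframes and source file, so the eReader front end can drive highlighting and playback. The class names, attribute names, and markup shape here are assumptions.

```typescript
interface SyncEntry {
  token: string;
  keyframeStart?: number;
  keyframeEnd?: number;
}

// Wrap each synchronized token in a tag carrying its audio timing.
function toHtml(entries: SyncEntry[], audioSrc: string): string {
  const spans = entries.map((e, i) => {
    if (e.keyframeStart === undefined) {
      // Punctuation: rendered, but not highlightable or clickable.
      return `<span class="punct">${e.token}</span>`;
    }
    return (
      `<span class="word" id="w${i}" data-src="${audioSrc}" ` +
      `data-start="${e.keyframeStart}" data-end="${e.keyframeEnd}">` +
      `${e.token}</span>`
    );
  });
  return `<p class="ereader-page">${spans.join(" ")}</p>`;
}

console.log(
  toHtml(
    [
      { token: "Call", keyframeStart: 0.0, keyframeEnd: 0.32 },
      { token: "me", keyframeStart: 0.32, keyframeEnd: 0.5 },
      { token: "." },
    ],
    "chapter1.mp3",
  ),
);
```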
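
On the GUI side, the highlight duration for each word can be derived from the audial runtime between its keyframes, divided by the current playback rate so that a faster playback-speed selection shortens the highlight. This is a browser-oriented sketch; the element shape and class name are assumptions.

```typescript
// How long a word stays highlighted: the segment's runtime divided by the
// current playback rate (faster narration means a shorter highlight).
function highlightDurationMs(
  keyframeStart: number,
  keyframeEnd: number,
  playbackRate: number,
): number {
  return ((keyframeEnd - keyframeStart) / playbackRate) * 1000;
}

// Walk the word spans in reading order, highlighting each for its scaled runtime.
async function highlightSequence(
  words: { el: HTMLElement; start: number; end: number }[],
  playbackRate: number,
): Promise<void> {
  for (const w of words) {
    w.el.classList.add("highlight");
    await new Promise<void>((resolve) =>
      setTimeout(resolve, highlightDurationMs(w.start, w.end, playbackRate)),
    );
    w.el.classList.remove("highlight");
  }
}
```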
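
Finally, a word or character selection input can move the highlight and play back only that token's audio segment by seeking to its start keyframe and pausing at its end keyframe. The sketch uses the standard HTMLAudioElement API and reads the data-start/data-end attributes written by the hypothetical generator above.

```typescript
// On a word tap, move the highlight to that word and play just its segment.
function attachWordPlayback(container: HTMLElement, audio: HTMLAudioElement): void {
  container.addEventListener("click", (ev) => {
    const span = (ev.target as HTMLElement).closest("span.word") as HTMLElement | null;
    if (!span) return;

    // Modify the highlight based on the selection input.
    container.querySelectorAll(".highlight").forEach((el) => el.classList.remove("highlight"));
    span.classList.add("highlight");

    // Seek to the segment's keyframes and play only that word.
    const start = Number(span.dataset.start);
    const end = Number(span.dataset.end);
    audio.currentTime = start;
    void audio.play();
    const stopAtEnd = () => {
      if (audio.currentTime >= end) {
        audio.pause();
        audio.removeEventListener("timeupdate", stopAtEnd);
      }
    };
    audio.addEventListener("timeupdate", stopAtEnd);
  });
}
```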