Aligning body matter across content formats
Abstract
A content alignment service is described that may generate content synchronization information to facilitate the synchronous presentation of corresponding audio content and textual content. In some embodiments, portions of body text (as opposed to front matter, such as a table of contents; or back matter, such as an index) in the textual content are identified and synchronized with corresponding audio content. In one example application, an audiobook may be synchronized with an electronic book. As the body text portions of the electronic book are consumed, corresponding words of the audiobook may be audibly presented.
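The synchronous presentation described in the abstract could be driven by a lookup along the following lines. This is a minimal sketch rather than the patented implementation: the `SyncEntry` record, its field names, and the `paragraph_at` helper are hypothetical, and the entries are assumed to be sorted by audio start time and non-overlapping.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional


class SyncEntry(NamedTuple):
    """Hypothetical record tying one body-text paragraph to an audio range."""
    paragraph: int      # index of the paragraph in the electronic book
    audio_start: float  # seconds into the audiobook
    audio_end: float    # seconds into the audiobook (exclusive)


def paragraph_at(entries: list[SyncEntry], position: float) -> Optional[int]:
    """Return the body-text paragraph narrated at `position`, or None.

    Assumes `entries` is sorted by `audio_start` and ranges do not overlap.
    """
    starts = [e.audio_start for e in entries]
    i = bisect_right(starts, position) - 1
    if i >= 0 and position < entries[i].audio_end:
        return entries[i].paragraph
    return None


# Paragraphs 0-2 (say, a table of contents) have no entry because front
# matter is not narrated; only body text appears in the sync information.
sync = [SyncEntry(3, 0.0, 12.5), SyncEntry(4, 12.5, 30.0)]
current = paragraph_at(sync, 5.0)  # paragraph to highlight at 5 seconds
```

A reader application could call `paragraph_at` on each playback tick to decide which paragraph to highlight, and skip highlighting when it returns `None`.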
21 Claims
1. A system for aligning content, the system comprising:
an electronic data store configured to store:
an electronic book comprising:
a plurality of paragraphs of body text, and
matter other than body text, wherein the matter other than body text comprises text within at least front matter and back matter; and
an audiobook that is a companion to the electronic book; and
a physical computing device in communication with the electronic data store, the physical computing device configured to:
generate a textual transcription of the audiobook by applying a speech-to-text recognition routine on the audiobook;
identify a portion of the textual transcription that includes text also included in a paragraph of the electronic book;
determine a level of correlation between words in the paragraph of the electronic book and words in the portion of the textual transcription;
determine that the level of correlation satisfies a threshold value;
in response to determining that the level of correlation satisfies the threshold value, identify the paragraph of the electronic book as body text;
identify a first portion of the electronic book that does not satisfy the threshold value with respect to the textual transcription;
determine that the first portion of the electronic book that does not satisfy the threshold value is front matter based at least in part on a determination that the first portion of the electronic book that does not satisfy the threshold value appears within the electronic book prior to an earliest portion of the electronic book for which a corresponding portion of the audiobook is identified;
identify a second portion of the electronic book that does not satisfy the threshold value with respect to the textual transcription;
determine that the second portion of the electronic book that does not satisfy the threshold value is back matter based at least in part on a determination that the second portion of the electronic book that does not satisfy the threshold value appears within the electronic book after a last portion of the electronic book for which a corresponding portion of the audiobook is identified; and
generate content synchronization information that identifies (a) portions of the audiobook that correspond to the paragraphs of the body text and (b) further identifies the matter other than body text in the electronic book, wherein the content synchronization information indicates that the matter other than body text in the electronic book, including the first portion and second portion of the electronic book, does not correspond to any portion of the audiobook, wherein the content synchronization information indicates that the paragraph, excluding the matter other than body text, should be presented in synchronization with a portion of the audiobook from which the corresponding portion of the textual transcription was generated.
Dependent claims: 2, 3
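The positional rule in claim 1 (unmatched portions before the earliest matched portion are front matter, unmatched portions after the last matched portion are back matter) can be illustrated with a short sketch. The function name `classify_portions`, the label strings, and the boolean input are assumptions made for illustration; the claim does not prescribe a data representation, and it does not name a label for unmatched portions that fall between matched ones.

```python
def classify_portions(matched: list[bool]) -> list[str]:
    """Label each portion of the book, in reading order.

    `matched[i]` is True when portion i met the correlation threshold
    against some portion of the transcription (hypothetical input).
    """
    if True not in matched:
        # No portion has corresponding audio, so there is no position
        # to anchor "before" or "after" against.
        return ["other non-body matter"] * len(matched)

    first = matched.index(True)
    last = len(matched) - 1 - matched[::-1].index(True)

    labels = []
    for i, is_match in enumerate(matched):
        if is_match:
            labels.append("body")
        elif i < first:
            labels.append("front matter")
        elif i > last:
            labels.append("back matter")
        else:
            # Unmatched but surrounded by body text, e.g. a footnote.
            labels.append("other non-body matter")
    return labels


labels = classify_portions([False, False, True, False, True, False])
```

The positional test is only part of the determination ("based at least in part on"), so a fuller system might combine it with other signals.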
4. A computer-implemented method for aligning content, the computer-implemented method comprising:
as implemented by one or more computing devices configured with specific computer-executable instructions,
obtaining a textual transcription of an item of content comprising audio content;
identifying a portion of the textual transcription that includes text also included in a portion of a companion item of textual content, wherein the textual content includes body text and matter other than body text;
determining a level of correlation between words in the portion of the companion item of textual content and words in the portion of the textual transcription;
determining that the level of correlation satisfies a threshold value;
in response to determining that the level of correlation satisfies a threshold value, identifying the portion of the companion item of textual content as including body text;
identifying a second portion of the companion item of textual content that does not satisfy the threshold value with respect to any portion of the textual transcription;
determining that the second portion of the companion item of textual content that does not satisfy the threshold value is front matter based at least in part on a determination that the second portion of the companion item of textual content that does not satisfy the threshold value appears within the companion item of textual content prior to an earliest portion of the companion item of textual content for which a corresponding portion of the audio content is identified; and
generating content synchronization information that indicates (a) portions of the audio content that correspond to body text of the companion item of textual content and (b) further indicates that the matter other than body text in the textual content does not correspond to any portion of the audio content, wherein the matter other than body text includes the second portion of the companion item of textual content determined to be front matter, wherein the content synchronization information indicates that the body text included in the portion of the companion item of textual content should be presented in synchronization with a portion of the audio content that corresponds to the body text included in the portion of the textual transcription.
Dependent claims: 5, 6, 7, 8, 9, 10, 17, 20
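Claim 4 leaves the correlation measure and the threshold value open. One way to picture the two determining steps is a word-level similarity ratio compared against a cutoff, sketched below with Python's standard `difflib.SequenceMatcher`. The choice of measure, the tokenizer, the helper names, and the 0.8 threshold are all assumptions for illustration, not values taken from the patent.

```python
import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Lowercase and split into word tokens, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def correlation(portion: str, transcript_portion: str) -> float:
    """Word-level similarity in [0, 1] between book text and transcription."""
    a, b = words(portion), words(transcript_portion)
    if not a or not b:
        return 0.0
    # ratio() is 2 * (matched words) / (total words in both sequences).
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_body_text(portion: str, transcript_portion: str,
                 threshold: float = 0.8) -> bool:
    """True when the correlation satisfies the (assumed) threshold."""
    return correlation(portion, transcript_portion) >= threshold


book = "It was the best of times, it was the worst of times."
heard = "it was the best of times it was the worst of time"  # one ASR slip
body = is_body_text(book, heard)
toc = is_body_text("Table of Contents", "chapter one the period")
```

A threshold below 1.0 tolerates speech-to-text errors such as the dropped "s" in the example, while a table-of-contents heading that is never narrated shares no words with the transcription and falls well below the cutoff.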
11. A system for aligning content, the system comprising:
an electronic data store configured to store:
a transcription of an item of content comprising audio content; and
a companion item of textual content, wherein the companion item of textual content comprises:
a plurality of paragraphs of body text, and
matter other than body text; and
a physical computing device in communication with the electronic data store, the physical computing device configured to:
identify, in the transcription, a portion of the transcription that includes text also included in a portion of the companion item of textual content;
determine a level of correlation between words in the portion of the companion item of textual content and words in the portion of the transcription;
determine that the level of correlation satisfies a threshold value;
in response to determining that the level of correlation satisfies a threshold value, identify the portion of the companion item of content as body text;
identify a second portion of the companion item of textual content that does not satisfy the threshold value with respect to any portion of the transcription;
determine that the second portion of the companion item of textual content that does not satisfy the threshold value is back matter based at least in part on a determination that the second portion of the companion item of textual content that does not satisfy the threshold value appears within the companion item of textual content after a last portion of the companion item of textual content for which a corresponding portion of the transcription is identified; and
generate content synchronization information that identifies (a) portions of the audio content that correspond to body text of the companion item of textual content and (b) further identifies the matter other than body text in the companion item, wherein the content synchronization information indicates that the matter other than body text in the companion item does not correspond to the audio content, wherein the matter other than body text includes the second portion of the companion item of textual content determined to be back matter, wherein the content synchronization information indicates that the portion of the companion item of textual content, excluding the matter other than body text, should be presented in synchronization with a portion of the audio content that corresponds to the portion of the transcription.
Dependent claims: 12, 13, 14, 15, 16, 18, 19, 21
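The final step of claim 11, generating content synchronization information, might be pictured as emitting one record per portion of the text: body text carries an audio range, and everything else is marked as having no corresponding audio. The sketch below assumes the transcription carries per-word timestamps and that an earlier step has already found, for each body portion, the inclusive range of transcription words it matched. `TimedWord`, `SyncRecord`, `build_sync_info`, and the `spans` mapping are hypothetical names, not structures described in the claim.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimedWord:
    text: str
    start: float  # seconds into the audio content
    end: float


@dataclass
class SyncRecord:
    portion: int                         # index of the portion in the text
    kind: str                            # e.g. "body" or "back matter"
    audio_start: Optional[float] = None  # None: no corresponding audio
    audio_end: Optional[float] = None


def build_sync_info(kinds: list[str],
                    spans: dict[int, tuple[int, int]],
                    transcript: list[TimedWord]) -> list[SyncRecord]:
    """Create one record per portion, in reading order.

    `spans[i]` holds the inclusive (first, last) transcript word indices
    matched by body portion i (assumed to come from an earlier step).
    """
    records = []
    for i, kind in enumerate(kinds):
        if kind == "body" and i in spans:
            first, last = spans[i]
            records.append(SyncRecord(i, kind,
                                      transcript[first].start,
                                      transcript[last].end))
        else:
            records.append(SyncRecord(i, kind))
    return records


transcript = [TimedWord("call", 0.0, 0.4),
              TimedWord("me", 0.4, 0.6),
              TimedWord("ishmael", 0.6, 1.3)]
info = build_sync_info(["body", "back matter"], {0: (0, 2)}, transcript)
```

Keeping explicit records for non-body portions, rather than omitting them, mirrors the claim language that the synchronization information "further identifies the matter other than body text".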
Specification