Content-Based Audio Playback Emphasis
Abstract
Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
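The combination described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names `emphasis_factor` and `playback_rate`, the "either condition raises emphasis" formula, and the 0.5x to 1.5x rate range are all assumptions chosen for illustration. The sketch only shows one plausible way a correctness likelihood and a relevance measure could be turned into a slower playback rate for regions that deserve a proofreader's attention.

```python
def emphasis_factor(correctness_likelihood: float, relevance: float) -> float:
    """Combine likelihood and relevance into an emphasis factor in [0, 1].

    Hypothetical formula (an assumption, not taken from the patent):
    emphasis is high when the region is highly relevant OR likely to
    have been transcribed incorrectly.
    """
    for name, value in (("correctness_likelihood", correctness_likelihood),
                        ("relevance", relevance)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    error_likelihood = 1.0 - correctness_likelihood
    # Probabilistic OR of "likely wrong" and "highly relevant".
    return 1.0 - (1.0 - error_likelihood) * (1.0 - relevance)


def playback_rate(emphasis: float,
                  min_rate: float = 0.5,
                  max_rate: float = 1.5) -> float:
    """Map an emphasis factor to a playback rate (assumed linear mapping).

    Emphasis 0 plays at max_rate (fast); emphasis 1 plays at min_rate (slow).
    """
    return max_rate - emphasis * (max_rate - min_rate)


if __name__ == "__main__":
    # Illustrative regions: (text, correctness likelihood, relevance).
    regions = [
        ("the patient was seen today", 0.95, 0.10),
        ("allergic to penicillin", 0.90, 0.95),
        ("five milligrams daily", 0.40, 0.80),
    ]
    for text, likelihood, relevance in regions:
        e = emphasis_factor(likelihood, relevance)
        print(f"{text!r}: emphasis={e:.2f}, rate={playback_rate(e):.2f}x")
```

Under these assumptions, a region that is both unimportant and confidently transcribed plays back quickly, while a region that is important or doubtful is slowed down. Any monotone combination with the same property would serve the same illustrative purpose.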
42 Claims
1. A method performed by a computer processor executing computer program instructions tangibly stored on a first computer-readable medium to perform a method comprising:
(A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream, and tangibly storing a representation of the likelihood in a second computer-readable medium;
(B) selecting a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader, and tangibly storing a representation of the measure of relevance in a third computer-readable medium; and
(C) deriving, from the stored representation of the likelihood and the stored representation of the measure of relevance, an emphasis factor for modifying emphasis placed on the region of the spoken audio stream when played back, and storing a representation of the emphasis factor in a fourth computer-readable medium.
Dependent claims: 2-26
27. An apparatus comprising a computer-readable medium tangibly storing instructions executable by a computer processor to perform a method comprising:
(A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream;
(B) selecting a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader; and
(C) deriving, from the likelihood and the measure of relevance, an emphasis factor for modifying emphasis placed on the region of the spoken audio stream when played back.
Dependent claims: 28-42
Specification