Content-based audio playback emphasis
Abstract
Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
44 Claims
1. A method performed by a computer processor executing computer program instructions tangibly stored on a first computer-readable medium to perform a method comprising:
(A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood score representing a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream, and tangibly storing a representation of the likelihood score in a second computer-readable medium;
(B) selecting a relevance score representing a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader, and tangibly storing a representation of the relevance score in a third computer-readable medium; and
(C) deriving, by dividing the relevance score by the likelihood score, an emphasis factor for modifying emphasis placed on the region of the spoken audio stream when played back, and storing a representation of the emphasis factor in a fourth computer-readable medium.
Dependent claims: 2-25
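As an illustration of step (C) only, the division of the relevance score by the likelihood score might be sketched as follows. The function name, the region labels, the score values, and the rejection of non-positive likelihood scores are all assumptions made for this example; the claim does not specify score ranges or how a zero likelihood is handled.

```python
def emphasis_factor(relevance: float, likelihood: float) -> float:
    """Step (C): divide the relevance score by the likelihood score."""
    if likelihood <= 0:
        # Assumption: the claim does not say how a zero score is handled,
        # so this sketch simply rejects it.
        raise ValueError("likelihood must be positive")
    return relevance / likelihood


# Hypothetical regions: (label, relevance score, likelihood score).
regions = [
    ("greeting", 0.5, 1.0),  # low relevance, probably transcribed correctly
    ("dosage", 2.0, 0.5),    # high relevance, probably transcribed incorrectly
]

# Map each region label to its emphasis factor.
factors = {label: emphasis_factor(rel, lik) for label, rel, lik in regions}
```

Under these assumed values, a region that is both highly relevant and unlikely to be correct receives the largest emphasis factor, which matches the intent described in the abstract.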
26. The method of claim 1, further comprising:
(D) before (C), identifying a default playback rate of the spoken audio stream; and
(E) identifying an emphasized playback rate of the spoken audio stream as a quotient of the default playback rate and the emphasis factor.
Dependent claims: 27
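Step (E) of claim 26 can be read as a single division. The sketch below assumes the quotient is the default playback rate divided by the emphasis factor, and that rates are expressed as multiples of normal speed; the function name and the sample values are hypothetical.

```python
def emphasized_playback_rate(default_rate: float, emphasis: float) -> float:
    """Step (E): quotient of the default playback rate and the emphasis factor."""
    if emphasis <= 0:
        # Assumption: a non-positive emphasis factor is treated as invalid.
        raise ValueError("emphasis factor must be positive")
    return default_rate / emphasis


# Step (D): a hypothetical default rate of 1.0 (normal speed).
default_rate = 1.0

# An emphasis factor above 1 slows playback; one below 1 speeds it up.
slow_rate = emphasized_playback_rate(default_rate, 4.0)
fast_rate = emphasized_playback_rate(default_rate, 0.5)
```

With these values, a strongly emphasized region plays at a quarter of normal speed, while a de-emphasized region plays at twice normal speed, consistent with the abstract's description of playing emphasized regions more slowly.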
28. An apparatus comprising a computer-readable medium tangibly storing instructions executable by a computer processor to perform a method comprising:
(A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood score representing a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream;
(B) selecting a relevance score representing a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader; and
(C) deriving, by dividing the relevance score by the likelihood score, an emphasis factor for modifying emphasis placed on the region of the spoken audio stream when played back.
Dependent claims: 29-42
43. The apparatus of claim 28, wherein the method further comprises:
(D) before (C), identifying a default playback rate of the spoken audio stream; and
(E) identifying an emphasized playback rate of the spoken audio stream as a quotient of the default playback rate and the emphasis factor.
Dependent claims: 44
Specification