Web-based audio transcription tool
Abstract
A computer-implemented technique for transcribing audio data includes generating, along a vertical axis on a display of a client device, an image representing audio content. The technique further includes receiving, from a user of the client device, a selection of a portion of the image; and generating, via an audio module of the client device, an audio output corresponding to the selected portion of the image. The technique further includes receiving, from the user, a selection indicating a position along the vertical axis on the display to enter a text portion representing the audio output, wherein the position is aligned to the selected portion of the image. The technique further includes receiving, from the user, the text portion representing the audio output; and displaying, on the display, the text portion at the position, wherein the text portion extends along a horizontal axis on the display.
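The alignment between a selected portion of the image and the audio output can be illustrated with a minimal sketch. It assumes a linear mapping between pixel offsets along the vertical axis and audio time, which the abstract does not specify; the class name WaveformImage, its fields, and the helper selection_to_clip are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WaveformImage:
    """Hypothetical description of an audio image laid out along a vertical axis."""
    height_px: float  # image height in pixels (assumed unit)
    start_s: float    # audio time at the top edge of the image
    end_s: float      # audio time at the bottom edge of the image

    def y_to_time(self, y_px: float) -> float:
        """Map a vertical pixel offset to an audio timestamp (linear mapping assumed)."""
        if not 0 <= y_px <= self.height_px:
            raise ValueError("position lies outside the image")
        fraction = y_px / self.height_px
        return self.start_s + fraction * (self.end_s - self.start_s)

    def selection_to_clip(self, y_a: float, y_b: float) -> Tuple[float, float]:
        """Return the (start, end) audio times covered by a selected portion."""
        top, bottom = sorted((y_a, y_b))
        return self.y_to_time(top), self.y_to_time(bottom)


# A 1000-pixel-tall image covering the first 60 seconds of audio.
image = WaveformImage(height_px=1000, start_s=0.0, end_s=60.0)
clip = image.selection_to_clip(250, 500)  # (15.0, 30.0)
```

Under this assumption, a text portion entered at a given vertical position can be stored against the timestamp returned by y_to_time, which keeps the text aligned with the selected portion of the image.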
18 Claims
1. A computer-implemented method, comprising:
generating, at a server having one or more processors, an image representing audio content;
providing, from the server, the image and the audio content to a plurality of client devices, the image being provided for display along a vertical axis on a display of each of the client devices;
receiving, at the server, a first post from a first client device of the plurality of client devices, the first post including a first identifier indicating (i) a first position along the vertical axis of the image, and (ii) a first text portion representative of at least a portion of the audio content at the first position, the first text portion being entered by a first user of the first client device;
receiving, at the server, a second post from a second client device of the plurality of client devices, the second post including a second identifier indicating (i) a second position along the vertical axis of the image, and (ii) a second text portion representative of at least a portion of the audio content at the second position, the second text portion being entered by a second user of the second client device;
synchronizing, at the server, the first and second posts based on the first and second identifiers;
correlating, at the server, the first and second posts to provide a single transcription of the audio content,
receiving, at the server, a command to zoom in on a portion of the image from the first client device;
generating, at the server, a second image in response to receiving the command, the second image representing an enlargement of the portion of the image;
providing, from the server, the second image to the first client device for display along the vertical axis on the display of the first client device;
receiving, at the server, a third post from the first client device, the third post including a third identifier indicating (i) a third position along the vertical axis of the second image, and (ii) a third text portion representative of at least a portion of the audio content at the third position, the third text portion being entered by the first user of the first client device; and
synchronizing, at the server, the first, second and third posts based on the first, second and third identifiers, wherein correlating the first and second posts to provide the single transcription of the audio content includes correlating the first, second and third posts to provide the single transcription.
Dependent claims: 2, 3, 4, 5, 6
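The synchronizing and correlating steps of claim 1 can be pictured with a short sketch. The claim does not say how either step is carried out, so this is only one possible reading: posts are ordered by the position carried in their identifiers and their text portions are joined in that order. The Post class, its field names, and the use of seconds as the position unit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Post:
    """Hypothetical post: an identifier's position plus the entered text portion."""
    client_id: str   # which client device sent the post
    position: float  # position along the vertical axis (seconds assumed here)
    text: str        # text portion entered by the user


def synchronize(posts: Iterable[Post]) -> List[Post]:
    """Order posts by position; ties fall back to client id for determinism."""
    return sorted(posts, key=lambda p: (p.position, p.client_id))


def correlate(posts: Iterable[Post]) -> str:
    """Join the ordered, non-empty text portions into a single transcription."""
    ordered = synchronize(posts)
    return " ".join(p.text.strip() for p in ordered if p.text.strip())


# Three posts from two client devices, received out of order.
posts = [
    Post("client-2", 12.5, "jumps over"),
    Post("client-1", 3.0, "the quick brown fox"),
    Post("client-1", 20.0, "the lazy dog"),
]
transcript = correlate(posts)  # "the quick brown fox jumps over the lazy dog"
```

A real system would also need a policy for overlapping or conflicting posts at the same position; the sketch simply keeps both in a fixed order.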
7. A computer-implemented method, comprising:
generating, at a server having one or more processors, a first image representing audio content;
providing, from the server, the first image and the audio content to a plurality of client devices, the first image being provided for display along a vertical axis on a display of each of the client devices;
receiving, at the server, a first post from a first client device of the plurality of client devices, the first post including a first identifier indicating (i) a first position along the vertical axis of the first image, and (ii) a first text portion representative of at least a portion of the audio content at the first position, the first text portion being entered by a first user of the first client device;
receiving, at the server, a command to zoom in on a portion of the first image from a second client device of the plurality of client devices;
generating, at the server, a second image in response to receiving the command, the second image representing an enlargement of the portion of the first image;
providing, from the server, the second image to a second client device for display along the vertical axis on the display of the second client device;
receiving, at the server, a second post from the second client device, the second post including a second identifier indicating (i) a second position along the vertical axis of the second image, and (ii) a second text portion representative of at least a portion of the audio content at the second position, the second text portion being entered by a second user of the second client device;
synchronizing, at the server, the first and second posts based on the first and second identifiers; and
correlating, at the server, the first and second posts to provide a single transcription of the audio content.
Dependent claims: 8, 9, 10, 11, 12
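In claim 7 the second post carries a position along the second (zoomed) image, while the first post carries a position along the first image. One way to synchronize the two, sketched here under assumptions the claim does not state, is to convert zoomed positions back into first-image coordinates before comparing them. The ZoomView class, its fields, and the pixel units are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoomView:
    """Hypothetical record of which portion of the first image was enlarged."""
    portion_top: float     # top of the enlarged portion, in first-image pixels
    portion_bottom: float  # bottom of the enlarged portion, in first-image pixels
    view_height: float     # height of the second (zoomed) image, in pixels

    def to_first_image(self, y_zoomed: float) -> float:
        """Convert a position on the second image to first-image coordinates."""
        if not 0 <= y_zoomed <= self.view_height:
            raise ValueError("position lies outside the zoomed image")
        # First-image pixels represented by one pixel of the zoomed image.
        scale = (self.portion_bottom - self.portion_top) / self.view_height
        return self.portion_top + y_zoomed * scale


# The portion from pixel 250 to pixel 500 is enlarged to a 1000-pixel image.
view = ZoomView(portion_top=250.0, portion_bottom=500.0, view_height=1000.0)
y_first = view.to_first_image(500.0)  # 375.0, the midpoint of the portion
```

Once every position is expressed in the same coordinate system, posts made against either image can be ordered together.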
13. A non-transitory computer-readable storage medium storing computer executable code that, when executed by a computing device having one or more processors, cause the computing device to perform operations comprising:
generating an image representing audio content;
providing the image and the audio content to a plurality of client devices, the image being provided for display along a vertical axis on a display of each of the client devices;
receiving a first post from a first client device of the plurality of client devices, the first post including a first identifier indicating (i) a first position along the vertical axis of the image, and (ii) a first text portion representative of at least a portion of the audio content at the first position, the first text portion being entered by a first user of the first client device;
receiving a second post from a second client device of the plurality of client devices, the second post including a second identifier indicating (i) a second position along the vertical axis of the image, and (ii) a second text portion representative of at least a portion of the audio content at the second position, the second text portion being entered by a second user of the second client device;
synchronizing the first and second posts based on the first and second identifiers;
correlating the first and second posts to provide a single transcription of the audio content;
receiving a command to zoom in on a portion of the image from the first client device;
generating a second image in response to receiving the command, the second image representing an enlargement of the portion of the image;
providing the second image to the first client device for display along the vertical axis on the display of the first client device;
receiving a third post from the first client device, the third post including a third identifier indicating (i) a third position along the vertical axis of the second image, and (ii) a third text portion representative of at least a portion of the audio content at the third position, the third text portion being entered by the first user of the first client device; and
synchronizing the first, second and third posts based on the first, second and third identifiers, wherein correlating the first and second posts to provide the single transcription of the audio content includes correlating the first, second and third posts to provide the single transcription.
Dependent claims: 14, 15, 16, 17, 18
Specification