System and method for the secure, real-time, high accuracy conversion of general-quality speech into text
Abstract
Described is a speech-to-text conversion system and method that provides secure, real-time and high-accuracy conversion of general-quality speech into text. The system is designed to interface with external devices and services, providing a simple and convenient manner to transcribe audio that may be stored elsewhere such as a wireless phone's voice mail, or occurring between two or more parties such as a conference call. The first step in the system's process ensures secure and private transcription by separating an audio stream into many audio shreds, each of which has duration of only a few seconds and cannot reveal the context of the conversation. A workforce of geographically distributed transcription agents who transcribe the audio shreds is able to generate transcription in real time, with many agents working in parallel on a single conversation. No one agent (or group of agents) receives a sufficient number of audio shreds to reconstruct the context of any conversation. The use of human transcribers allows the system to overcome limitations typical of computer-based speech recognition and permits accurate transcription of general-quality speech even in acoustically hostile environments.
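The shredding and distribution step described in the abstract can be illustrated with a minimal Python sketch. The function names, the 3-second shred length, and the per-agent cap `max_per_agent` are assumptions made for illustration only. The abstract states just that shreds last a few seconds and that no agent receives enough of them to reconstruct a conversation; it does not say how that limit is enforced.

```python
import random


def shred_stream(duration_ms, shred_ms=3000):
    """Split a stream of duration_ms into (start, end) windows of at most shred_ms.

    Times are integer milliseconds. The 3000 ms default is an assumed value.
    """
    shreds = []
    start = 0
    while start < duration_ms:
        end = min(start + shred_ms, duration_ms)
        shreds.append((start, end))
        start = end
    return shreds


def assign_shreds(shreds, agents, max_per_agent, rng):
    """Assign each shred to a random agent that is still under the cap.

    max_per_agent is a hypothetical policy knob: it limits how many shreds
    of this one stream any single agent may receive.
    """
    load = {agent: 0 for agent in agents}
    assignment = {}
    for index in range(len(shreds)):
        eligible = [a for a in agents if load[a] < max_per_agent]
        if not eligible:
            raise ValueError("not enough agents for the requested cap")
        agent = rng.choice(eligible)
        load[agent] += 1
        assignment[index] = agent
    return assignment


if __name__ == "__main__":
    shreds = shred_stream(10_000)  # a 10-second stream
    plan = assign_shreds(shreds, ["agent-1", "agent-2", "agent-3"], 2, random.Random())
    for index, agent in sorted(plan.items()):
        print(shreds[index], "->", agent)
```

A cap on shreds per agent is only one possible way to keep context from any single transcriber; the sketch does not address colluding groups of agents, which the abstract also mentions.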
29 Claims
1. A system, comprising:
a receiving element to receive audio segments which are portions of audio streams, the receiving element creating sub-segments from the audio segments;
a mixing element receiving the sub-segments from the audio streams and randomizing the sub-segments;
a transmitting element sending the randomized sub-segments to a plurality of transcribers, each of the randomized sub-segments being transcribed into text by the transcriber which received the randomized sub-segment; and
a text receiving element receiving the transcribed text from each of the transcribers.
Dependent claims: 2-22
23. A method, comprising the steps of:
receiving audio segments which are portions of audio streams;
creating sub-segments from the audio segments;
randomizing the sub-segments;
sending the randomized sub-segments to a plurality of transcribers, each of the randomized sub-segments being transcribed into text by the transcriber which received the randomized sub-segment; and
receiving the transcribed text from each of the transcribers.
Dependent claims: 24-27
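The steps of claim 23 can be sketched in Python to show the bookkeeping involved when sub-segments from several streams are mixed. This is an illustration under stated assumptions, not the patented implementation: word lists stand in for audio, each transcriber is a plain callable, the `(stream_id, index)` tag is a hypothetical device for routing text back, and the final `reassemble` step goes beyond what the claim recites.

```python
import random


def make_sub_segments(segments, size):
    """Cut each stream's segment into tagged sub-segments.

    segments maps a stream id to a list of items (words stand in for audio).
    Returns a list of ((stream_id, index), chunk) pairs.
    """
    subs = []
    for stream_id, items in segments.items():
        for index, start in enumerate(range(0, len(items), size)):
            subs.append(((stream_id, index), items[start:start + size]))
    return subs


def dispatch(subs, transcribers, rng):
    """Randomize the sub-segments, send them round-robin, collect the text."""
    mixed = list(subs)
    rng.shuffle(mixed)  # the randomizing step
    results = {}
    for position, (tag, chunk) in enumerate(mixed):
        transcribe = transcribers[position % len(transcribers)]
        results[tag] = transcribe(chunk)  # text received back, keyed by tag
    return results


def reassemble(results, stream_id):
    """Assumed extra step: restore one stream's text in its original order."""
    parts = sorted(
        (index, text) for (sid, index), text in results.items() if sid == stream_id
    )
    return " ".join(text for _, text in parts)


if __name__ == "__main__":
    segments = {
        "call-A": "please call me back tomorrow".split(),
        "call-B": "the meeting moved to noon".split(),
    }

    def echo(chunk):
        # Stand-in for a human transcriber: returns the words as text.
        return " ".join(chunk)

    results = dispatch(make_sub_segments(segments, 2), [echo, echo, echo], random.Random())
    print(reassemble(results, "call-A"))
    print(reassemble(results, "call-B"))
```

Because each sub-segment carries its tag, the shuffle order does not affect the reassembled text; only the party holding the tags can put a stream back together.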
28. A method, comprising the steps of:
receiving an original audio file;
receiving a computer-generated speech-to-text file corresponding to the original audio file; and
comparing the computer-generated speech-to-text file with the original audio file.
Dependent claim: 29
Specification