System and method for secure real-time high accuracy speech to text conversion of general quality speech
Abstract
A method comprising the steps of: receiving an audio stream; filtering the audio stream to separate identifiable words in the audio stream from unidentifiable words; creating a word text file for the identifiable words and storing the word text file in a database, the word text file including word indexing information; creating audio segments from the audio stream, the audio segments including portions of the audio stream having unidentifiable words; creating audio shreds from the audio segments, the audio shreds including audio shred indexing information to identify each of the audio shreds; storing the audio shred indexing information in the database; mixing the audio shreds with other audio shreds from other audio streams; delivering the audio shreds to a plurality of transcribers; transcribing each of the audio shreds into a corresponding audio shred text file, the audio shred text file including the audio shred indexing information corresponding to the audio shred from which the audio shred text file was created; and reassembling the audio shred text files and the word text files into a conversation text file corresponding to the audio stream.
21 Claims
-
1. A system, comprising:
-
an audio shredder receiving an audio segment, the audio segment being a portion of an audio stream, the audio shredder creating an audio shred from the audio segment;
an audio mixer receiving the audio shred and randomizing the audio shred with other audio shreds from other audio streams; and
a plurality of transcribers, wherein one of the transcribers receives the audio shred and transcribes the audio shred into text.
-
2. The system of claim 1, further comprising:
a reassembler receiving the text corresponding to the audio shred and combining the text with other text corresponding to the audio stream to create a text file corresponding to the audio stream.
-
-
3. The system of claim 2, wherein the text and the other text includes indexing information, the reassembler using the indexing information to create the text file.
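
The interaction of the shredder, mixer and reassembler in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names Shred, shred, mix and reassemble, the fixed shred length, and the use of a plain tuple as a stand-in for audio data are all assumptions made for illustration. The only idea taken from the claims is that each shred carries indexing information, so that text from randomized shreds can be put back in order.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Shred:
    stream_id: str   # hypothetical: identifies the originating audio stream
    index: int       # hypothetical: position of the shred within that stream
    samples: tuple   # stand-in for raw audio data


def shred(stream_id, samples, shred_len):
    """Cut one audio segment into fixed-length shreds tagged with indexing information."""
    return [
        Shred(stream_id, i, tuple(samples[start:start + shred_len]))
        for i, start in enumerate(range(0, len(samples), shred_len))
    ]


def mix(shred_lists, seed=None):
    """Pool the shreds of several streams and shuffle them into random order."""
    pool = [s for shreds in shred_lists for s in shreds]
    random.Random(seed).shuffle(pool)
    return pool


def reassemble(transcribed, stream_id):
    """Order the (Shred, text) pairs of one stream by index and join their text."""
    own = sorted((s.index, text) for s, text in transcribed if s.stream_id == stream_id)
    return " ".join(text for _, text in own)
```

Because every transcriber sees only isolated shreds from unrelated streams, no single transcriber can reconstruct a conversation, while the stored index lets the reassembler restore the original order.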
-
4. The system of claim 1, further comprising:
a delivery module to deliver the text file corresponding to the audio stream.
-
5. The system of claim 4, wherein the delivery module is one of a display screen and a storage medium.
-
6. The system of claim 1, further comprising:
a filter receiving the audio stream, identifying words within the audio stream and creating a word text file corresponding to each of the identified words, the filter creating the audio segment from a portion of the audio stream having words which are unidentifiable by the filter.
-
7. The system of claim 6, further comprising:
a database element which stores the word text file corresponding to each of the identified words, the database element further storing indexing information corresponding to the audio shred.
-
8. The system of claim 1, wherein the audio stream is one of a voice recording and a real-time conversation.
-
9. The system of claim 1, wherein the audio shred is a plurality of audio shreds and wherein a portion of a first audio shred overlaps a portion of a second audio shred.
-
10. The system of claim 9, wherein the first audio shred is transcribed by a first transcriber and the second audio shred is transcribed by a second transcriber and the overlapping portions of the first and second audio shreds are compared for accuracy.
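
The overlap described in claims 9 and 10 can be sketched as follows. This is an illustrative assumption rather than the claimed mechanism: the sketch works on a list of words instead of audio samples, and the names overlapping_shreds and overlap_agrees, as well as the word-level comparison, are hypothetical.

```python
def overlapping_shreds(tokens, size, overlap):
    """Split a sequence into windows of `size` items, each sharing `overlap` items with the next."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    shreds = []
    for start in range(0, len(tokens), step):
        shreds.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the sequence
    return shreds


def overlap_agrees(first_text, second_text, overlap):
    """Check that the last `overlap` words of one transcript match the first `overlap` words of the next."""
    if overlap == 0:
        return True
    return first_text.split()[-overlap:] == second_text.split()[:overlap]
```

A disagreement in the shared region signals that at least one of the two transcribers made an error, which is one way the overlapping portions could be compared for accuracy.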
-
11. The system of claim 1, further comprising:
a transcriber control element to monitor the availability of each of the transcribers and direct the audio shred to an available transcriber.
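
One possible shape for such a transcriber control element is sketched below. The class name TranscriberControl, its methods, and the first-in, first-out queue for shreds that arrive while every transcriber is busy are assumptions; the claim itself only requires monitoring availability and directing a shred to an available transcriber.

```python
from collections import deque


class TranscriberControl:
    """Tracks which transcribers are free and routes shreds to them (hypothetical design)."""

    def __init__(self, transcriber_ids):
        self._available = deque(transcriber_ids)
        self._pending = deque()   # shreds waiting for a free transcriber (assumption)
        self.assignments = {}     # shred_id -> transcriber_id

    def submit(self, shred_id):
        """Assign the shred to a free transcriber, or hold it until one frees up."""
        if self._available:
            transcriber = self._available.popleft()
            self.assignments[shred_id] = transcriber
            return transcriber
        self._pending.append(shred_id)
        return None

    def release(self, transcriber_id):
        """Mark a transcriber as free and hand it a waiting shred if there is one."""
        if self._pending:
            shred_id = self._pending.popleft()
            self.assignments[shred_id] = transcriber_id
            return shred_id
        self._available.append(transcriber_id)
        return None
```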
-
12. A method, comprising the steps of:
-
receiving an audio stream;
filtering the audio stream to separate identifiable words in the audio stream from unidentifiable words;
creating a word text file for the identifiable words;
storing the word text file in a database, the word text file including word indexing information;
creating audio segments from the audio stream, the audio segments including portions of the audio stream having unidentifiable words;
creating audio shreds from the audio segments, the audio shreds including audio shred indexing information to identify each of the audio shreds;
storing the audio shred indexing information in the database;
mixing the audio shreds with other audio shreds from other audio streams;
delivering the audio shreds to a plurality of transcribers;
transcribing each of the audio shreds into a corresponding audio shred text file, the audio shred text file including the audio shred indexing information corresponding to the audio shred from which the audio shred text file was created; and
reassembling the audio shred text files and the word text files into a conversation text file corresponding to the audio stream.
(Dependent claims: 13, 14, 15, 16, 17)
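
The filtering and reassembling steps of the method can be sketched as below. The sketch assumes that a speech recognizer reports a confidence score per recognized span and that a fixed threshold decides what counts as identifiable; neither the score nor the threshold value comes from the claim, and the names Recognized, filter_stream and merge_texts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recognized:
    begin: float        # begin time in seconds
    end: float          # end time in seconds
    text: str           # recognizer's best guess
    confidence: float   # assumed recognizer score between 0 and 1


def filter_stream(recognized, threshold=0.8):
    """Separate confidently recognized words from spans left for manual transcription."""
    words, segments = [], []
    for item in recognized:
        if item.confidence >= threshold:
            words.append((item.begin, item.text))
        elif segments and segments[-1][1] == item.begin:
            # Extend the previous segment when the unidentifiable spans are adjacent.
            segments[-1] = (segments[-1][0], item.end)
        else:
            segments.append((item.begin, item.end))
    return words, segments


def merge_texts(words, shred_texts):
    """Interleave (begin_time, text) pairs from both sources in time order."""
    return " ".join(text for _, text in sorted(words + shred_texts))
```

Here the begin time plays the role of the indexing information: it is stored with both the word text and the shred text, so the two can be merged into one conversation text.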
-
-
18. A system, comprising:
-
a service platform for receiving, processing and directing streaming audio; and
a user device connected to the service platform and configured to receive streaming audio from the service platform and transmit streaming audio to the service platform, the user device further configured to signal the service platform to begin a transcription of the streaming audio transmitted and received by the user device,
wherein the service platform includes
a filter receiving the streaming audio, identifying words within the streaming audio and creating a word text file corresponding to each of the identified words, the filter further creating audio segments from the streaming audio, the audio segments including portions of the audio stream having unidentifiable words,
an audio shredder creating a plurality of audio shreds from each of the audio segments,
an audio mixer randomizing the audio shreds with other audio shreds from other streaming audio, wherein the service platform delivers the randomized audio shreds to a plurality of transcribers which transcribe the audio shreds into audio shred text files corresponding to the audio shreds,
a reassembler creating a conversation text file corresponding to the streaming audio from the audio shred text files and the word text files.
(Dependent claims: 19, 20)
-
-
21. A system, comprising:
-
an audio stream element including information corresponding to an audio stream, the information including a begin time of the audio stream, an end time of the audio stream, a conversation identification of the audio stream and the audio stream file;
a word element including information corresponding to a word identified in the audio stream by a speech recognition filter, the information including an identification of the audio stream from which the word was identified, a begin time of the word, an end time of the word, an audio file of the word and text corresponding to the word;
an audio segment element including information corresponding to an audio segment of the audio stream, the audio segment being a portion of the audio stream without identifiable words, the information including the identification of the audio stream from which the audio segment originates, the begin time of the audio segment, the end time of the audio segment and the audio file of the audio segment;
an audio shred element including information corresponding to an audio shred of the audio segment, the information including an identification of the audio segment from which the audio shred originates, the begin time of the audio shred, the end time of the audio shred and the audio file of the audio shred; and
a text token element including information corresponding to a textual representation of the audio shred, the information including an identification of the audio shred from which the textual representation originates and the textual representation, wherein the information included in each of the audio stream element, the word element, the audio segment element, the audio shred element and the text token element is processed to generate a text transcription of the audio stream.
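
The five elements of claim 21 map naturally onto record types. The sketch below is one hypothetical rendering in Python: the field names, the explicit segment_id and shred_id fields, and the assumption that shred begin times are measured on the same clock as word begin times are not stated in the claim. The transcription function shows one way the stored information could be processed into a text transcription.

```python
from dataclasses import dataclass


@dataclass
class AudioStream:
    conversation_id: str
    begin: float
    end: float
    audio: bytes


@dataclass
class Word:
    stream_id: str   # identification of the stream the word was found in
    begin: float
    end: float
    audio: bytes
    text: str


@dataclass
class AudioSegment:
    segment_id: str  # hypothetical key so that shreds can refer to the segment
    stream_id: str
    begin: float
    end: float
    audio: bytes


@dataclass
class AudioShred:
    shred_id: str    # hypothetical key so that text tokens can refer to the shred
    segment_id: str
    begin: float
    end: float
    audio: bytes


@dataclass
class TextToken:
    shred_id: str
    text: str


def transcription(words, shreds, tokens):
    """Join word text and shred text in begin-time order (assumes a shared clock)."""
    begin_by_shred = {s.shred_id: s.begin for s in shreds}
    timed = [(w.begin, w.text) for w in words]
    timed += [(begin_by_shred[t.shred_id], t.text) for t in tokens]
    return " ".join(text for _, text in sorted(timed))
```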
-
Specification