System and method for secure real-time high accuracy speech to text conversion of general quality speech
Abstract
A method comprising the steps of: receiving an audio stream; filtering the audio stream to separate identifiable words in the audio stream from unidentifiable words; creating a word text file for the identifiable words and storing the word text file in a database, the word text file including word indexing information; creating audio segments from the audio stream, the audio segments including portions of the audio stream having unidentifiable words; creating audio shreds from the audio segments, the audio shreds including audio shred indexing information to identify each of the audio shreds; storing the audio shred indexing information in the database; mixing the audio shreds with other audio shreds from other audio streams; delivering the audio shreds to a plurality of transcribers; transcribing each of the audio shreds into a corresponding audio shred text file, the audio shred text file including the audio shred indexing information corresponding to the audio shred from which the audio shred text file was created; and reassembling the audio shred text files and the word text files into a conversation text file corresponding to the audio stream.
21 Claims
1. A system, comprising:
an audio shredder receiving an audio segment, the audio segment being a portion of an audio stream, the audio shredder creating an audio shred from the audio segment;
an audio mixer receiving the audio shred and randomizing the audio shred with other audio shreds from other audio streams; and
a plurality of transcribers, wherein one of the transcribers receives the audio shred and transcribes the audio shred into text.
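The three elements of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the fixed shred length, the round-robin hand-off to transcribers, and all names (`AudioShred`, `shred`, `mix`, `assign`) are assumptions made for illustration, since the claim does not say how shreds are cut or distributed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioShred:
    stream_id: str   # hypothetical identifier of the originating audio stream
    index: int       # position of the shred within its audio segment
    samples: tuple   # raw audio samples carried by this shred


def shred(stream_id, samples, shred_len):
    """Cut one audio segment into consecutive shreds (fixed length is an assumption)."""
    return [
        AudioShred(stream_id, i, tuple(samples[start:start + shred_len]))
        for i, start in enumerate(range(0, len(samples), shred_len))
    ]


def mix(shred_lists, rng):
    """Pool shreds from several audio streams and randomize their order."""
    pool = [s for shreds in shred_lists for s in shreds]
    rng.shuffle(pool)
    return pool


def assign(pool, transcriber_ids):
    """Hand the mixed shreds to transcribers round-robin (an assumed policy)."""
    queues = {t: [] for t in transcriber_ids}
    for i, s in enumerate(pool):
        queues[transcriber_ids[i % len(transcriber_ids)]].append(s)
    return queues
```

Because the pool is shuffled across streams before assignment, a single transcriber in this sketch receives fragments of several conversations rather than one contiguous conversation.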
12. A method, comprising the steps of:
receiving an audio stream;
filtering the audio stream to separate identifiable words in the audio stream from unidentifiable words;
creating a word text file for the identifiable words;
storing the word text file in a database, the word text file including word indexing information;
creating audio segments from the audio stream, the audio segments including portions of the audio stream having unidentifiable words;
creating audio shreds from the audio segments, the audio shreds including audio shred indexing information to identify each of the audio shreds;
storing the audio shred indexing information in the database;
mixing the audio shreds with other audio shreds from other audio streams;
delivering the audio shreds to a plurality of transcribers;
transcribing each of the audio shreds into a corresponding audio shred text file, the audio shred text file including the audio shred indexing information corresponding to the audio shred from which the audio shred text file was created; and
reassembling the audio shred text files and the word text files into a conversation text file corresponding to the audio stream.
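The final reassembling step of claim 12 depends on the indexing information carried by both kinds of text file. The sketch below assumes, for illustration only, that the indexing information amounts to a stream identifier plus a begin time; the claim does not specify its contents, and `TextPiece` and `reassemble` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPiece:
    stream_id: str  # assumed index field: the audio stream the text belongs to
    begin: float    # assumed index field: begin time in seconds within the stream
    text: str       # recognized word, or a transcriber's text for one shred


def reassemble(stream_id, word_texts, shred_texts):
    """Merge word text and shred text for one stream into conversation text."""
    pieces = [p for p in (*word_texts, *shred_texts) if p.stream_id == stream_id]
    pieces.sort(key=lambda p: p.begin)  # restore the original speaking order
    return " ".join(p.text for p in pieces)
```

Filtering on the stream identifier undoes the mixing step: shred text returned from other streams is ignored when a given conversation is rebuilt.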
18. A system, comprising:
a service platform for receiving, processing and directing streaming audio; and
a user device connected to the service platform and configured to receive streaming audio from the service platform and transmit streaming audio to the service platform, the user device further configured to signal the service platform to begin a transcription of the streaming audio transmitted and received by the user device,
wherein the service platform includes a filter receiving the streaming audio, identifying words within the streaming audio and creating a word text file corresponding to each of the identified words, the filter further creating audio segments from the streaming audio, the audio segments including portions of the audio stream having unidentifiable words,
an audio shredder creating a plurality of audio shreds from each of the audio segments,
an audio mixer randomizing the audio shreds with other audio shreds from other streaming audio, wherein the service platform delivers the randomized audio shreds to a plurality of transcribers which transcribe the audio shreds into audio shred text files corresponding to the audio shreds,
a reassembler creating a conversation text file corresponding to the streaming audio from the audio shred text files and the word text files.
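The filter recited in claim 18 separates identified words from portions of the audio that contain unidentifiable words. One way this might look is sketched below, assuming a speech recognizer that emits timed hypotheses with a confidence score. The confidence threshold, the `Hypothesis` type and `filter_stream` are illustrative assumptions; the claim does not state how a word is judged identifiable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    begin: float       # begin time of the recognized span, in seconds
    end: float         # end time of the recognized span, in seconds
    text: str          # recognizer's best guess for the span
    confidence: float  # assumed recognizer score between 0 and 1


def filter_stream(hyps, threshold=0.8):
    """Split hypotheses into identified words and (begin, end) audio segments."""
    words, segments = [], []
    for h in sorted(hyps, key=lambda h: h.begin):
        if h.confidence >= threshold:
            words.append(h)                           # goes to a word text file
        elif segments and segments[-1][1] == h.begin:
            segments[-1] = (segments[-1][0], h.end)   # extend the adjacent segment
        else:
            segments.append((h.begin, h.end))         # start a new audio segment
    return words, segments
```

Adjacent low-confidence spans are merged here so that each audio segment covers one contiguous unidentified stretch, which is then available for shredding.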
21. A system, comprising:
an audio stream element including information corresponding to an audio stream, the information including a begin time of the audio stream, an end time of the audio stream, a conversation identification of the audio stream and the audio stream file;
a word element including information corresponding to a word identified in the audio stream by a speech recognition filter, the information including an identification of the audio stream from which the word was identified, a begin time of the word, an end time of the word, an audio file of the word and text corresponding to the word;
an audio segment element including information corresponding to an audio segment of the audio stream, the audio segment being a portion of the audio stream without identifiable words, the information including the identification of the audio stream from which the audio segment originates, the begin time of the audio segment, the end time of the audio segment and the audio file of the audio segment;
an audio shred element including information corresponding to an audio shred of the audio segment, the information including an identification of the audio segment from which the audio shred originates, the begin time of the audio shred, the end time of the audio shred and the audio file of the audio shred; and
a text token element including information corresponding to a textual representation of the audio shred, the information including an identification of the audio shred from which the textual representation originates and the textual representation, wherein the information included in each of the audio stream element, the word element, the audio segment element, the audio shred element and the text token element is processed to generate a text transcription of the audio stream.
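The five elements of claim 21 form a chain of references from a text token back to its audio stream. A minimal sketch of that data model follows. The field names, the explicit `segment_id` and `shred_id` keys, and the `stream_of` helper are assumptions for illustration; the claim lists the information each element carries but not how it is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioStream:
    conversation_id: str  # conversation identification of the audio stream
    begin: float
    end: float
    audio_file: str


@dataclass(frozen=True)
class Word:
    stream_id: str        # audio stream from which the word was identified
    begin: float
    end: float
    audio_file: str
    text: str


@dataclass(frozen=True)
class AudioSegment:
    segment_id: str       # assumed key so that shreds can refer to the segment
    stream_id: str
    begin: float
    end: float
    audio_file: str


@dataclass(frozen=True)
class AudioShred:
    shred_id: str         # assumed key so that text tokens can refer to the shred
    segment_id: str
    begin: float
    end: float
    audio_file: str


@dataclass(frozen=True)
class TextToken:
    shred_id: str         # audio shred from which the text originates
    text: str


def stream_of(token, shreds, segments):
    """Follow token -> shred -> segment to find the originating stream."""
    shred = shreds[token.shred_id]
    return segments[shred.segment_id].stream_id
```

Here `shreds` and `segments` are plain dictionaries keyed by identifier, standing in for whatever database the elements are stored in.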
Specification