SYSTEM FOR EXCLUDING UNWANTED DATA FROM A VOICE RECORDING
Abstract
An apparatus and method for the preparation of a censored recording of an audio source according to a procedure whereby no tangible, durable version of the original audio data is created in the course of preparing the censored record. Further, a method is provided for identifying target speech elements in a primary speech text by iteratively using portions of already identified target elements to locate further target elements that contain identical portions. The target speech elements, once identified, are removed from the primary speech text or rendered unintelligible to produce a censored record of the primary speech text. Copies of such censored primary speech text elements may be transmitted and stored with reduced security precautions.
20 Claims
1. A method for the preparation of a censored recording of audio data originating from a voice source in the form of either a live audio stream or a prior recording, such censored recording excluding censored portions of the original voice source comprising the steps of:
a) receiving said audio data within a volatile random access memory of a computer;
b) searching the audio data within the volatile random access memory to identify target audio data for censoring; and
c) transcribing the audio data from within the volatile random access memory to a recording medium through a filter which omits transcription of such identified target audio data, wherein no durable or persistent version of the audio data reflecting the content of the voice source is created in the course of preparing the censored recording.
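As a rough illustration of steps a) to c) of claim 1, the following Python sketch holds audio frames only in process memory, flags target frames, and writes the remaining frames to the output. The frame layout, the `is_target` predicate, and the function name `write_censored` are assumptions made for illustration; none of them appear in the claims, and a real system would also have to prevent swapping or logging of the buffer.

```python
import io
from typing import Callable, Iterable, List, Tuple

# Hypothetical frame layout: (time stamp in seconds, raw audio bytes).
Frame = Tuple[float, bytes]


def write_censored(frames: Iterable[Frame],
                   is_target: Callable[[Frame], bool],
                   medium: io.BufferedIOBase) -> int:
    """Copy frames to `medium`, skipping frames flagged by `is_target`.

    Returns the number of frames actually written.
    """
    # a) receive the audio into an in-memory buffer only
    buffered: List[Frame] = list(frames)
    # b) search the buffer for target audio data
    targets = {i for i, frame in enumerate(buffered) if is_target(frame)}
    # c) transcribe to the recording medium through a filter
    written = 0
    for i, (_, payload) in enumerate(buffered):
        if i in targets:
            continue  # omitted from the censored recording
        medium.write(payload)
        written += 1
    return written
```

In this sketch the only persistent output is whatever reaches `medium`; the unfiltered frames exist solely in the `buffered` list and are discarded when the function returns.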
2. A method for the preparation of a censored recording of audio data originating from a voice source in the form of either a live audio stream or a recording, such censored recording excluding censored portions of the original voice source, comprising the steps of:
a) receiving the audio data into a computer having a processor which places the audio data in a first audio version volatile memory for temporary storage as either analog or digitized audio data, such stored audio data being associated with time stamped markers to provide identification for the location of portions of the audio data;
b) passing the audio data through a speech-to-text engine to produce a resulting full or partial “text” version of the audio data, wherein the audio text is identified as words including numbers or pauses which are associated with time stamped markers so as to associate such audio text with the stored audio data;
c) identifying candidate target data for censoring in the audio data, wherein the “candidate target data” may include pauses, words including numbers and fragments thereof by comparison of the audio data with a pre-established set of characteristics for target data;
d) identifying target data amongst candidate target data based upon pre-established characteristics for target data or based upon such pre-established characteristics and external context audio data in the form of validation terms that precede or follow the candidate target data;
e) identifying further target data and associated time stamped markers using elements of previously found target data as dynamic word strings, and
f) transcribing the audio data within the first volatile random access memory to a recording medium through a filter which omits transcription of such identified target audio data.
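Step e) of claim 2, which the abstract describes as iteratively using portions of already identified target elements to locate further ones, might be sketched in Python as follows. This is one possible reading, not the patented implementation: the token layout, the fixed fragment length `frag_len`, and the helper names `runs` and `expand_targets` are all hypothetical.

```python
from typing import List, Set, Tuple

# Hypothetical token layout: (recognized word, time stamp in seconds).
Token = Tuple[str, float]


def runs(indices: Set[int]) -> List[List[int]]:
    """Group indices into runs of consecutive positions."""
    out: List[List[int]] = []
    for i in sorted(indices):
        if out and out[-1][-1] == i - 1:
            out[-1].append(i)
        else:
            out.append([i])
    return out


def expand_targets(tokens: List[Token], seeds: Set[int],
                   frag_len: int = 3) -> Set[int]:
    """Grow the set of target token indices until no new match is found."""
    words = [w for w, _ in tokens]
    found = set(seeds)
    changed = True
    while changed:
        changed = False
        # Build dynamic word strings from every run of known targets.
        fragments = set()
        for run in runs(found):
            for s in range(len(run) - frag_len + 1):
                fragments.add(tuple(words[j] for j in run[s:s + frag_len]))
        # Scan the transcript for other occurrences of those strings.
        for s in range(len(words) - frag_len + 1):
            if tuple(words[s:s + frag_len]) in fragments:
                new = set(range(s, s + frag_len)) - found
                if new:
                    found |= new
                    changed = True
    return found
```

The time stamps of the returned indices (`tokens[i][1]`) would then identify the audio portions for the filter in step f) to omit. The loop repeats because newly found targets can themselves supply fragments that match elsewhere.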
8. The method as in claim 8 wherein candidate target data is initially identified as such based upon the presence of the pause occurring adjacent to the utterance of four numerals followed by the utterance of at least three numerals within one word from the pause.
13. A method for the preparation of a censored recording of audio data originating from a voice source in the form of either a live audio stream or a recording, such censored recording excluding censored portions of the original voice source, comprising the steps of:
a) receiving the audio data into a computer having a processor which places the audio data in a first audio version volatile memory for temporary storage as either analog or digitized audio data, such stored audio data being associated with time stamped markers to provide identification for the location of portions of the audio data;
b) passing the audio data through a speech-to-text engine to produce a resulting full or partial “text” version of the audio data, wherein the audio text is identified as words including numbers, or pauses which are associated with time stamped markers so as to associate such audio text with the stored audio data;
c) identifying candidate target data for censoring in the audio data, wherein the “candidate target data” may include pauses, words, numbers, and fragments thereof by comparison of the audio data with a pre-established set of characteristics for target data;
d) identifying target data amongst candidate target data based upon pre-established characteristics for target data or based upon such pre-established characteristics and external context audio data in the form of validation terms that precede or follow the candidate target data; and
e) transcribing the audio data within the first volatile random access memory to a recording medium through a filter which omits transcription of such identified target audio data, wherein candidate target data is initially identified as such based upon the presence of a pause within the audio data.
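Steps c) and d) of claim 13 could be approximated as below: a pause marks the following run of number words as a candidate, and the candidate is confirmed only if a validation term appears shortly before or after it. The `<pause>` marker, the word lists, the `window` size, and the function name `find_targets` are assumptions chosen for this sketch and are not taken from the claims.

```python
from typing import List, Set

PAUSE = "<pause>"  # hypothetical marker emitted by a speech-to-text engine
NUMBER_WORDS = set("zero one two three four five six seven eight nine".split())
VALIDATION_TERMS = {"card", "account", "expiry"}  # hypothetical examples


def find_targets(words: List[str], window: int = 4) -> Set[int]:
    """Return indices of number words confirmed as target data."""
    targets: Set[int] = set()
    for i, word in enumerate(words):
        if word != PAUSE:
            continue
        # c) a pause makes the run of numbers that follows it a candidate
        j = i + 1
        run = []
        while j < len(words) and words[j] in NUMBER_WORDS:
            run.append(j)
            j += 1
        if not run:
            continue
        # d) confirm using validation terms before or after the candidate
        context = words[max(0, i - window):i] + words[j:j + window]
        if VALIDATION_TERMS & set(context):
            targets.update(run)
    return targets
```

For instance, the numbers in "my card number is <pause> four one two seven" would be flagged, while "I waited <pause> two hours" would not, because no validation term is nearby.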
20. A method for the preparation of a censored recording of audio data originating from a voice source in the form of either a live audio stream or a recording, such censored recording excluding censored portions of the original voice source, the censored portions comprising number target data in the form of number strings, comprising the steps of:
a) receiving the audio data containing words and number target data in the form of number strings into a computer having a processor which places the audio data into a first audio version memory for storage as either analog or digitized audio data, such stored audio data being associated with time stamped markers to provide identification for the location of portions of the audio data;
b) passing the audio data through a speech-to-text engine to produce a resulting full or partial audio “text” version of the audio data, wherein the audio text as identified includes number strings which may be of various lengths and wherein the number strings potentially erroneously contain one or more words interspersed between the numbers which words correspond to numbers in the corresponding string saved as part of the audio data in the first audio version memory, the audio text being associated with time stamped markers so as to associate such audio text with the stored audio data;
c) identifying numeric target data in the form of said number strings for censoring in the audio data by comparison of the audio data with a pre-established size for such number strings in terms of the total number of words and numbers within the string, and
d) transcribing the audio data within the first volatile random access memory to a recording medium through a filter which omits transcription of such identified numeric target data, wherein numeric target data is identified as such based upon the length of a given number string counting an interspersed word as if such word were a number.
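The length test in claim 20 can be illustrated with a short Python sketch in which a word misrecognized inside a number string (for example "to" for "two") still counts toward the string's length. The word list, the `min_len` and `max_gap` parameters, and the function name `number_strings` are hypothetical; the claim only requires comparison against a pre-established size.

```python
from typing import List, Tuple

NUMBER_WORDS = set("zero one two three four five six seven eight nine".split())


def number_strings(words: List[str], min_len: int = 5,
                   max_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) spans, end exclusive, of number strings.

    A span starts and ends on a number word. Up to `max_gap` consecutive
    non-number words between numbers are tolerated and counted in the length.
    """
    spans: List[Tuple[int, int]] = []
    i, n = 0, len(words)
    while i < n:
        if words[i] not in NUMBER_WORDS:
            i += 1
            continue
        end = i + 1  # exclusive end; the last token in the span is a number
        gap = 0
        j = i + 1
        while j < n:
            if words[j] in NUMBER_WORDS:
                end = j + 1
                gap = 0
            else:
                gap += 1
                if gap > max_gap:
                    break
            j += 1
        # Length counts interspersed words as if they were numbers.
        if end - i >= min_len:
            spans.append((i, end))
        i = end
    return spans
```

With `min_len=5`, the phrase "four one to seven nine" qualifies only because the interspersed "to" is counted; its four recognized numbers alone would fall short of the threshold.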
Specification