Filtering sensitive information
First Claim
1. A non-transitory machine readable storage medium having instructions embodied thereon, the instructions being executable by one or more processors to perform acts comprising:
obtaining a text representation for a first audio block in a stream of audio blocks using a speech-to-text application, wherein the stream of audio blocks represents a conversation between at least two persons;
analyzing the text representation for the first audio block to generate metadata for the first audio block, wherein the metadata includes timestamps for at least one phrase in the first audio block;
comparing the text representation for the first audio block to pattern rules to identify a first portion of sensitive information in the first audio block, wherein a timestamp for the first portion of the sensitive information is identified in the metadata for the first audio block;
determining that the sensitive information extends into a second portion of sensitive information in an adjacent audio block in the stream of audio blocks;
combining the first audio block with the adjacent audio block to form a composite audio block; and
removing a portion of audio data from the composite audio block that corresponds to the first portion of sensitive information in the first audio block and the second portion of sensitive information in the adjacent audio block while the conversation is occurring between the at least two persons, wherein the portion of audio data is removed in accordance with the timestamp for the first portion of sensitive information in the first audio block and a second timestamp for the second portion of sensitive information in the adjacent audio block.
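The combining and removing steps of this claim can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `AudioBlock` type, the 8000 Hz sample rate, and the choice to "remove" audio by overwriting samples with silence rather than cutting them out.

```python
from dataclasses import dataclass
from typing import List

SAMPLE_RATE = 8000  # assumed sample rate in samples per second


@dataclass
class AudioBlock:
    start: float        # time at which the block begins in the stream, in seconds
    samples: List[int]  # PCM samples (hypothetical representation)


def combine(first: AudioBlock, adjacent: AudioBlock) -> AudioBlock:
    """Concatenate two adjacent blocks into one composite block."""
    return AudioBlock(first.start, first.samples + adjacent.samples)


def remove_span(block: AudioBlock, begin_ts: float, end_ts: float,
                sample_rate: int = SAMPLE_RATE) -> AudioBlock:
    """Return a copy of the block with samples between two stream timestamps silenced."""
    begin = max(0, round((begin_ts - block.start) * sample_rate))
    end = min(len(block.samples), round((end_ts - block.start) * sample_rate))
    cleaned = list(block.samples)
    cleaned[begin:end] = [0] * max(0, end - begin)
    return AudioBlock(block.start, cleaned)


# Usage: the sensitive phrase starts at 0.7 s in the first block and
# ends at 1.3 s, which falls inside the adjacent block.
first = AudioBlock(0.0, [1] * 10)
adjacent = AudioBlock(1.0, [1] * 10)
composite = combine(first, adjacent)
cleaned = remove_span(composite, 0.7, 1.3, sample_rate=10)
```

Working on the composite block means a single span, bounded by the timestamp from the first block and the second timestamp from the adjacent block, covers the whole sensitive phrase.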
Abstract
Technology is described for removing sensitive information. An audio block that represents a portion of a conversation may be identified. A text representation for the audio block may be obtained using a speech-to-text process. The text representation for the audio block may be compared to pattern rules to mark sensitive information in the audio block. A portion of audio data from the audio block marked as sensitive information may be removed in the audio block.
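As a rough illustration of comparing a text representation to pattern rules, the sketch below marks character spans that match a rule. The rule names and regular expressions are hypothetical examples, not rules disclosed in the patent, and the transcript is assumed to contain digits rather than spelled-out numbers.

```python
import re
from typing import List, Tuple

# Hypothetical pattern rules: rule name -> compiled regular expression.
PATTERN_RULES = {
    "card_number": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mark_sensitive(text: str) -> List[Tuple[str, int, int]]:
    """Return (rule name, start offset, end offset) for every rule match."""
    marks = []
    for name, pattern in PATTERN_RULES.items():
        for match in pattern.finditer(text):
            marks.append((name, match.start(), match.end()))
    return sorted(marks, key=lambda mark: mark[1])


# Usage: mark a 16-digit number in a transcript of one audio block.
transcript = "my card is 4111 1111 1111 1111 thanks"
marks = mark_sensitive(transcript)
```

Each mark identifies text only; mapping a mark back to audio requires timing information for the matched words.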
20 Claims
4. A method, using one or more processors, comprising:
obtaining a text representation for a first audio block in a stream of audio blocks using a speech-to-text process, wherein the stream of audio blocks represents a conversation between at least two persons;
comparing the text representation for the first audio block to pattern rules to identify a first portion of target information in the first audio block;
marking the first portion of the target information in the first audio block;
determining that the target information extends into a second portion of sensitive information in an adjacent audio block in the stream of audio blocks;
combining the first audio block with the adjacent audio block to form a composite audio block; and
removing a portion of audio data from the composite audio block marked as the first portion of target information and the second portion of target information in the adjacent audio block while the conversation is occurring between the at least two persons.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13
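One way to picture the step of determining that target information extends into an adjacent audio block is a partial match at the block boundary. The sketch below is an assumption for illustration only: it treats the target as a 16-digit number, assumes the transcripts contain digits, and uses hypothetical helper names. The claim does not prescribe this test.

```python
import re

TARGET_DIGITS = 16  # assumed length of the target number


def trailing_digits(text: str) -> str:
    """Digits at the very end of a transcript, with whitespace removed."""
    match = re.search(r"((?:\d\s*)+)$", text.strip())
    return re.sub(r"\s", "", match.group(1)) if match else ""


def leading_digits(text: str) -> str:
    """Digits at the very start of a transcript, with whitespace removed."""
    match = re.match(r"\s*((?:\d\s*)+)", text)
    return re.sub(r"\s", "", match.group(1)) if match else ""


def extends_into_adjacent(first_text: str, adjacent_text: str) -> bool:
    """True when a partial number ending the first block is completed by the next."""
    head = trailing_digits(first_text)
    tail = leading_digits(adjacent_text)
    return 0 < len(head) < TARGET_DIGITS and len(head) + len(tail) >= TARGET_DIGITS


# Usage: the number is split across two transcripts.
split_detected = extends_into_adjacent(
    "my card number is 4111 1111 11", "11 1111 and the expiry is next"
)
```

When the check succeeds, the two blocks would be combined so the full number can be marked and removed as one span.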
14. A system, comprising:
at least one processor;
at least one memory device including a data store to store a plurality of data and instructions that, when executed, cause the system to;
receive a first audio blocks in a stream of audio blocks that represents a conversation between at least two persons;
generate a text representation of the first audio blocks using a speech-to-text service;
analyze the text representation to generate metadata for the first audio blocks, wherein the metadata includes a timestamp for a phrase in the first audio blocks;
compare the text representation to pattern rules to identify a first portion of sensitive information in the first audio blocks, wherein the timestamp for the first portion of the sensitive information is identified in the metadata for the first audio blocks;
determine that the sensitive information extends into a second portion of sensitive information in an adjacent audio block in the stream of audio blocks;
combine the first audio block with the adjacent audio block to form a composite audio block; and
remove a portion of audio data from the composite audio blocks that contains the first portion of sensitive information and the second portion of sensitive information while the conversation is occurring between the at least two persons.
Dependent claims: 15, 16, 17, 18, 19, 20
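The metadata step can be sketched as a lookup from a pattern match to word timestamps. The `Word` record, the example rule, and the assumption that the speech-to-text service returns per-word start and end times are all hypothetical and serve only to illustrate the idea.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio block
    end: float


# Hypothetical rule: a nine-digit identifier spoken as three groups.
ID_RULE = re.compile(r"\d{3} \d{2} \d{4}")


def sensitive_time_span(words: List[Word],
                        rule: re.Pattern = ID_RULE) -> Optional[Tuple[float, float]]:
    """Return (start, end) timestamps of the first rule match, or None."""
    # Record the character offsets each word occupies in the joined transcript.
    offsets = []
    position = 0
    for word in words:
        offsets.append((position, position + len(word.text)))
        position += len(word.text) + 1  # +1 for the joining space
    text = " ".join(word.text for word in words)
    match = rule.search(text)
    if match is None:
        return None
    # Keep the words whose character range overlaps the match.
    hit = [w for w, (s, e) in zip(words, offsets)
           if s < match.end() and e > match.start()]
    return hit[0].start, hit[-1].end


# Usage with hypothetical per-word timestamps.
words = [
    Word("my", 0.0, 0.2), Word("number", 0.2, 0.6), Word("is", 0.6, 0.8),
    Word("123", 0.8, 1.2), Word("45", 1.2, 1.5), Word("6789", 1.5, 2.1),
    Word("ok", 2.1, 2.3),
]
span = sensitive_time_span(words)
```

The returned pair is the kind of timestamp the removal step would need in order to locate the sensitive audio.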
Specification