Systems and methods for efficient detection of fingerprinted data and information
Abstract
The disclosed embodiments provide systems, methods, and apparatus for efficient detection of fingerprinted content and relate generally to the field of information (or data) leak prevention. Particularly, a compact and efficient repository of fingerprint ingredients is used to analyze content and determine the content's similarity to previously fingerprinted content. Some embodiments employ probabilistic indications regarding the existence of fingerprint ingredients in the repository.
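The abstract does not name a data structure for the compact repository. As a minimal sketch, assuming a Bloom-filter-style bit array is an acceptable stand-in, the following shows how a repository can give probabilistic indications of membership and how a similarity score can be derived from them. The class name `CompactRepository`, the word-shingle fingerprinting in `fingerprints`, and all sizes are illustrative assumptions, not details taken from the patent.

```python
import hashlib


class CompactRepository:
    """Hypothetical Bloom-filter-style store of fingerprint hashes (an assumption)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means definitely absent; True means only "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def fingerprints(text: str, width: int = 5) -> set[bytes]:
    """Hash every run of `width` consecutive words (a simple shingling assumption)."""
    words = text.lower().split()
    return {
        hashlib.sha256(" ".join(words[i:i + width]).encode()).digest()
        for i in range(max(len(words) - width + 1, 1))
    }


def similarity(repo: CompactRepository, text: str) -> float:
    """Fraction of the text's fingerprints that the repository probably holds."""
    fps = fingerprints(text)
    return sum(repo.might_contain(fp) for fp in fps) / len(fps)
```

Because a structure of this kind never reports a stored item as absent, previously fingerprinted content always scores 1.0, while unrelated content can score above zero only through rare false positives.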
15 Claims
1. A system for managing potential transmission of a plurality of electronic documents over a network, the system comprising:
hardware electronic processing means for generating a repository of compact fingerprints comprising a plurality of compact fingerprints, wherein each of the compact fingerprints is generated by:
  selecting a subset of the plurality of electronic documents to which a particular transmission policy is applied,
  hashing the subset of the plurality of electronic documents to generate a plurality of corresponding hashes, and
  generating a compact fingerprint based on the plurality of hashes;
hardware electronic processing means for identifying an electronic document on the electronic network with a scanning engine;
hardware electronic processing means for generating a plurality of new fingerprints of the identified electronic document;
hardware electronic processing means for determining probabilistic matches between the plurality of new fingerprints of the identified electronic document and compact fingerprints stored in the compact fingerprint repository; and
hardware electronic processing means for determining whether to transmit the identified electronic document over the electronic network based, at least in part, on a number of matching fingerprints.
Dependent claims: 2, 3, 4.

5. A system for applying a transmission policy to electronic content transmitted over a network, the system comprising:
one or more hardware electronic processors configured to:
identify electronic content via a scanning engine, the electronic content comprising a plurality of electronic documents;
generate a plurality of compact fingerprints of the identified electronic content identified by the scanning engine, each compact fingerprint generated by:
  selecting a subset of the plurality of electronic documents to which a particular transmission policy is applied,
  hashing the subset of the plurality of electronic documents to generate a plurality of corresponding hashes, and
  generating the compact fingerprint based on the plurality of hashes;
store the compact fingerprints in a compact fingerprint repository;
hash second electronic content identified on the network;
determine probabilistic matches between the hashes and the compact fingerprints stored in the compact fingerprint repository;
identify a transmission policy for the second electronic content identified on the network based, at least in part, on a compact fingerprint matching the hashes that corresponds to the identified transmission policy; and
apply the identified transmission policy to the second electronic content identified on the network.
Dependent claims: 6, 7.

8. A method of transmitting electronic content over a network, the method comprising:
performing the following on one or more hardware electronic processors:
identifying electronic content on the electronic network with a scanning engine, the electronic content comprising a plurality of documents;
generating a plurality of compact fingerprints of the electronic content, each compact fingerprint generated by:
  selecting a subset of the plurality of electronic documents to which a particular transmission policy is applied,
  hashing the subset of the plurality of electronic documents to generate a plurality of corresponding hashes, and
  generating the compact fingerprint based on the plurality of hashes;
storing the generated compact fingerprints in a compact fingerprint repository;
identifying second electronic content on the network;
generating hashes of the second electronic content identified on the network; and
determining whether to transmit the second electronic content identified on the network over the network, based at least in part on a transmission policy corresponding to a compact fingerprint stored in the compact fingerprint repository matching the generated hashes.
Dependent claims: 9, 10, 11.

12. A non-transitory computer-readable medium comprising code configured to cause one or more processors to perform a method of transmitting electronic content over a network, the method comprising:
identifying electronic content on the electronic network, the electronic content comprising a plurality of documents;
generating a plurality of fingerprints for each of the plurality of documents;
generating a plurality of compact fingerprints for a plurality of corresponding transmission policies, each compact fingerprint generated by:
  selecting a subset of the plurality of electronic documents to which a particular transmission policy is applied,
  hashing the subset of the plurality of electronic documents to generate a plurality of corresponding hashes, and
  generating the compact fingerprint based on the plurality of hashes;
storing the plurality of compact fingerprints in a compact fingerprint repository;
determining probabilistic matches between fingerprints of second electronic content and the compact fingerprints stored in the fingerprint repository; and
determining whether to transmit the second electronic content over the electronic network based, at least in part, on a compact fingerprint matching fingerprints of the second electronic content.
Dependent claims: 13, 14, 15.

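As an informal illustration only, the flow common to the independent claims (one compact fingerprint per transmission policy, probabilistic matching of a new document's hashes, and a decision based on the number of matches) might be sketched as below. The bit-set representation, the constants `NUM_BITS` and `NUM_HASHES`, the word-shingle hashing, the `threshold` value, and the policy names are all assumptions made for the example; the claims do not specify them.

```python
import hashlib

NUM_BITS = 1 << 16   # assumed size of each compact fingerprint, in bits
NUM_HASHES = 3       # assumed number of bit positions set per hash


def doc_hashes(text: str, width: int = 4) -> set[bytes]:
    """Hash every run of `width` consecutive words (illustrative choice)."""
    words = text.lower().split()
    return {hashlib.sha256(" ".join(words[i:i + width]).encode()).digest()
            for i in range(max(len(words) - width + 1, 1))}


def bit_positions(h: bytes) -> list[int]:
    # Slice the 32-byte digest into NUM_HASHES independent 8-byte chunks.
    return [int.from_bytes(h[8 * k:8 * k + 8], "big") % NUM_BITS
            for k in range(NUM_HASHES)]


def compact_fingerprint(documents: list[str]) -> int:
    """Fold the hashes of all documents under one policy into a single bit set."""
    bits = 0
    for doc in documents:
        for h in doc_hashes(doc):
            for pos in bit_positions(h):
                bits |= 1 << pos
    return bits


def count_matches(bits: int, hashes: set[bytes]) -> int:
    """Number of hashes whose bit positions are all set (probable matches)."""
    return sum(all(bits >> pos & 1 for pos in bit_positions(h)) for h in hashes)


def decide(repository: dict[str, int], text: str, threshold: int = 3) -> str:
    """Return the first policy with at least `threshold` probable matches, else "allow"."""
    hashes = doc_hashes(text)
    for policy, bits in repository.items():
        if count_matches(bits, hashes) >= threshold:
            return policy
    return "allow"


# Usage: one compact fingerprint per hypothetical policy name.
repository = {
    "block": compact_fingerprint(
        ["the quarterly revenue forecast for the merger is strictly confidential"]
    ),
}
outgoing = "fyi the quarterly revenue forecast for the merger is strictly confidential thanks"
print(decide(repository, outgoing))
```

Keeping one bit set per policy means a match identifies which policy applies without storing the original hashes, at the cost of occasional false positives; requiring several matches before acting is one way to dampen their effect.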
Specification