System and method for hindering undesired transmission or receipt of electronic messages
First Claim
1. A method of hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising the steps of:
- determining that transmission or receipt of at least one specific electronic message is undesirable;
automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof, wherein said automatically extracting includes automatically identifying and storing a text string signature contained within an undesirable electronic message, said text string signature being statistically unlikely to be found in desirable electronic messages;
scanning one or more inbound and outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof wherein said scanning includes searching for said text string signature within said inbound and outbound messages; and
taking appropriate action, responsive to the scanning step.
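As an illustration of the extracting and scanning steps of claim 1, the following minimal sketch picks, from an undesirable message, the fixed-length substring whose character trigrams are rarest in a corpus of desirable messages, and then searches other messages for that substring. The trigram model, the function names `build_trigram_counts`, `extract_signature` and `scan`, and the default signature length are assumptions made for illustration only; the claim does not specify how statistical unlikelihood is estimated.

```python
from collections import Counter
import math


def build_trigram_counts(desirable_messages):
    """Count character trigrams over a corpus of desirable messages."""
    counts = Counter()
    for msg in desirable_messages:
        for i in range(len(msg) - 2):
            counts[msg[i:i + 3]] += 1
    return counts


def log_likelihood(text, counts, total):
    """Add-one smoothed sum of log trigram frequencies (lower = rarer)."""
    score = 0.0
    for i in range(len(text) - 2):
        score += math.log((counts[text[i:i + 3]] + 1) / (total + 1))
    return score


def extract_signature(bad_message, counts, length=12):
    """Return the window of `length` characters least likely in desirable mail."""
    if len(bad_message) <= length:
        return bad_message
    total = sum(counts.values())
    windows = (bad_message[i:i + length]
               for i in range(len(bad_message) - length + 1))
    return min(windows, key=lambda w: log_likelihood(w, counts, total))


def scan(message, signatures):
    """Return the stored signatures that occur in an inbound or outbound message."""
    return [s for s in signatures if s in message]
```

A caller would build the counts once from known-good traffic, extract one signature per reported message, and pass every inbound and outbound message through `scan` before deciding what action to take.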
Abstract
A system and method of hindering an undesirable transmission or receipt of electronic messages within a network of users includes the steps of determining that transmission or receipt of at least one specific electronic message is undesirable; automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof; scanning one or more inbound and/or outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof; and taking appropriate action, responsive to the scanning step.
42 Claims
-
1. A method of hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising the steps of:
-
determining that transmission or receipt of at least one specific electronic message is undesirable;
automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof, wherein said automatically extracting includes automatically identifying and storing a text string signature contained within an undesirable electronic message, said text string signature being statistically unlikely to be found in desirable electronic messages;
scanning one or more inbound and outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof wherein said scanning includes searching for said text string signature within said inbound and outbound messages; and
taking appropriate action, responsive to the scanning step.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 27, 28, 29, 30, 31
extracting a message body;
transforming the message body into an invariant form;
scanning the invariant form for exact or near matches to the detection data; and
determining, for each match, a level of match.
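The four scanning sub-steps listed above (extracting a message body, transforming it into an invariant form, scanning for exact or near matches, and determining a level of match) can be sketched as follows. This is only a sketch under stated assumptions: the invariant form is taken to be lowercase text with every non-alphanumeric character removed, the level of match is taken to be the fraction of the signature covered by the longest common run, and the names `extract_body`, `to_invariant` and `match_level` are hypothetical.

```python
import re
from difflib import SequenceMatcher


def extract_body(raw_message):
    """Assume headers and body are separated by the first blank line."""
    _head, sep, body = raw_message.partition("\n\n")
    return body if sep else raw_message


def to_invariant(body):
    """Assumed invariant form: lowercase, letters and digits only."""
    return re.sub(r"[^a-z0-9]", "", body.lower())


def match_level(invariant, signature):
    """1.0 for an exact match, else the matched fraction of the signature."""
    if signature in invariant:
        return 1.0
    matcher = SequenceMatcher(None, invariant, signature, autojunk=False)
    block = matcher.find_longest_match(0, len(invariant), 0, len(signature))
    return block.size / len(signature)
```

The signature itself is assumed to have been stored in the same invariant form, so that inserted spacing or punctuation in a variant does not defeat the search.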
-
-
28. The method of claim 1 wherein the taking step comprises the step of taking appropriate action, upon discovering the presence of the at least one specific electronic message or variants thereof.
-
29. The method of claim 28 wherein the taking step comprises the step of labeling the at least one specific electronic message or variants thereof as undesirable or confidential.
-
30. The method of claim 28 wherein the taking step comprises the step of removing the at least one specific electronic message or variants thereof.
-
31. The method of claim 27 wherein the taking step comprises the step of taking appropriate action for each determined level of match, responsive to one or more user preferences.
-
14. A method of hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising the steps of:
-
determining that transmission or receipt of at least one specific electronic message is undesirable;
automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof;
scanning one or more inbound and/or outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof;
taking appropriate action, responsive to the scanning step; and
storing the extracted detection data, wherein the extracting step comprises the step of extracting a signature from the at least one specific electronic message;
wherein the storing step comprises the step of storing the signature in at least one signature database;
wherein the signature database comprises a plurality of signature clusters, each cluster including data corresponding to substantially similar electronic messages;
wherein each of the signature clusters comprises a character sequence component having scanning information and an archetype component having identification information about particular signature variants; and
wherein the scanning information includes a search character sequence for a particular electronic message and extended character sequence information for all the electronic messages represented in the cluster and wherein the identification information includes a pointer to a full text stored copy of an electronic message relating to a particular signature variant, a hashblock of the electronic message, and alert data corresponding to specific instances where a copy of the electronic message was received and the proliferation of which was reported as undesirable by an alert user.
Dependent claims: 19, 20, 21, 22, 23, 24, 25, 26
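The signature database recited in claim 14 can be pictured as nested records. The sketch below is one possible layout, not the patented implementation: the class names are hypothetical, the `beginoffset`, `regionlength` and `crc` fields borrow their names from claim 24, `receivetime` borrows its name from claim 26, and modelling the extended character sequence information as a single record per cluster is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alert:
    receivetime: float  # time at which the reported copy was received
    reporter: str       # hypothetical identifier of the alert user


@dataclass
class Archetype:
    """Identification information for one signature variant."""
    fulltext_pointer: str   # pointer to the stored full-text copy
    hashblock: List[int]    # hashblock of the electronic message
    alerts: List[Alert] = field(default_factory=list)


@dataclass
class ExtendedSequenceInfo:
    beginoffset: int
    regionlength: int
    crc: int


@dataclass
class CharacterSequence:
    """Scanning information for a cluster."""
    search_sequence: str
    extended: ExtendedSequenceInfo


@dataclass
class SignatureCluster:
    sequence: CharacterSequence
    archetypes: List[Archetype] = field(default_factory=list)


@dataclass
class SignatureDatabase:
    clusters: List[SignatureCluster] = field(default_factory=list)

    def matching_clusters(self, text: str) -> List[SignatureCluster]:
        """Clusters whose search character sequence occurs in the text."""
        return [c for c in self.clusters
                if c.sequence.search_sequence in text]
```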
scanning the specific electronic message for any signatures in the at least one signature database; and
comparing, responsive to finding a matching signature in the scanning step, the matching signature to each message variant in a matching cluster.
-
-
20. The method of claim 19 wherein the comparing step comprises the steps of:
-
computing a hashblock for the specific electronic message; and
comparing the computed hashblock with variant hashblocks in the identification information of each archetype component.
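A minimal sketch of the two steps of claim 20 follows. The claims do not define how a hashblock is built, so the sketch assumes it is a tuple of truncated SHA-1 digests over consecutive fixed-size chunks of the message; the function names and the 32-byte chunk size are likewise hypothetical.

```python
import hashlib


def compute_hashblock(text, chunk=32):
    """Assumed hashblock: one short digest per consecutive chunk of the text."""
    data = text.encode("utf-8")
    return tuple(
        hashlib.sha1(data[i:i + chunk]).hexdigest()[:8]
        for i in range(0, len(data), chunk)
    )


def find_exact_variant(hashblock, variant_hashblocks):
    """Return the id of the first variant whose stored hashblock is identical.

    `variant_hashblocks` maps a variant id to its stored hashblock.
    """
    for variant_id, stored in variant_hashblocks.items():
        if stored == hashblock:
            return variant_id
    return None
```

Because equal hashblocks do not prove equal texts, an exact hashblock match would still be confirmed against the stored full-text copy, as claim 21 goes on to require.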
-
-
21. The method of claim 20 further comprising the steps of:
-
if an exact variant hashblock match is found, retrieving the full text stored copy of the variant match using the pointer, and if the full text stored copy of the variant match and the full text of the specific electronic message are deemed sufficiently similar to regard the specific electronic message as an instance of the variant, extracting alert data from the specific electronic message and adding it to the alert data for the variant match;
else if an exact variant hashblock match is not found or the full text of the specific electronic message is found to be insufficiently similar to any of the variants in the database, determining whether the specific electronic message is sufficiently similar to any existing cluster;
if the specific electronic message is sufficiently similar to an existing cluster, computing new identification information associated with the specific electronic message;
else if the specific electronic message is not determined to be sufficiently similar to an existing cluster, creating a new cluster for the specific electronic message.
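The branching recited in claim 21 can be sketched as a single filing routine. Several simplifications are assumed here: clusters and variants are plain dictionaries, the full text is stored inline rather than behind a pointer, all clusters are searched rather than only the cluster matched in claim 19, and the two similarity tests are supplied by the caller because the claims do not define them. The name `file_reported_message` is hypothetical.

```python
def file_reported_message(text, hashblock, alert, clusters,
                          texts_similar, similar_to_cluster):
    """File a reported message and say which branch of the claim applied."""
    # Branch 1: exact hashblock match, confirmed against the stored full text.
    for cluster in clusters:
        for variant in cluster["variants"]:
            if (variant["hashblock"] == hashblock
                    and texts_similar(variant["fulltext"], text)):
                variant["alerts"].append(alert)
                return "known variant"

    new_variant = {"hashblock": hashblock, "fulltext": text, "alerts": [alert]}

    # Branch 2: no confirmed variant, but the message resembles a cluster.
    for cluster in clusters:
        if similar_to_cluster(cluster, text):
            cluster["variants"].append(new_variant)
            return "new variant"

    # Branch 3: nothing similar exists, so start a new cluster.
    clusters.append({"variants": [new_variant]})
    return "new cluster"
```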
-
-
22. The method of claim 21 wherein the determining step comprises the steps of:
-
computing a checksum of a region of the specific electronic message indicated in the extended character sequence information for each cluster; and
comparing the computed checksum with a stored checksum in the extended character sequence information of each cluster.
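The checksum comparison of claim 22 might look like the sketch below. It assumes that the checksum is a CRC-32 as provided by Python's `zlib.crc32`, that the region is described by an offset and a length (the field names `beginoffset` and `regionlength` appear in claim 24), and that the offset is measured from the start of the message text; none of these details is fixed by the claim.

```python
import zlib


def region_crc(text, beginoffset, regionlength):
    """CRC-32 of the region text[beginoffset : beginoffset + regionlength]."""
    region = text[beginoffset:beginoffset + regionlength]
    return zlib.crc32(region.encode("utf-8"))


def region_matches(text, beginoffset, regionlength, stored_crc):
    """True if the message contains the whole region and its CRC agrees."""
    if beginoffset < 0 or beginoffset + regionlength > len(text):
        return False  # the message is too short to contain the region
    return region_crc(text, beginoffset, regionlength) == stored_crc
```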
-
-
23. The method of claim 19 further comprising the step of creating, if no signature match is found, a new cluster for the specific electronic message.
-
24. The method of claim 22 wherein the extended character sequence information includes a beginoffset field, a regionlength field and a CRC field, the method further comprising the steps of:
-
determining, for each cluster, a matching region with a longest regionlength; and
identifying, if the longest regionlength among all the clusters is at least equal to a specified threshold length, a longest regionlength cluster as an archetype cluster to which the specific electronic message archetype is to be added.
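Claim 24's selection rule, sketched under assumptions: each cluster is represented by a list of hypothetical `(beginoffset, regionlength, crc)` tuples, the CRC is taken to be CRC-32, and the function names are illustrative. The sketch finds the longest matching region per cluster and returns the winning cluster only if that length reaches the threshold.

```python
import zlib


def longest_matching_region(text, regions):
    """Longest regionlength among regions whose CRC matches the text, else 0."""
    best = 0
    for beginoffset, regionlength, crc in regions:
        end = beginoffset + regionlength
        if end > len(text):
            continue  # region does not fit inside this message
        if zlib.crc32(text[beginoffset:end].encode("utf-8")) == crc:
            best = max(best, regionlength)
    return best


def choose_archetype_cluster(text, clusters, threshold):
    """Return the id of the cluster with the longest match, or None.

    `clusters` maps a cluster id to its list of region tuples.
    """
    best_id, best_len = None, 0
    for cluster_id, regions in clusters.items():
        length = longest_matching_region(text, regions)
        if length > best_len:
            best_id, best_len = cluster_id, length
    return best_id if best_len >= threshold else None
```

A result of `None` corresponds to the case in which no existing cluster is similar enough and a new cluster would be created instead.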
-
-
25. The method of claim 23 further comprising the step of recomputing the scanning information of the identified cluster.
-
26. The method of claim 14 wherein the alert data includes a receivetime field having a time at which a copy was originally received and wherein the method further comprises the steps of:
-
periodically comparing the receivetime field of all variants of each signature cluster with the current time; and
removing a signature cluster in which none of the receivetime fields are more recent than a predetermined date and time.
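The periodic clean-up of claim 26 reduces to a filter over the clusters. In this sketch each cluster is assumed to be a list of variants, each variant a list of `receivetime` values in seconds since the epoch; the function name `prune_clusters` and the dictionary layout are hypothetical.

```python
def prune_clusters(clusters, cutoff):
    """Keep only clusters with at least one receivetime later than `cutoff`.

    `clusters` maps a cluster id to a list of variants; each variant is a
    list of receivetime values taken from its alert data.
    """
    return {
        cluster_id: variants
        for cluster_id, variants in clusters.items()
        if any(t > cutoff for receivetimes in variants for t in receivetimes)
    }
```

A scheduler would call this at a fixed interval with `cutoff` set to the current time minus the chosen retention period.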
-
-
32. A method of hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising the steps of:
-
determining that transmission or receipt of at least one specific electronic message is undesirable;
automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof;
scanning one or more inbound and/or outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof; and
taking appropriate action, responsive to the scanning step;
wherein the scanning step comprises the steps of:
extracting a message body;
transforming the message body into an invariant form;
scanning the invariant form for exact or near matches to the detection data; and
determining, for each match, a level of match, and wherein the determining step comprises the steps of:
finding the longest regional matches for each match;
computing hashblock similarities between a hashblock of the scanned message and hashblocks of each of the extracted detection data;
receiving one or more user preferences; and
determining a level of match responsive to the finding, computing and receiving steps.
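One way to combine the three inputs of claim 32's determining step is sketched below. The claim does not say how hashblock similarity is measured or how the inputs are combined, so this sketch assumes Jaccard similarity over the hashes, takes the better of the region score and the hashblock score, and reads two hypothetical threshold values, `"high"` and `"medium"`, from the user preferences.

```python
def hashblock_similarity(a, b):
    """Assumed measure: Jaccard similarity of the two sets of hashes."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def level_of_match(longest_region, message_length,
                   message_hashblock, stored_hashblocks, prefs):
    """Map the region and hashblock evidence to a level chosen by the user."""
    region_score = longest_region / message_length if message_length else 0.0
    hash_score = max(
        (hashblock_similarity(message_hashblock, h) for h in stored_hashblocks),
        default=0.0,
    )
    score = max(region_score, hash_score)
    if score >= prefs["high"]:
        return "high"
    if score >= prefs["medium"]:
        return "medium"
    return "none"
```

The returned level could then drive the action taken, for example removing a message at the high level and only labeling it at the medium level.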
-
-
33. A program storage device, readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for hindering an undesirable transmission or receipt of electronic messages within a network of users, the method comprising the steps of:
-
determining that transmission or receipt of at least one specific electronic message is undesirable;
automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof, wherein said automatically extracting includes automatically identifying and storing a text string signature contained within an undesirable electronic message, said text string signature being statistically unlikely to be found in desirable electronic messages;
scanning one or more inbound and outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof wherein said scanning includes searching for said text string signature within said inbound and outbound messages; and
taking appropriate action, responsive to the scanning step.
-
-
34. A system for hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising:
-
means for determining that transmission or receipt of at least one specific electronic message is undesirable;
means for automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof;
means for scanning one or more inbound and/or outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof;
means for taking appropriate action, responsive to the scanning means, further comprising a means for storing the extracted detection data; and
means for storing the extracted detection data;
wherein the extracting means comprise means for extracting a signature from the at least one specific electronic message;
wherein the storing means comprise means for storing the signature in at least one signature database;
wherein the signature database comprises a plurality of signature clusters, each cluster including data corresponding to substantially similar electronic messages;
wherein each of the signature clusters comprises a character sequence component having scanning information and an archetype component having identification information about particular signature variants; and
wherein the scanning information includes a search character sequence for a particular electronic message and extended character sequence information for all the electronic messages represented in the cluster and wherein the identification information includes a pointer to a full text stored copy of an electronic message relating to a particular signature variant, a hashblock of the electronic message, and alert data corresponding to specific instances where a copy of the electronic message was received and the proliferation of which was reported as undesirable by an alert user.
Dependent claims: 35, 36, 37, 38, 39, 40, 41
means for scanning the specific electronic message for any signatures in the at least one signature database; and
means for comparing, responsive to finding a matching signature by the scanning means, the matching signature to each message variant in a matching cluster.
-
-
36. The system of claim 35 wherein the comparing means comprise:
-
means for computing a hashblock for the specific electronic message; and
means for comparing the computed hashblock with variant hashblocks in the identification information of each archetype component.
-
-
37. The system of claim 36 further comprising:
-
means, if an exact variant hashblock match is found, for retrieving the full text stored copy of the variant match using the pointer, means, if the full text stored copy of the variant match and the full text of the specific electronic message are deemed sufficiently similar to regard the specific electronic message as an instance of the variant, for extracting alert data from the specific electronic message and adding it to the alert data for the variant match; and
means, else if an exact variant hashblock match is not found or the full text of the specific electronic message is found to be insufficiently similar to any of the variants in the database, for determining whether the specific electronic message is sufficiently similar to any existing cluster;
means, if the specific electronic message is sufficiently similar to an existing cluster, for computing new identification information associated with the specific electronic message; and
means, else if the specific electronic message is not determined to be sufficiently similar to an existing cluster, for creating a new cluster for the specific electronic message.
-
-
38. The system of claim 37 wherein the determining means comprise:
-
means for computing a checksum of a region of the specific electronic message indicated in the extended character sequence information for each cluster; and
means for comparing the computed checksum with a stored checksum in the extended character sequence information of each cluster.
-
-
39. The system of claim 35 further comprising means for creating, if no signature match is found, a new cluster for the specific electronic message.
-
40. The system of claim 38 wherein the extended character sequence information includes a beginoffset field, a regionlength field and a CRC field, the system further comprising:
-
means for determining, for each cluster, a matching region with a longest regionlength; and
means for identifying, if the longest regionlength among all the clusters is at least equal to a specified threshold length, a longest regionlength cluster as an archetype cluster to which the specific electronic message archetype is to be added.
-
-
41. The system of claim 39 further comprising means for recomputing the scanning information of the identified cluster.
-
42. A system for hindering an undesirable transmission or receipt of electronic messages within a network of users, comprising:
-
means for determining that transmission or receipt of at least one specific electronic message is undesirable;
means for automatically extracting detection data that permits detection of the at least one specific electronic message or variants thereof;
means for scanning one or more inbound and/or outbound messages from at least one user for the presence of the at least one specific electronic message or variants thereof;
means for taking appropriate action, responsive to the scanning means;
wherein the scanning means comprise:
means for extracting a message body;
means for transforming the message body into an invariant form;
means for scanning the invariant form for exact or near matches to the detection data; and
means for determining, for each match, a level of match, and wherein the determining means comprise:
means for finding the longest regional matches for each match;
means for computing hashblock similarities between a hashblock of the scanned message and hashblocks of each of the extracted detection data;
means for receiving one or more user preferences; and
means for determining a level of match responsive to the finding, computing and receiving steps.
-
Specification