Apparatus for preventing automatic generation of a chain reaction of messages if a prior extracted message is similar to current processed message
Abstract
A digital data processing system is provided with an information extracting portion or step for extracting information from each message processed by an entity of the system, where the extracted information permits that message or a similar message to be recognized. The system further includes a storage portion or step of storing the extracted information in a database of extracted information. The database has the extracted information for each message stored in an entry associated with the message. The invention further provides a comparison portion or step for comparing each message received or originated by the entity against the database entries stored in the storage segment and, if an entry is found to be sufficiently similar to the received message, for preventing the received message from triggering the generation and forwarding of a new message, thereby avoiding the creation of a network chain reaction or a maelstrom.
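The extract, store, and compare steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the word-shingle extraction, the Jaccard similarity measure, the 0.6 threshold, and the names `extract_info`, `similarity`, and `MaelstromGuard` are all assumptions introduced here.

```python
import re

def extract_info(message, n=3):
    """Return the set of word n-grams (shingles) found in the message."""
    words = re.findall(r"[a-z0-9]+", message.lower())
    if len(words) < n:
        return frozenset([tuple(words)])
    return frozenset(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def similarity(a, b):
    """Jaccard similarity of two shingle sets (1.0 means identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

class MaelstromGuard:
    """Hypothetical per-entity store of extracted information."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold  # assumed cut-off for "sufficiently similar"
        self.entries = []           # one entry per previously processed message

    def may_trigger_reply(self, message):
        info = extract_info(message)
        if any(similarity(info, stored) >= self.threshold for stored in self.entries):
            return False            # suppress the automatic response
        self.entries.append(info)   # remember this message for later comparisons
        return True
```

An auto-responder would call `may_trigger_reply` before generating a message and stay silent when it returns `False`, which is what breaks the reply loop.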
38 Claims
1. A method of preventing maelstroms in computers and computer networks populated by entities capable of outputting messages automatically in response to incoming messages, comprising steps of:
for each message processed by an entity that has the potential to trigger an occurrence of a maelstrom, extracting information from the message for permitting that message or a similar message to be recognized;
storing the extracted information for that message such that it is accessible by the entity;
for each further message processed by the entity, comparing a currently processed message against the stored information for a set of messages that have been previously processed by the entity;
if the stored information for a previously processed message matches exactly to the information extracted from the currently processed message, preventing the currently processed message from triggering the generation of a new message; and
if the stored information for a previously processed message similarly matches, but does not exactly match, the information extracted from the currently processed message, preventing the currently processed message from triggering the generation of a new message.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 10
9. A method of preventing maelstroms in computers and computer networks populated by entities capable of outputting messages automatically in response to incoming messages, comprising steps of:
for each message processed by an entity that may potentially be forwarded to another entity, extracting information from the message for permitting that message or a similar message to be recognized;
storing the extracted information for that message such that it is accessible by the entity;
for each further message processed by the entity, comparing a currently processed message against the stored information for a set of messages that have been previously processed by the entity; and
if the stored information for a previously processed message is sufficiently similar to the information extracted from the currently processed message, preventing the currently processed message from triggering the generation of a new message, wherein the step of extracting includes a preliminary step of identifying at least one portion of the message as being a special block, and wherein the step of extracting treats the special block as being an indivisible unit when extracting the information.
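One possible reading of the "indivisible unit" limitation is that a candidate byte window may contain a whole special block or none of it, but may never start or end inside one. The sketch below illustrates that reading only. The `[IMG:nnnn]` marker, the `SPECIAL_BLOCK` pattern, and both function names are hypothetical, since the claim does not say how special blocks are identified.

```python
import re

# Assumption: special blocks are marked like "[IMG:0001]" in the message bytes.
SPECIAL_BLOCK = re.compile(rb"\[IMG:\d+\]")

def find_special_blocks(data):
    """Return (start, end) byte offsets of every special block."""
    return [m.span() for m in SPECIAL_BLOCK.finditer(data)]

def candidate_windows(data, length):
    """Return fixed-length windows whose edges do not fall inside a special block."""
    blocks = find_special_blocks(data)
    windows = []
    for start in range(len(data) - length + 1):
        end = start + length
        # A window cuts a block when either of its edges lies strictly inside it.
        cuts_block = any(b0 < start < b1 or b0 < end < b1 for b0, b1 in blocks)
        if not cuts_block:
            windows.append((start, end))
    return windows
```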
11. A method of preventing maelstroms in computers and computer networks populated by entities capable of outputting messages automatically in response to incoming messages, comprising steps of:
for each message processed by an entity that may potentially be forwarded to another entity, extracting information from the message for permitting that message or a similar message to be recognized;
storing the extracted information for that message such that it is accessible by the entity;
for each further message processed by the entity, comparing a currently processed message against the stored information for a set of messages that have been previously processed by the entity; and
if the stored information for a previously processed message is sufficiently similar to the information extracted from the currently processed message, preventing the currently processed message from triggering the generation of a new message, wherein the step of extracting information includes a preliminary step of filtering all or a part of the message to generate a filtered message, and further includes a preliminary step of identifying at least one portion of the message as being a special block, wherein the step of extracting information includes a step of extracting at least one signature pattern comprised of at least one byte sequence from the filtered message, and further treats the special block as being an indivisible unit when extracting the at least one signature pattern.
12. A method for preventing an occurrence of a maelstrom in a computer network populated by entities capable of generating an outgoing message automatically in response to an incoming message, comprising steps of:
providing a signature database for storing entries, each entry corresponding to a message processed by the entity and containing data for enabling a subsequent instance of a same message or a similar message to be identified;
for each message processed by the entity, extracting a signature from the message and comparing the extracted signature to the stored entries to determine if a match occurs;
if a match occurs, updating the matching stored entry and making a determination as to whether to generate a new signature database entry;
if not, determining whether to continue processing the message or not to continue processing the message;
if processing is to continue, transferring the message to a next message processing stage, else if processing is to terminate, terminating the processing of the message;
if the determination as to whether to generate a new signature database entry is to generate the entry, generating the new signature database entry by first extracting the new entry from all or a portion of the message, storing the extracted new signature database entry, and then determining whether to continue processing the message or not to continue processing the message;
else if a match does not occur, generating a new signature database entry by extracting the new entry from the message, storing the extracted new signature database entry, and then determining whether to continue processing the message or not to continue processing the message.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
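The control flow of claim 12 can be approximated with a small signature database. This is a simplified sketch under stated assumptions: matching is an exact dictionary lookup (a similarity search would replace it in practice), the "continue or terminate" decision is a hit-count limit that the claim does not specify, and the optional creation of an additional entry after a match is omitted. `Entry`, `SignatureDatabase`, and `max_hits` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    signature: str
    hits: int = 1  # number of times a message with this signature was seen

class SignatureDatabase:
    def __init__(self, max_hits=1):
        self.entries = {}
        self.max_hits = max_hits  # assumed policy: stop once exceeded

    def process(self, signature):
        """Return True to pass the message to the next stage, False to terminate."""
        entry = self.entries.get(signature)
        if entry is None:
            # No match: generate and store a new entry, then continue processing.
            self.entries[signature] = Entry(signature)
            return True
        # Match: update the stored entry, then decide whether to continue.
        entry.hits += 1
        return entry.hits <= self.max_hits
```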
storing a corpus of messages that are representative of messages processed by the entity;
providing all or a portion of the message, the message having at least one portion comprised of a sequence of bytes that are likely to remain substantially invariant from a first instance of the message to a second instance of the message;
selecting at least one candidate signature pattern of the message from the sequence of bytes and constructing a list of unique n-grams from the sequence of bytes;
for each of the unique n-grams, estimating a probability of an occurrence of the unique n-gram within sequences of bytes obtained from the stored corpus of messages;
for each candidate signature pattern that is comprised of one or more of the unique n-grams, estimating a false-positive probability of an occurrence of the candidate signature pattern within the sequences of bytes obtained from the corpus of messages; and
comparing the estimated false-positive probabilities of the candidate signature patterns with one another and with a set of threshold probabilities, the threshold probabilities having values selected to reduce a likelihood of an occurrence of a false-positive indication during the use of any signature pattern with a false-positive probability less than the threshold.
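The estimation steps above can be illustrated with a deliberately simple model. The sketch assumes a first-order (bigram) chain over bytes with add-one smoothing, which is a stand-in chosen for brevity and not the estimator described in the patent. The function names and the log-probability threshold are hypothetical.

```python
from collections import Counter
import math

def train_counts(corpus):
    """Count single bytes and adjacent byte pairs across the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for text in corpus:
        unigrams.update(text)                # iterating bytes yields ints
        bigrams.update(zip(text, text[1:]))
    return unigrams, bigrams

def log_fp_probability(candidate, unigrams, bigrams):
    """Log-probability that `candidate` appears at a given position by chance."""
    if not candidate:
        raise ValueError("candidate must contain at least one byte")
    total = sum(unigrams.values())
    # Add-one smoothing over the 256 possible byte values.
    logp = math.log((unigrams[candidate[0]] + 1) / (total + 256))
    for prev, cur in zip(candidate, candidate[1:]):
        logp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + 256))
    return logp

def usable_candidates(candidates, unigrams, bigrams, log_threshold):
    """Keep candidates whose estimate is below the threshold, least likely first."""
    scored = [(log_fp_probability(c, unigrams, bigrams), c) for c in candidates]
    return sorted(item for item in scored if item[0] < log_threshold)
```

Under this model a phrase that is common in the corpus scores a higher chance of occurring by accident than an unusual byte sequence, so the unusual sequence is the better signature candidate.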
18. A method as in claim 17, wherein the step of comparing is comprised of steps of:
discarding any candidate signature patterns for which an occurrence of a predetermined number of selected bytes is more common than a predetermined threshold;
evaluating an exact-match probability for remaining candidate signature patterns;
discarding any candidate signature patterns having an exact match false-positive probability that is above an exact match threshold;
retaining candidate signature patterns having the lowest estimated probabilities;
for each remaining candidate signature pattern i, evaluating an m-mismatch false-positive probability, starting with m=1, and incrementing m until the false positive probability exceeds an m-mismatch threshold, setting M_i = m − 1;
for all candidate signature patterns that correspond to a particular message, selecting as one or more best signature patterns those having a largest value of M; and
storing each of the one or more selected best signature patterns for each message as an entry in the signature database for subsequent use in identifying an instance of the message or a modified version of the message.
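The m-mismatch loop in claim 18 can be sketched with a toy probability model. The assumption here is that every byte of a random window matches the corresponding pattern byte independently with the same probability `p`, so the chance of at most m mismatches is a binomial sum. The patent's own estimator is not reproduced, and the function names are hypothetical.

```python
import math

def mismatch_fp(length, m, p):
    """P(a random window matches a `length`-byte pattern with at most m mismatches)."""
    return sum(math.comb(length, k) * p ** (length - k) * (1 - p) ** k
               for k in range(m + 1))

def max_mismatches(length, p, threshold):
    """Increment m from 1 until the estimate exceeds the threshold; return m - 1."""
    m = 1
    while m <= length and mismatch_fp(length, m, p) <= threshold:
        m += 1
    return m - 1

def select_best(candidates, p, threshold):
    """Return the candidate patterns that tolerate the largest number of mismatches."""
    scored = {c: max_mismatches(len(c), p, threshold) for c in candidates}
    best = max(scored.values())
    return [c for c, m in scored.items() if m == best]
```

With uniform bytes (`p = 1/256`), a longer pattern tolerates more mismatches before its false-positive estimate crosses the threshold, so it is preferred.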
19. The method as in claim 18, wherein the selection of the one or more best signature patterns is biased to favor signature patterns that are located in different parts of the message.
20. The method as in claim 18, wherein the selection of the one or more best signature patterns comprises steps of:
(a) an initial step of defining an allowed region of the message body from which signature patterns are to be extracted;
(b) selecting a best signature pattern from the allowed region;
(c) excluding the selected signature pattern and n bytes on either side of the selected signature pattern from further consideration; and
repeating steps b and c until either no further signature patterns can be obtained from the allowed region, or until a maximum number of signature patterns is obtained.
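Steps (a) through (c) amount to a greedy selection with an exclusion margin. The sketch below assumes each candidate is a `(start, end, score)` tuple with half-open byte offsets and that a higher score means a better pattern. That representation and the name `pick_signatures` are illustrative only.

```python
def pick_signatures(candidates, region, margin, max_count):
    """Greedily choose well-separated signature patterns from an allowed region."""
    lo, hi = region
    # (a) keep only candidates that lie entirely inside the allowed region
    remaining = [c for c in candidates if lo <= c[0] and c[1] <= hi]
    chosen = []
    while remaining and len(chosen) < max_count:
        # (b) select the best remaining pattern
        best = max(remaining, key=lambda c: c[2])
        chosen.append(best)
        # (c) exclude the selected pattern plus `margin` bytes on either side
        ex_lo, ex_hi = best[0] - margin, best[1] + margin
        remaining = [c for c in remaining if c[1] <= ex_lo or c[0] >= ex_hi]
    return chosen
```

Because each pick removes its neighbourhood, the chosen patterns end up spread across the message, so a local edit is less likely to destroy all of them.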
21. The method as in claim 12, wherein the steps of extracting include a preliminary step of transforming all or a portion of the message to an invariant form.
22. A method as in claim 12, wherein the further processing of the message includes a step of forwarding the message or a processed form of the message to at least one recipient.
23. The method as in claim 12, and comprising a preliminary step of filtering a message prior to the step of extracting so as to remove insignificant variations between messages.
24. The method as in claim 23, wherein the step of filtering is comprised of at least one of removing message header data; removing multiple consecutive whitespace characters; removing non-alphanumeric characters; or mapping all characters to the same case.
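The four filtering operations listed in claim 24 can be combined into one normalization function. The claim requires only at least one of them, so applying all four is a choice made for this sketch. It also assumes that headers end at the first blank line and that the message is already decoded text. `filter_message` is a hypothetical name.

```python
import re

def filter_message(raw):
    """Reduce a message to a form that ignores insignificant variations."""
    # Assumption: header data is separated from the body by the first blank line.
    _, separator, body = raw.partition("\n\n")
    text = body if separator else raw
    text = re.sub(r"[^A-Za-z0-9\s]", "", text)  # remove non-alphanumeric characters
    text = re.sub(r"\s+", " ", text)            # collapse consecutive whitespace
    return text.strip().lower()                 # map all characters to one case
```

Two copies of a message that differ only in headers, quoting marks, spacing, or capitalization then filter to the same string, so a signature extracted from one also matches the other.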
25. The method as in claim 12, wherein the step of comparing comprises a step of recognizing textual elements that are likely to indicate a prior forwarding of the message.
26. The method as in claim 12, wherein the step of extracting includes a preliminary step of transforming all or a portion of the message to an invariant form, and wherein the step of transforming to an invariant form includes a step of identifying at least one of an inclusion, attachment or non-textual data within the message.
27. A method as in claim 12, wherein the step of extracting includes a preliminary step of identifying at least one portion of the message as being a special block, and wherein the step of extracting treats the special block as being an indivisible unit when extracting the information.
28. The method as in claim 12, wherein the step of extracting information includes a preliminary step of filtering all or a part of the message to generate a filtered message, and wherein the step of extracting includes a step of extracting at least one signature pattern comprised of at least one byte sequence from the filtered message.
29. The method as in claim 12, wherein the step of extracting includes a preliminary step of filtering all or a part of the message to generate a filtered message, and further includes a preliminary step of identifying at least one portion of the message as being a special block, wherein the step of extracting includes a step of extracting at least one signature pattern comprised of at least one byte sequence from the filtered message, and further treats the special block as being an indivisible unit when extracting the at least one signature pattern.
30. A digital data processing system comprising interconnected entities capable of outputting messages automatically in response to incoming messages, said system comprising, in at least one of said entities, a subsystem for preventing an occurrence of a maelstrom, comprising:
a unit for extracting information from messages processed by the entity, the messages being those that have the potential to trigger an occurrence of a maelstrom, the extracted information being chosen so as to minimize a likelihood that a different message would also contain the extracted information;
a memory for storing the extracted information in a database of extracted information, said database having the extracted information for each message stored in an entry associated with the message; and
a unit for comparing each message processed by the entity against the database entries and, if an entry is found to match exactly to the processed message, for preventing triggering the outputting of a new message, and if an entry is found to similarly match, but not exactly match, the processed message, for preventing triggering the outputting of a new message.
Dependent claims: 31, 32, 33, 34, 35, 36
37. A digital data processing system comprising interconnected entities capable of outputting messages automatically in response to incoming messages, said system comprising, in at least one of said entities, a subsystem for preventing an occurrence of a maelstrom, comprising:
a unit for extracting information from messages processed by the entity, the extracted information permitting that message or a similar message to be recognized;
a memory for storing the extracted information in a database of extracted information, said database having the extracted information for each message stored in an entry associated with the message; and
a unit for comparing each message processed by the entity against the database entries and, if an entry is sufficiently similar to the processed message, for preventing triggering the outputting of a new message, wherein said extracting unit operates to first filter all or a part of the message to generate a filtered message, and further operates to identify at least one portion of the message as being a special block, and wherein said extracting unit extracts at least one signature pattern comprised of at least one byte sequence from the filtered message, and treats the identified special block as being an indivisible unit when extracting the at least one signature pattern.
38. A computer program embodied on a computer-readable medium for providing a subsystem to prevent an occurrence of a maelstrom, comprising:
an information extracting segment for extracting information from messages processed by an entity, the messages being those that have the potential to trigger an occurrence of a maelstrom, the extracted information being chosen so as to minimize a likelihood that a different message would also contain the extracted information;
a storage segment for storing the extracted information in a database of extracted information, said database having the extracted information for each message stored in an entry associated with the message; and
a comparison segment for comparing messages processed by the entity against the database entries stored in the storage segment and, if an entry is found to match exactly to the message, for preventing triggering the generation and forwarding of a new message; and
if an entry is found to similarly match, but not exactly match, to the message, for preventing triggering the generation and forwarding of a new message.
Specification