System And Method For Identifying Unique And Duplicate Messages
Abstract
A system and method for identifying unique and duplicate messages is provided. Messages are maintained, and a header and message body are extracted from each of the messages. A hash code is calculated for each message over at least part of the header and the body of that message. The messages with matching hash codes are grouped. One message in each group with two or more messages is randomly selected as a unique message. The remaining messages in the group are marked as exact duplicate messages.
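The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the choice of SHA-256, the header fields listed in `HASHED_HEADERS`, the function names, and the label strings are all hypothetical, since the text only says the hash covers "at least part of the header and the body".

```python
import hashlib
import random
from collections import defaultdict
from email import message_from_string

# Assumption: the source does not say which header fields are hashed.
HASHED_HEADERS = ("From", "To", "Subject", "Date")


def extract(raw):
    """Extract the selected header fields and the message body."""
    msg = message_from_string(raw)
    headers = {name: msg.get(name, "") for name in HASHED_HEADERS}
    body = msg.get_payload()  # a str for simple, non-multipart messages
    return headers, body


def hash_code(headers, body):
    """Hash part of the header plus the body (SHA-256 is an assumption)."""
    digest = hashlib.sha256()
    for name in HASHED_HEADERS:
        digest.update(f"{name}:{headers[name]}\n".encode("utf-8"))
    digest.update(body.encode("utf-8"))
    return digest.hexdigest()


def dedupe(raw_messages, rng=None):
    """Group messages by hash code and label each group of two or more."""
    rng = rng or random.Random()
    groups = defaultdict(list)
    for msg_id, raw in raw_messages.items():
        headers, body = extract(raw)
        groups[hash_code(headers, body)].append(msg_id)

    labels = {}
    for ids in groups.values():
        if len(ids) < 2:
            continue  # the text only addresses groups of two or more
        keeper = rng.choice(ids)  # random selection of the unique message
        for msg_id in ids:
            labels[msg_id] = "unique" if msg_id == keeper else "exact duplicate"
    return labels
```

Passing a seeded `random.Random` makes the selection reproducible for testing; messages whose hash code matches no other message are left unlabeled here because the abstract does not say how they are treated.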
20 Claims
1. A system for identifying unique and duplicate messages, comprising:
a database of messages;
an extractor module to extract a header and a message body from each message;
a parser module to calculate a hash code for each message over at least part of the header and the body of that message and to group the messages having matching hash codes;
a deduper module to randomly select one message in each group with two or more messages as a unique message and to mark the remaining messages in the group as exact duplicate messages; and
a processor to execute the modules.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for identifying unique and duplicate messages, comprising:
maintaining messages;
extracting a header and a message body from each message;
calculating a hash code for each message over at least part of the header and the body of that message;
grouping the messages having matching hash codes;
randomly selecting one message in each group with two or more messages as a unique message; and
marking the remaining messages in the group as exact duplicate messages.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
Specification