System and method for evaluating a structured message store for message redundancy
Abstract
A system and method for evaluating a structured message store for message redundancy is described. A header and a message body are extracted from each of a plurality of messages maintained in a structured message store. A substantially unique hash code is calculated over at least part of the header and over the message body of each message. The messages are grouped by the hash codes. One such message is identified as a unique message within each group. In a further embodiment, the messages are grouped by conversation thread. The message body for each message within each conversation thread group is compared. At least one such message within each conversation thread group is identified as a unique message.
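The hash-and-group step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a hash algorithm or say which header fields are hashed, so SHA-256 and the fields in HEADER_FIELDS are hypothetical choices, as are the names Message, message_hash, group_by_hash, and pick_unique. Choosing the first message of each group as the unique one is also an assumption.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass

# Assumed header fields; the source only says "at least part of the header".
HEADER_FIELDS = ("from", "to", "subject")


@dataclass
class Message:
    msg_id: str
    header: dict
    body: str


def message_hash(msg: Message) -> str:
    """Hash the selected header fields plus the message body (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    for name in HEADER_FIELDS:
        h.update(msg.header.get(name, "").encode("utf-8"))
        h.update(b"\x00")  # separator so adjacent fields cannot run together
    h.update(msg.body.encode("utf-8"))
    return h.hexdigest()


def group_by_hash(messages):
    """Group messages whose hash codes match."""
    groups = defaultdict(list)
    for msg in messages:
        groups[message_hash(msg)].append(msg)
    return dict(groups)


def pick_unique(groups):
    """Identify one message per group as unique (here: the first one encountered)."""
    return {digest: members[0].msg_id for digest, members in groups.items()}
```

Under these assumptions, two copies of the same message stored in different folders fall into one group, and only the first copy is marked unique.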
28 Claims
1. A computer-readable medium encoded with stored message records for redundancy processing, comprising:
hash codes that are each determined from at least part of a message header plus a message body for each of a plurality of electronic messages; and
metadata to store the electronic messages with matching hash codes into groups and to identify one such electronic message within each group as unique;
further hash codes that are each determined from one or more attachment of at least one such electronic message; and
metadata to identify the electronic messages within each group with matching further hash codes as exact duplicates.
Dependent claims: 2, 3, 4, 5
6. A client system, for storing electronic messages for redundancy processing, comprising:
a plurality of electronic messages that each comprise a message header plus a message body, at least part of which is used to determine a hash code for each electronic message; and
an interface providing access to the electronic messages, wherein the electronic messages with matching hash codes are assigned into groups and one such electronic message within each group is identified as unique, wherein at least one such electronic message further comprises one or more attachments, at least part of which is used to determine a further hash code and the electronic messages within each group with matching further hash codes are identified as exact duplicates.
Dependent claims: 7, 8, 9, 10
11. An analysis system processing electronic messages for redundancy, comprising:
a plurality of electronic messages that each comprise a message header plus a message body wherein at least one such electronic message further comprises one or more attachments; and
a processor configured to determine hash codes from at least a part of each electronic message header plus message body and to assign the electronic messages with matching hash codes into groups, wherein one such electronic message within each group is identified as unique and to determine further hash codes from each attachment, the electronic messages within each group with matching further hash codes are identified as exact duplicates.
Dependent claims: 12, 13, 14, 15, 16, 17
18. A computer-readable medium encoded with structured records storing electronic messages for redundancy processing, comprising:
structured data for a plurality of electronic messages, comprising at least one of;
a group of electronic messages with matching hash codes; and
a unique electronic message identified within each group; and
wherein the hash codes are determined from at least part of a message header plus a message body for each electronic message and the structured data further comprises one or more attachments with further hash codes for each attachment, wherein the electronic messages within each group with matching further hash codes are identified as exact duplicates.
Dependent claims: 19, 20
21. A process for processing electronic messages for redundancy, comprising:
accessing a plurality of electronic messages that each comprise a message header plus a message body, wherein at least one such electronic message further comprises one or more attachments;
determining hash codes from at least a part of each message header plus message body;
assigning the electronic messages with matching hash codes into groups and identifying one such electronic message within each group as unique;
determining further hash codes from each attachment; and
identifying the electronic messages within each group with matching further hash codes as exact duplicates.
Dependent claims: 22, 23, 24, 25, 26, 27, 28
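The steps of claim 21 can be sketched roughly as follows. This is one possible reading, not the claimed implementation: SHA-256, the hashed header fields, the order-insensitive comparison of attachment hashes, and the label names "unique", "exact_duplicate", and "attachments_differ" are all assumptions made for illustration. The claim itself only says that messages within a group whose further hash codes match are exact duplicates.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: str
    header: dict
    body: str
    attachments: list = field(default_factory=list)  # raw bytes, one item per attachment


def content_hash(msg, fields=("from", "to", "subject")):
    """Hash code over part of the header plus the body (fields are an assumed choice)."""
    h = hashlib.sha256()
    for name in fields:
        h.update(msg.header.get(name, "").encode("utf-8"))
        h.update(b"\x00")
    h.update(msg.body.encode("utf-8"))
    return h.hexdigest()


def attachment_hashes(msg):
    """Further hash codes, one per attachment; sorted so attachment order is ignored."""
    return tuple(sorted(hashlib.sha256(a).hexdigest() for a in msg.attachments))


def classify(messages):
    """Label each message after grouping by content hash and comparing attachment hashes."""
    groups = defaultdict(list)
    for msg in messages:
        groups[content_hash(msg)].append(msg)

    labels = {}
    for members in groups.values():
        seen = set()  # attachment-hash tuples already observed in this group
        for index, msg in enumerate(members):
            key = attachment_hashes(msg)
            if index == 0:
                labels[msg.msg_id] = "unique"
            elif key in seen:
                labels[msg.msg_id] = "exact_duplicate"
            else:
                labels[msg.msg_id] = "attachments_differ"  # hypothetical label
            seen.add(key)
    return labels
```

In this sketch a message that matches on header and body but carries a different attachment is not treated as an exact duplicate, which is why the attachment hashes are checked as a second pass inside each group.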
Specification