System And Method for Processing A Message Store For Near Duplicate Messages

  • US 20090307630A1
  • Filed: 08/17/2009
  • Published: 12/10/2009
  • Est. Priority Date: 03/19/2001
  • Status: Active Grant
First Claim

1. A system for processing a message store for near duplicate messages, comprising:

  • a deduper module configured to identify near duplicate messages in a message store, comprising:

    a comparer module configured to compare compound digests taken of metadata for, of content contained in, and of each attachment associated with each of the messages in the message store; and

    a marker module configured to mark each such message having a compound digest not matching the compound digest of any other such message as unique; and

    mark each such message having a compound digest matching the compound digest of at least one other such message as an exact duplicate; and

    a classifier module configured to group those messages remaining unmarked and having similar content into sets that each comprise one or more near duplicate messages, wherein the marker is further configured to designate a first of the near duplicate messages in each of the sets as unique and each remaining near duplicate message in the set as a near duplicate; and

    a processor to execute each of the modules, which are stored on a computer-readable storage medium.
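
The flow recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Message`, `compound_digest`, and `dedupe`, the use of SHA-256, the `difflib` similarity measure, and the 0.8 threshold are all assumptions made for illustration. The claim also leaves open which messages remain unmarked after the digest comparison; this sketch assumes that only the later copies in a matching digest group are marked as exact duplicates, and that every other message is passed to the classifier, which then marks the first message of each similarity set as unique and the rest as near duplicates.

```python
import hashlib
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Message:
    # Hypothetical message shape: header metadata, body text, raw attachments.
    metadata: str
    body: str
    attachments: list = field(default_factory=list)  # list of bytes objects
    mark: str = ""


def compound_digest(msg):
    """Hash the metadata, the body, and each attachment, then hash those digests together."""
    outer = hashlib.sha256()
    parts = [msg.metadata.encode(), msg.body.encode(), *msg.attachments]
    for part in parts:
        outer.update(hashlib.sha256(part).digest())
    return outer.hexdigest()


def dedupe(messages, threshold=0.8):
    """Mark each message as 'unique', 'exact duplicate', or 'near duplicate'."""
    # Digest pass (assumed reading): later copies with a matching compound
    # digest are exact duplicates; everything else stays unmarked for now.
    seen = set()
    candidates = []
    for msg in messages:
        digest = compound_digest(msg)
        if digest in seen:
            msg.mark = "exact duplicate"
        else:
            seen.add(digest)
            candidates.append(msg)

    # Classifier pass: group the remaining messages by body similarity.
    # The first message in each set is unique, the rest are near duplicates.
    sets = []
    for msg in candidates:
        for group in sets:
            ratio = SequenceMatcher(None, group[0].body, msg.body).ratio()
            if ratio >= threshold:
                group.append(msg)
                msg.mark = "near duplicate"
                break
        else:
            sets.append([msg])
            msg.mark = "unique"
    return sets
```

Comparing each candidate only against the first message of a set keeps the sketch short. A production system would likely use a more scalable similarity measure than pairwise string matching.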

  • 10 Assignments