System and method for identifying and categorizing messages extracted from archived message stores

  • US 20060190493A1
  • Filed: 04/24/2006
  • Published: 08/24/2006
  • Est. Priority Date: 03/19/2001
  • Status: Active Grant
First Claim

1. A system for identifying messages in a message store, comprising:

  • a digester to encode at least part of metadata associated with and at least part of content contained in each of a plurality of messages in a message store by generating a metadata sequence and a content sequence for each message; and

    a comparer to group the messages into sets by similar metadata sequences and similar content sequences and to compare the messages in each set, comprising:

    a unique marker to mark each such message not matching any other such message in the set as a unique message;

    an exact duplicate marker to mark each such message matching at least one other such message in the set as an exact duplicate message; and

    a near duplicate marker to mark each such message comprising a subset of at least one other such message in the set as a near duplicate message.
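The claim elements above can be loosely illustrated with a short sketch. This is not the patented implementation. It assumes SHA-256 as the encoding that produces each sequence, treats sender and subject as the encoded metadata, groups messages by identical metadata sequences only (a simplification of "similar"), and reads "comprising a subset" as one normalized body being contained in another. The names `Message`, `digest`, `normalize`, and `categorize` are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    sender: str
    subject: str
    body: str
    label: str = ""


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivial differences are ignored."""
    return " ".join(text.split()).lower()


def digest(text: str) -> str:
    """Encode text as a fixed-length sequence (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def categorize(messages):
    """Label each message 'unique', 'exact duplicate', or 'near duplicate'."""
    # Digester: build a metadata sequence per message and group by it.
    sets = defaultdict(list)
    for m in messages:
        meta_seq = digest(normalize(m.sender) + "\n" + normalize(m.subject))
        sets[meta_seq].append(m)

    # Comparer: compare messages only within each set.
    for group in sets.values():
        bodies = [normalize(m.body) for m in group]
        content_seqs = [digest(b) for b in bodies]
        for i, m in enumerate(group):
            others = [j for j in range(len(group)) if j != i]
            if any(content_seqs[j] == content_seqs[i] for j in others):
                # Same content sequence as another message in the set.
                m.label = "exact duplicate"
            elif any(bodies[i] in bodies[j] for j in others):
                # Content is contained in a longer message in the set.
                m.label = "near duplicate"
            else:
                m.label = "unique"
    return messages
```

In this sketch a longer message that contains a shorter one is itself labeled unique unless its content sequence matches another message; only the contained message is labeled a near duplicate.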

  • 12 Assignments