SPAM FILTERING BASED ON STATISTICS AND TOKEN FREQUENCY MODELING

  • US 20100145900A1
  • Filed: 12/04/2008
  • Published: 06/10/2010
  • Est. Priority Date: 12/04/2008
  • Status: Active Grant
First Claim

1. A network device, comprising:

  • a transceiver to send and receive data over a network; and

    a processor that is operative to perform actions, comprising:

    receiving a message;

    determining a plurality of tokens from the received message based in part on a text body of the received message;

    analyzing the plurality of tokens to assign probability values that the received message is classifiable as one of a plurality of message classes, including a spam message and a non-spam message;

    selecting a message class for the received message based on a comparison of the assigned probability values, wherein a probability value is associated with each of the plurality of message classes;

    providing the message class selected, a list of tokens with associated token frequencies, and the plurality of tokens to a token frequency component that is configured for the selected message class, wherein the list of tokens are determined for the message class selected; and

    using the token frequency component to determine a number of tokens in the plurality of tokens that result in an associated token frequency for each matching token in the list of tokens exceeding a token frequency threshold; and

    based on a comparison between the number of tokens exceeding the token frequency threshold to a matched token threshold, identifying the received message as a spam message or a non-spam message.
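
The actions recited in claim 1 can be pictured as a two-stage pipeline: a probabilistic classifier picks a candidate class, then a token frequency check for that class confirms or rejects it. The following is only a minimal sketch under stated assumptions, not the patented implementation. The claim does not say how the probability values are computed, so a naive Bayes style log-score is assumed here; the tokenizer, every function and variable name, the sample data, and the rule that a failed frequency check falls back to the other class are all hypothetical.

```python
import math
import re


def tokenize(text_body):
    """Assumed tokenizer: lowercase alphanumeric words from the text body."""
    return re.findall(r"[a-z0-9]+", text_body.lower())


def class_scores(tokens, likelihoods, priors, floor=1e-6):
    """Assumed naive Bayes scoring (the claim does not name a model).

    Returns one log-score per message class; comparing log-scores gives
    the same ordering as comparing the underlying probability values.
    """
    scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)
        for tok in tokens:
            # Unseen tokens get a small floor likelihood (assumption).
            score += math.log(likelihoods[cls].get(tok, floor))
        scores[cls] = score
    return scores


def count_frequent_matches(tokens, token_frequencies, frequency_threshold):
    """Count distinct message tokens that appear in the class's token list
    with an associated frequency above the token frequency threshold."""
    return sum(
        1 for tok in set(tokens)
        if token_frequencies.get(tok, 0) > frequency_threshold
    )


def classify(text_body, likelihoods, priors, frequency_lists,
             frequency_threshold, matched_token_threshold):
    tokens = tokenize(text_body)
    scores = class_scores(tokens, likelihoods, priors)
    selected = max(scores, key=scores.get)  # class with the highest score
    matches = count_frequent_matches(
        tokens, frequency_lists[selected], frequency_threshold
    )
    if matches > matched_token_threshold:
        return selected
    # Assumption: too few frequent matches overturns the selected class.
    return "non-spam" if selected == "spam" else "spam"


# Hypothetical training data, for illustration only.
PRIORS = {"spam": 0.5, "non-spam": 0.5}
LIKELIHOODS = {
    "spam": {"free": 0.3, "prize": 0.2, "meeting": 0.01},
    "non-spam": {"free": 0.02, "prize": 0.01, "meeting": 0.3},
}
FREQUENCY_LISTS = {
    "spam": {"free": 40, "prize": 25, "claim": 3},
    "non-spam": {"meeting": 30, "agenda": 12},
}

if __name__ == "__main__":
    print(classify("Claim your FREE prize now", LIKELIHOODS, PRIORS,
                   FREQUENCY_LISTS, frequency_threshold=10,
                   matched_token_threshold=1))
```

In this sketch the frequency check acts as a second opinion on the classifier: a message that scores as spam but contains only one frequently seen spam token would not pass the matched token threshold of 1. Whether the claimed device resolves such a case by switching class, as assumed here, is not stated in the claim.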

  • 6 Assignments