Lightweight SVM-based content filtering system for mobile phones

US 8,023,974 B1
Filed: 02/15/2007
Issued: 09/20/2011
Est. Priority Date: 02/15/2007
Status: Expired due to Fees

Abstract

In one embodiment, a content filtering system generates a support vector machine (SVM) learning model in a server computer and provides the SVM learning model to a mobile phone for use in classifying text messages. The SVM learning model may be generated in the server computer by training a support vector machine with sample text messages that include spam and legitimate text messages. A resulting intermediate SVM learning model from the support vector machine may include a threshold value, support vectors and alpha values. The SVM learning model in the mobile phone may include the threshold value, the features, and the weights of the features. An incoming text message may be parsed for the features. The weights of features found in the incoming text message may be added and compared to the threshold value to determine whether or not the incoming text message is spam.
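
The phone-side check described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the names `MODEL`, `extract_features`, `total_score`, and `is_spam`, the whitespace tokenizer, and every weight and the threshold are hypothetical values chosen for the example.

```python
# Hypothetical compact model: a threshold plus one weight per feature.
# No support vectors or alpha values are stored on the phone.
MODEL = {
    "threshold": 1.0,
    "weights": {
        "free": 0.9,
        "winner": 0.8,
        "prize": 0.7,
        "meeting": -0.6,
        "tomorrow": -0.4,
    },
}


def extract_features(message, weights):
    """Return the distinct model features found in the message.

    Assumed tokenization: lowercase, split on whitespace, trim punctuation.
    """
    tokens = {token.strip(".,!?:;") for token in message.lower().split()}
    return weights.keys() & tokens


def total_score(message, model):
    """Add the weights of all features found in the message."""
    found = extract_features(message, model["weights"])
    return sum(model["weights"][feature] for feature in found)


def is_spam(message, model):
    """Deem the message spam if its total score exceeds the threshold."""
    return total_score(message, model) > model["threshold"]
```

With these made-up weights, `is_spam("You are a WINNER! Claim your free prize", MODEL)` is true and `is_spam("Meeting tomorrow at noon?", MODEL)` is false.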

14 Claims

1. A method of classifying text messages in a mobile phone, the method comprising:
- training a support vector machine using a plurality of sample spam text messages and a plurality of sample legitimate text messages in a server computer separate from the mobile phone during a training stage to generate an intermediate support vector machine learning model that includes a threshold value and support vectors;
  
  deriving the support vector machine (SVM) learning model from the intermediate support vector machine learning model by storing in the SVM learning model the threshold value but not the support vectors from the intermediate support vector machine learning model, a feature set, and score values comprising weights assigned to features in the feature set;
  
  providing the SVM learning model in the mobile phone;
  
  extracting features from a text message in the mobile phone to generate extracted features;
  
  retrieving from the SVM learning model a corresponding score value for each of the extracted features;
  
  adding score values of the extracted features to generate a total score; and
  
  comparing the total score to the threshold value to determine whether or not the text message is a spam text message.
- Dependent claims (2, 3, 4)
  - 2. The method of claim 1 wherein the SVM learning model is generated in the server computer during the training stage and wirelessly provided to the mobile phone.
  - 3. The method of claim 1 wherein the extracted features are not converted to vectors in the mobile phone.
  - 4. The method of claim 1 wherein the text message comprises a Short Message Service (SMS) text message.
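
Claim 3 says the extracted features are not converted to vectors in the mobile phone. A sketch of why nothing is lost by skipping that step, assuming binary (present or absent) features, which the claims do not state explicitly: the dot product of a 0/1 feature vector with the weight vector equals the sum of the weights of the features that are present. The feature names, weights, and function names below are hypothetical.

```python
FEATURES = ["free", "winner", "prize", "meeting"]  # hypothetical feature set
WEIGHTS = [0.9, 0.8, 0.7, -0.6]                    # hypothetical score values, same order


def score_with_vector(found):
    """Vector-style scoring: build a 0/1 vector over all features, then take a dot product."""
    vector = [1 if feature in found else 0 for feature in FEATURES]
    return sum(x * w for x, w in zip(vector, WEIGHTS))


def score_with_lookup(found):
    """Lookup-style scoring: add only the weights of the features that were found."""
    table = dict(zip(FEATURES, WEIGHTS))
    return sum(table[feature] for feature in FEATURES if feature in found)


found = {"free", "prize"}
# Both routes give the same total score for the same set of found features.
assert abs(score_with_vector(found) - score_with_lookup(found)) < 1e-12
```

The vector route does work proportional to the size of the whole feature set, while a lookup keyed on the message's own tokens would only touch the features actually present, which is the kind of saving a memory- and CPU-constrained handset benefits from.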

5. A mobile phone comprising a memory, a processor configured to run computer-readable program code in the memory, and a file system, the file system comprising:
- a support vector machine (SVM) learning model comprising a threshold value, a feature set, and score values for features in the feature set, the SVM learning model being derived from an intermediate SVM learning model generated in a computer external to the mobile phone by training a support vector machine using a plurality of sample spam text messages and a plurality of sample legitimate text messages, the score values comprising weight values assigned to features in the feature set;
  
  a parser configured to parse a text message in the mobile phone for features noted in the SVM learning model; and
  
  an anti-spam engine configured to determine whether or not the text message is a spam text message based on weights of features noted in the SVM learning model and found in the text message without converting the text message to a vector in the mobile phone.
- Dependent claims (6, 7, 8)
  - 6. The mobile phone of claim 5 wherein the anti-spam engine determines whether or not the text message is a spam text message by retrieving from the SVM learning model a weight of each feature found in the text message, adding the weights of all features found in the text message to generate a total score, and comparing the total score to the threshold.
  - 7. The mobile phone of claim 6 wherein the anti-spam engine is configured to deem the text message as a spam text message if the total score exceeds the threshold.
  - 8. The mobile phone of claim 5 wherein the parser is configured to parse the text message by extracting from the text message words and/or phrases having corresponding weights in the SVM learning model.
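
Claim 8 has the parser extract words and/or phrases that carry weights in the model. One way this could look, as a sketch under stated assumptions: phrases are treated as runs of up to `max_words` consecutive lowercase tokens, and the function name `parse_features`, the regular-expression tokenizer, and the sample weights are all hypothetical rather than taken from the patent.

```python
import re


def parse_features(message, weights, max_words=3):
    """Return the set of model features (words or phrases) that occur in the message."""
    tokens = re.findall(r"[a-z0-9']+", message.lower())
    found = set()
    for size in range(1, max_words + 1):
        for start in range(len(tokens) - size + 1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in weights:  # keep only features noted in the model
                found.add(candidate)
    return found


# Hypothetical model weights mixing single words and two-word phrases.
weights = {"free": 0.9, "click here": 1.1, "call now": 0.8, "lunch": -0.5}
print(sorted(parse_features("FREE ringtones, click here and call now!", weights)))
# prints ['call now', 'click here', 'free']
```

Tokens with no entry in the model, such as "ringtones" here, are simply ignored, so the parser's output feeds directly into the weight lookup and sum of claim 6.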

9. A method of classifying text messages wirelessly received in a mobile phone, the method comprising:
- in a server computer, training a support vector machine using a plurality of sample text messages comprising sample spam text messages and sample legitimate text messages to generate a first support vector machine (SVM) learning model, the first SVM learning model comprising a threshold value, a feature set, and score values for features in the feature set;
  
  providing the first SVM learning model to a mobile phone; and
  
  using the first SVM learning model in the mobile phone to classify a text message in the mobile phone without converting the text message to a vector in the mobile phone.
- Dependent claims (10, 11, 12, 13, 14)
  - 10. The method of claim 9 wherein the first SVM learning model in the mobile phone does not include support vectors generated during the training of the support vector machine in the server computer.
  - 11. The method of claim 9 wherein training the support vector machine in the server computer comprises:
    - providing a dictionary and a stop list;
      
      parsing the plurality of sample text messages to identify words in the plurality of sample text messages included in the dictionary to generate a feature list;
      
      removing from the feature list words included in the stop list to generate a revised feature list;
      
      converting the plurality of sample text messages to feature vectors having features corresponding to words included in the revised feature list; and
      
      using the feature vectors to train the support vector machine to generate a second support vector machine (SVM) learning model, the support vector machine employing a linear kernel function, the second SVM learning model including a threshold value, a plurality of support vectors and a set of alpha values; and
      
      deriving the first SVM learning model from the second SVM learning model by including in the first SVM learning model the threshold value from the second SVM learning model, the revised feature list, and score values of features in the revised feature list computed by combining the alpha values and the support vectors in the second SVM learning model.
  - 12. The method of claim 9 wherein using the first SVM learning model in the mobile phone to classify the text message comprises:
    - parsing the text message to extract features from the text message, the extracted features being identified in the first SVM learning model;
      
      consulting the first SVM learning model for score values of the extracted features; and
      
      comparing the score values to the threshold included in the first SVM learning model.
  - 13. The method of claim 12 wherein comparing the score values comprises:
    - adding the score values of the extracted features to generate a total score; and
      
      deeming the text message as a spam text message if the total score exceeds the threshold value.
  - 14. The method of claim 9 wherein the text message comprises an SMS text message.
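
Claim 11's server-side steps end by combining the alpha values and support vectors into per-feature score values. For a linear kernel, a common way to do this is w_j = Σ_i α_i y_i s_ij, where s_i are the support vectors and y_i = ±1 are their labels; the claim itself only says the values are "combined", so the exact formula used here is an assumption. The sketch below also shows the dictionary and stop-list filtering. SVM training itself is omitted: `signed_alphas` (standing for α_i y_i), the support vectors, and the threshold are hypothetical stand-ins for a trainer's output, and all function names are illustrative.

```python
def build_feature_list(samples, dictionary, stop_list):
    """Collect dictionary words seen in the samples, then drop stop-list words."""
    seen = set()
    for text in samples:
        seen.update(word for word in text.lower().split() if word in dictionary)
    return sorted(seen - stop_list)


def to_feature_vector(text, feature_list):
    """Binary vector: 1 if the feature word occurs in the text, else 0 (an assumed encoding)."""
    words = set(text.lower().split())
    return [1 if feature in words else 0 for feature in feature_list]


def derive_compact_model(feature_list, support_vectors, signed_alphas, threshold):
    """Fold alpha values and support vectors into one weight per feature."""
    weights = {}
    for j, feature in enumerate(feature_list):
        weights[feature] = sum(a * sv[j] for a, sv in zip(signed_alphas, support_vectors))
    # The compact model keeps the threshold, features, and weights, but no support vectors.
    return {"threshold": threshold, "weights": weights}


samples = ["win a free prize", "the meeting is at noon", "free prize inside"]
dictionary = {"win", "free", "prize", "meeting", "noon", "the", "a", "is", "at", "inside"}
stop_list = {"the", "a", "is", "at"}

features = build_feature_list(samples, dictionary, stop_list)
vectors = [to_feature_vector(text, features) for text in samples]

# Hypothetical trainer output: every sample is treated as a support vector here.
model = derive_compact_model(features, vectors, signed_alphas=[0.5, -1.0, 0.5], threshold=0.0)
```

The resulting `model` holds one number per feature plus the threshold, so its size depends on the feature list rather than on how many support vectors the trainer produced.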

Current Assignee: Trend Micro Inc.
Original Assignee: Trend Micro Inc.
Inventors: Diao, Lili; Lu, Patrick MG; Chan, Vincent
Primary Examiner(s): Corsaro, Nick
Assistant Examiner(s): Johnson, Gerald
Application Number: US11/706,539
Time in Patent Office: 1,678 Days
Field of Search: 709/206
US Class Current: 455/466
CPC Class Codes:
G06F 18/2411   based on the proximity to a...
H04L 51/212   using filtering or selectiv...
H04W 12/128   Anti-malware arrangements, ...