Lightweight content filtering system for mobile phones
Abstract
In one embodiment, a content filtering system includes a feature list and a learning model. The feature list may be a subset of a dictionary that was used to train the content filtering system to identify classification (e.g., spam, phishing, porn, legitimate text messages, etc.) of text messages during a training stage. The learning model may include representative vectors, each of which represents a particular class of text messages. The learning model and the feature list may be generated in a server computer during the training stage and then subsequently provided to the mobile phone. An incoming text message in the mobile phone may be parsed for occurrences of feature words included in the feature list and then converted to an input vector. The input vector may be compared to the learning model to determine the classification of the incoming text message.
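The on-device flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `FEATURE_LIST`, `LEARNING_MODEL`, `to_input_vector`, and `classify` are hypothetical, the weights and vectors are made-up toy values, whitespace tokenization stands in for whatever parsing the phone performs, and cosine similarity is assumed as the comparison because the abstract does not name a specific measure.

```python
import math

# Hypothetical artifacts that a server would produce during training.
# Feature list: feature word -> pre-computed statistical value (toy weights).
FEATURE_LIST = {"free": 2.0, "winner": 3.0, "meeting": 1.5, "lunch": 1.0}

# Learning model: one representative vector per class, in feature-list order.
LEARNING_MODEL = {
    "spam": [1.0, 1.0, 0.0, 0.0],
    "legitimate": [0.0, 0.0, 1.0, 1.0],
}


def to_input_vector(message, feature_list):
    """Count feature-word occurrences and scale each count by its weight."""
    tokens = message.lower().split()  # assumption: whitespace tokenization
    return [tokens.count(word) * weight for word, weight in feature_list.items()]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(message):
    """Return the best-matching class, or None if no feature word occurs."""
    vector = to_input_vector(message, FEATURE_LIST)
    scores = {
        label: cosine_similarity(vector, representative)
        for label, representative in LEARNING_MODEL.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.0 else None
```

Only the small feature list and the per-class vectors need to reach the phone in this sketch; the full dictionary and the training messages stay on the server.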
57 Citations
16 Claims
1. A method of determining a classification of an incoming text message in a mobile phone, the method comprising:
- loading a feature list into a main memory of the mobile phone, the feature list being a subset of a dictionary used to parse training text messages that were used to train a content filtering system to classify text messages in a computer separate from the mobile phone;
- parsing the incoming text message in the mobile phone for occurrence of feature words included in the feature list;
- converting the incoming text message into an input vector in the mobile phone, the input vector comprising numeric representations of feature words identified as occurring in the incoming text message, the feature words identified as occurring in the incoming text message having associated statistical values that are used as a parameter in converting the feature words into the input vector in the mobile phone, the statistical values being pre-computed in a computer separate from the mobile phone;
- unloading the feature list from the main memory of the mobile phone;
- loading a learning model into the main memory of the mobile phone, the learning model being generated during training of the content filtering system;
- determining a classification of the incoming text message in the mobile phone by determining a similarity of the input vector to a first representative vector of the learning model.
Dependent claims: 2, 3, 4, 5, 6, 7
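Claim 1 orders its steps so that the feature list and the learning model need not occupy main memory at the same time. The sketch below illustrates that staging under stated assumptions: `classify_staged` is a hypothetical name, both artifacts are assumed to be stored as JSON files (the claim does not specify a storage format), and cosine similarity is assumed as the similarity measure.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_staged(message, feature_path, model_path):
    """Classify a message while holding only one artifact in memory at a time."""
    # Stage 1: load the feature list (word -> pre-computed weight) and
    # convert the message into an input vector.
    with open(feature_path) as f:
        feature_list = json.load(f)
    tokens = message.lower().split()  # assumption: whitespace tokenization
    input_vector = [
        tokens.count(word) * weight for word, weight in feature_list.items()
    ]
    # Drop the only reference to the feature list so its memory can be
    # reclaimed before the learning model is loaded.
    del feature_list

    # Stage 2: load the learning model (label -> representative vector)
    # and pick the most similar class.
    with open(model_path) as f:
        learning_model = json.load(f)
    best_label, best_score = None, 0.0
    for label, representative in learning_model.items():
        score = cosine_similarity(input_vector, representative)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In this sketch the input vector is the only data carried from the first stage to the second, which is what makes it possible to release the feature list before the model is loaded.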
8. A content filtering system comprising:
- a server computer comprising
  - a dictionary that comprises a plurality of dictionary words,
  - a word segmentation module configured to detect occurrences of dictionary words in a plurality of training text messages,
  - a feature selection module configured to select feature words from among dictionary words occurring in the plurality of training text messages and to generate a feature list that is a subset of the dictionary and comprising the feature words,
  - a first conversion module configured to convert the plurality of training text messages into vectors having numeric representations of selected words as items, and
  - a training module configured [to] generate a learning model; and
- a mobile phone configured to receive the feature list and the learning model, the mobile phone comprising
  - a second conversion module configured to detect occurrences of feature words from the feature list in an incoming text message and to convert the incoming text message into an input vector having numeric representations of feature words occurring in the incoming text message as items, the feature words occurring in the incoming text message having associated statistical values that are used as a parameter in converting the incoming text message into the input vector, the statistical values being pre-computed prior to being received in the mobile phone, and
  - a prediction module configured to determine a classification of the incoming text message based on a comparison of the input vector to the learning model.
Dependent claims: 9, 10, 11, 12, 13
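The server-side modules recited in claim 8 can be pictured with the rough sketch below. The claim does not say how features are selected, which statistical value is used, or how the learning model is trained, so every such choice here is an assumption made for illustration: `segment` and `train` are hypothetical names, features are picked by document frequency, the statistical value is an IDF-style weight, and the learning model is the per-class mean of the training vectors.

```python
import math
from collections import Counter, defaultdict


def segment(message, dictionary):
    """Word segmentation (assumed): keep whitespace tokens found in the dictionary."""
    return [t for t in message.lower().split() if t in dictionary]


def train(messages, dictionary, num_features):
    """Build a feature list and a learning model from (text, label) pairs."""
    # Count, for each dictionary word, how many training messages contain it.
    doc_freq = Counter()
    for text, _ in messages:
        doc_freq.update(set(segment(text, dictionary)))

    # Feature selection (assumed): keep the words seen in the most messages.
    feature_words = [w for w, _ in doc_freq.most_common(num_features)]

    # Statistical value (assumed): an IDF-style weight, computed once here
    # so that the phone never has to compute it.
    n = len(messages)
    feature_list = {w: math.log(n / doc_freq[w]) + 1.0 for w in feature_words}

    # Conversion and training (assumed): average the weighted count vectors
    # of each class to obtain one representative vector per class.
    sums = defaultdict(lambda: [0.0] * len(feature_words))
    counts = Counter()
    for text, label in messages:
        tokens = segment(text, dictionary)
        vector = [tokens.count(w) * feature_list[w] for w in feature_words]
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    learning_model = {
        label: [s / counts[label] for s in total] for label, total in sums.items()
    }
    return feature_list, learning_model
```

The two return values correspond to the artifacts the claim has the mobile phone receive: the feature list with its pre-computed values, and the learning model.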
14. A method of determining a classification of a text message, the method comprising:
- wirelessly receiving an incoming text message in a resource limited device;
- parsing the incoming text message in the resource limited device for occurrence of words included in a listing of words, the listing of words being a subset of a dictionary used to generate a learning model in a computer separate from the resource limited device, the listing of words but not the dictionary being available in the resource limited device;
- converting the incoming text message into an input vector in the resource limited device, the input vector comprising numeric representations of words occurring in the incoming text message, the words occurring in the incoming message having associated statistical values that are used as a parameter in converting the incoming text message into the input vector in the resource limited device; and
- determining a classification of the incoming text message in the resource limited device by comparing the input vector to the learning model.
Dependent claims: 15, 16
Specification