Method for inferring behavioral characteristics based on a large volume of data
Abstract
A method provides for mining information from large volumes of data regarding transactions. The method provides for inferring a behavioral characteristic of a party to the transaction based on a large volume of data concerning a multitude of parties. That inferred characteristic may be dynamic in nature.
14 Claims
-
1. A method for inferring a behavioral characteristic of an entity from a large volume of multi-entity transaction data, comprising:
extracting N ordered pairs from a telephone number database, the N ordered pairs having a telephone number and a business status value indicating whether the telephone number belongs to a business;
storing transaction data including a plurality of call detail records, each of said records having an originating telephone number, a dialed telephone number, a connect time and a duration;
extracting a first sequence of transactions corresponding to the N ordered pair telephone numbers from the transaction data;
identifying a plurality of features indicative of the business status value within the first sequence of transactions;
building a model to predict the business status value from the features;
extracting a second sequence of transactions corresponding to a telephone number of an entity from the transaction data;
predicting a business status value for the entity using the model and the second sequence of transactions; and
inferring whether the entity is a business from the predicted business status value.
-
2. The method of claim 1, further comprising:
analyzing a subsequent set of call detail records using the model;
determining a revised probability that the entity is a business based on the analysis of said subsequent set of call detail records and an earlier determined probability.
-
3. The method of claim 2, wherein said building the model includes processing staging and call aggregation.
-
4. The method of claim 3, wherein said processing staging occurs over a 24 hour period.
-
5. The method of claim 1, further comprising:
forming a calling profile for each originating telephone number by binning the call detail records associated with the originating telephone number into a four-dimensional data array.
-
6. The method of claim 5, wherein said data array includes a day-of-week dimension, a time-of-day dimension, a duration dimension, and a status dimension.
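The binning recited in claims 5 and 6 can be illustrated with a minimal Python sketch. The claims name the four dimensions but not their sizes or bin edges, so the hour and duration edges, the two status labels, the record field names, and the helper names (`bin_index`, `empty_profile`, `build_profiles`) are all assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical bin layout; the claims do not specify sizes or edges.
N_DAYS = 7                              # day-of-week, Monday = 0
HOUR_EDGES = [0, 6, 12, 18]             # lower edges of four time-of-day bins (assumed)
DURATION_EDGES = [0, 60, 300, 1800]     # lower edges in seconds; last bin is open-ended (assumed)
STATUSES = ["answered", "unanswered"]   # assumed meaning of the status dimension


def bin_index(value, edges):
    """Return the index of the last lower edge that is <= value."""
    idx = 0
    for i, edge in enumerate(edges):
        if value >= edge:
            idx = i
    return idx


def empty_profile():
    """Build a zero-filled day x time-of-day x duration x status count array."""
    return [[[[0 for _ in STATUSES]
              for _ in DURATION_EDGES]
             for _ in HOUR_EDGES]
            for _ in range(N_DAYS)]


def build_profiles(records):
    """Bin call detail records into one four-dimensional array per originating number."""
    profiles = defaultdict(empty_profile)
    for rec in records:
        day = rec["connect_time"].weekday()
        hour_bin = bin_index(rec["connect_time"].hour, HOUR_EDGES)
        dur_bin = bin_index(rec["duration"], DURATION_EDGES)
        status = STATUSES.index(rec["status"])
        profiles[rec["originating"]][day][hour_bin][dur_bin][status] += 1
    return profiles
```

Each cell of the resulting array is a call count, so the flattened array could serve as a feature vector for the model of claim 1. That use is one possible reading, not something the claims state.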
-
7. The method of claim 1, wherein said model is formed using logistic regression techniques.
-
8. The method of claim 7, wherein said model is regularized using a ridge penalty.
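As an illustration of claims 7 and 8, the following is a minimal sketch of logistic regression with a ridge (L2) penalty, fitted by plain gradient descent in pure Python. The claims do not specify a fitting procedure, a penalty weight, or the feature set, so the function names, the hyperparameter defaults, and the two toy features in the usage note are assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, ridge=0.1, lr=0.5, epochs=2000):
    """Minimize mean log-loss + (ridge / 2) * ||w||^2 by gradient descent.

    The intercept b is left unpenalized, a common convention (assumed here).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * p
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(p):
            w[j] -= lr * (grad_w[j] / n + ridge * w[j])
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that the telephone number belongs to a business."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
```

A usage example with made-up training data, where each row holds two hypothetical features (fraction of calls placed during weekday daytime, fraction of calls under one minute) and the label is the business status value:

```python
X = [[0.90, 0.70], [0.80, 0.60], [0.85, 0.80],   # labeled businesses
     [0.20, 0.30], [0.30, 0.20], [0.10, 0.40]]   # labeled residences
y = [1, 1, 1, 0, 0, 0]
w, b = fit_ridge_logistic(X, y)
is_business = predict_proba(w, b, [0.90, 0.75]) > 0.5
```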
-
9. The method of claim 1, wherein said building the model occurs over an update period.
-
10. The method of claim 9, wherein said update period is one day.
-
11. The method of claim 2, wherein said determining a revised probability includes updating the earlier determined probability based on exponential weighting and an aging factor.
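The update recited in claims 2 and 11 can be sketched as an exponentially weighted average. The claims do not give a formula, so the form below, in which an aging factor between 0 and 1 controls how quickly the earlier probability is discounted, is only one plausible reading; `revise_probability` and `aging` are hypothetical names.

```python
def revise_probability(previous, batch_estimate, aging=0.9):
    """Blend the earlier probability with the estimate from a new batch of records.

    aging close to 1 keeps most of the history; aging close to 0 tracks
    the newest batch almost exclusively (assumed interpretation).
    """
    if not 0.0 <= aging <= 1.0:
        raise ValueError("aging must be between 0 and 1")
    return aging * previous + (1.0 - aging) * batch_estimate


def revise_over_batches(initial, batch_estimates, aging=0.9):
    """Apply the update once per batch, for example once per daily update period."""
    probability = initial
    history = []
    for estimate in batch_estimates:
        probability = revise_probability(probability, estimate, aging)
        history.append(probability)
    return history
```

Under this form, an estimate made k updates ago carries a weight proportional to `aging ** k`, so older call detail records fade gradually rather than being dropped at a hard cutoff.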
-
12. The method of claim 1, wherein said model is based on linear regression.
-
13. The method of claim 1, wherein said model is based on decision trees.
-
14. The method of claim 1, wherein said model is based on neural nets.
Specification