Location-based analytic platform and methods
First Claim
1. A method of learning an audience member function, the method comprising:
- obtaining, with one or more processors, a training set of geographic data describing geolocation histories of a plurality of mobile devices, wherein members of the training set are classified according to whether the respective member of the training set is a member of an audience, and wherein each geolocation history corresponds to a different user or computing device selected from among a set of more than 100,000 geolocation histories; and
at least some geolocation histories each comprise a respective plurality of timestamped geolocations collected over more than a week;
retrieving, with one or more processors, attributes of geolocations in the geolocation histories from a geographic information system, wherein, for at least some geolocations in the geolocation histories, a plurality of attributes are retrieved for respective geolocations, the attributes each indicating a propensity of users to exhibit a different respective behavior described by the respective attribute in a respective geolocation;
learning, with one or more processors, feature functions of an audience member function based on the training set, wherein at least some of the feature functions are a function of the retrieved attributes of geolocation, wherein the feature functions are learned, at least in part, by calculating a plurality of impurity measures for candidate feature functions and selecting one of the candidate feature functions based on the relative values of the impurity measures, and wherein:
the audience member function is configured to output a score indicative of a probability that a given user is in, or classification of the given user in, the audience;
the audience member function is configured to output the score based on a given input vector, corresponding to the given user;
the given input vector is based on a given geolocation history of the user and has a plurality of dimensions, at least some of the plurality of dimensions being based on at least some of the plurality of attributes; and
the feature functions are learned, at least in part, by performing steps comprising:
selecting a subset of the training set that has a selected dimension larger than a threshold value;
for each of a plurality of other dimensions, and for each of a plurality of values of each of the plurality of other dimensions, calculating an impurity measure corresponding to respective value in the respective other dimension; and
selecting another dimension and a value based on the smallest impurity measure among the calculated impurity measures; and
storing, with one or more processors, the feature functions of the audience member function in an audience repository.
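The sub-steps at the end of the claim (restrict the training set by a threshold on one dimension, score candidate values in the other dimensions, keep the smallest impurity) can be sketched as follows. This is a minimal illustration, not the patented implementation: the claim does not name a specific impurity measure, so Gini impurity is assumed here, and the function names `gini`, `weighted_impurity`, and `best_next_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of audience labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_impurity(rows, labels, dim, value):
    """Size-weighted Gini impurity after splitting on rows[i][dim] > value."""
    low = [y for x, y in zip(rows, labels) if x[dim] <= value]
    high = [y for x, y in zip(rows, labels) if x[dim] > value]
    return (len(low) * gini(low) + len(high) * gini(high)) / len(labels)


def best_next_split(rows, labels, selected_dim, threshold):
    """Return (impurity, dimension, value) with the smallest impurity.

    Only training members whose selected_dim exceeds threshold are kept,
    and only dimensions other than selected_dim are searched.
    Returns None when no member passes the threshold.
    """
    subset = [(x, y) for x, y in zip(rows, labels) if x[selected_dim] > threshold]
    if not subset:
        return None
    sub_rows = [x for x, _ in subset]
    sub_labels = [y for _, y in subset]
    best = None
    for dim in range(len(sub_rows[0])):
        if dim == selected_dim:
            continue
        # Candidate values are the distinct values observed in this dimension.
        for value in sorted({x[dim] for x in sub_rows}):
            score = weighted_impurity(sub_rows, sub_labels, dim, value)
            if best is None or score < best[0]:
                best = (score, dim, value)
    return best
```

Here each row stands for one input vector and each label for the audience classification of that training-set member. Ties are broken in favor of the first candidate found, which is an arbitrary choice for this sketch.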
Abstract
Provided is a process of learning an audience member function, the process including: obtaining a training set of geographic data describing geolocation histories of a plurality of mobile devices, wherein members of the training set are classified according to whether the respective member of the training set is a member of an audience; retrieving attributes of geolocations in the geolocation histories from a geographic information system; learning feature functions of an audience member function based on the training set, wherein at least some of the feature functions are a function of the retrieved attributes of geolocation, wherein the feature functions are learned, at least in part, by calculating a plurality of impurity measures for candidate feature functions and selecting one of the candidate feature functions based on the relative values of the impurity measures; and storing the feature functions of the audience member function in an audience repository.
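As a rough illustration of how retrieved geolocation attributes might feed a per-user input vector, the sketch below averages each attribute over the places in one geolocation history. Everything concrete here is an assumption for illustration: the in-memory `GIS_ATTRIBUTES` table stands in for a geographic information system, the tile identifiers and attribute names are invented, and averaging is only one plausible way to aggregate attribute scores.

```python
from collections import defaultdict

# Hypothetical stand-in for a geographic information system: maps a tile id
# to attribute scores in [0, 1]. Names and values are illustrative only.
GIS_ATTRIBUTES = {
    "tile_a": {"coffee": 0.9, "golf": 0.1},
    "tile_b": {"coffee": 0.2, "golf": 0.8},
}


def input_vector(history, dimensions, gis=GIS_ATTRIBUTES):
    """Average each attribute over the timestamped geolocations in a history.

    history: list of (timestamp, tile_id) pairs; tiles missing from the GIS
    are skipped. Returns one float per name in dimensions.
    """
    totals = defaultdict(float)
    matched = 0
    for _timestamp, tile_id in history:
        attrs = gis.get(tile_id)
        if attrs is None:
            continue
        matched += 1
        for name in dimensions:
            totals[name] += attrs.get(name, 0.0)
    if matched == 0:
        return [0.0 for _ in dimensions]
    return [totals[name] / matched for name in dimensions]
```

Vectors built this way, one per classified training-set member, are the kind of input an impurity-based learner could then split on.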
34 Citations
24 Claims
1. A method of learning an audience member function, the method comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A tangible, non-transitory, machine-readable medium storing instructions that when executed by one or more processors effectuate operations comprising:
obtaining, with one or more processors, a training set of geographic data describing geolocation histories of a plurality of mobile devices, wherein members of the training set are classified according to whether the respective member of the training set is a member of an audience, and wherein each geolocation history corresponds to a different user or computing device selected from among a set of more than 100,000 geolocation histories; and
at least some geolocation histories each comprise a respective plurality of timestamped geolocations collected over more than a week;
retrieving, with one or more processors, attributes of geolocations in the geolocation histories from a geographic information system, wherein, for at least some geolocations in the geolocation histories, a plurality of attributes are retrieved for respective geolocations, the attributes each indicating a propensity of users to exhibit a different respective behavior described by the respective attribute in a respective geolocation;
learning, with one or more processors, feature functions of an audience member function based on the training set, wherein at least some of the feature functions are a function of the retrieved attributes of geolocation, wherein the feature functions are learned, at least in part, by calculating a plurality of impurity measures for candidate feature functions and selecting one of the candidate feature functions based on the relative values of the impurity measures, and wherein:
the audience member function is configured to output a score indicative of a probability that a given user is in, or classification of the given user in, the audience;
the audience member function is configured to output the score based on a given input vector, corresponding to the given user;
the given input vector is based on a given geolocation history of the user and has a plurality of dimensions, at least some of the plurality of dimensions being based on at least some of the plurality of attributes; and
the feature functions are learned, at least in part, by performing steps comprising:
selecting a subset of the training set that has a selected dimension larger than a threshold value;
for each of a plurality of other dimensions, and for each of a plurality of values of each of the plurality of other dimensions, calculating an impurity measure corresponding to respective value in the respective other dimension; and
selecting another dimension and a value based on the smallest impurity measure among the calculated impurity measures; and
storing, with one or more processors, the feature functions of the audience member function in an audience repository.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24
Specification