Generalized engine for predicting actions
First Claim
1. A prediction computer system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving, by the prediction computer system, a query requesting one or more predicted activities that are likely to be performed by users having a particular attribute;
determining, by the prediction computer system, a plurality of matching sessions from user session data of a plurality of users, each matching session of the plurality of matching sessions being a user session for a user having the particular attribute of the query, wherein each user session for each user of the plurality of users includes data representing one or more user activities performed by the user during a particular time period, wherein the user session data is partitioned into a plurality of shards, each shard of the plurality of shards being stored on one of multiple index servers of the prediction computer system, wherein the index servers include a root server and a plurality of leaf servers;
computing, by the prediction computer system for each activity of a plurality of activities represented in the matching sessions, a respective lower bound of a number of distinct users having data contributing to the activity in the matching sessions for the activity, wherein computing a lower bound for an activity comprises:
hashing, by each leaf server for each matching session, a portion of a respective user identifier associated with the matching session to generate one or more user key positions for the matching session,
generating, by each leaf server for each activity of one or more activities occurring in the matching sessions stored in the shard assigned to the leaf server, a respective merged user key for the activity including setting each position in the merged user key indicated by any user key position generated for matching sessions having the activity,
generating, by the root server for each activity of one or more activities occurring in the matching sessions, an overall merged user key for the activity including setting each position in the overall merged user key indicated by any set position in any merged user key received from the leaf servers for matching sessions having the activity, and
computing, by the root server for each activity of one or more activities occurring in the matching sessions, a count of positions that are set in the overall merged user key for the activity;
computing, by the prediction computer system, for each user activity of one or more activities having a lower bound that satisfies a threshold:
a respective first score representing a likelihood that the user activity occurs in the plurality of matching sessions;
a respective second score representing a likelihood that the user activity occurs in any of the user sessions of the user session data;
a respective third score that measures a relative magnitude of the first score compared to the second score;
designating, as the one or more predicted activities, one or more of the user activities having a third score that satisfies a threshold; and
providing, by the prediction computer system and based on the designation of the one or more user activities as the one or more predicted activities, data representing a respective predicted activity for each of the one or more predicted activities.
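The lower-bound computation recited in the claim (hash each user identifier to key positions, merge per activity on each leaf server, merge again on the root server, then count set positions) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the 64-position key width, the use of SHA-256, hashing the whole identifier rather than a portion of it, one position per user, and the function names. With one position per user, hash collisions can only merge users together, so the count never exceeds the number of distinct users.

```python
import hashlib

KEY_BITS = 64  # assumed width of a merged user key (hypothetical)


def user_key_position(user_id: str) -> int:
    """Hash a user identifier to a single key position (assumed: SHA-256, whole ID)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % KEY_BITS


def leaf_merged_keys(matching_sessions):
    """Leaf-server step: build one merged user key (a bitmask) per activity.

    matching_sessions is an iterable of (user_id, activities) pairs taken
    from the shard assigned to this leaf server.
    """
    merged = {}
    for user_id, activities in matching_sessions:
        mask = 1 << user_key_position(user_id)
        for activity in set(activities):
            merged[activity] = merged.get(activity, 0) | mask
    return merged


def root_lower_bounds(leaf_results):
    """Root-server step: OR the leaf keys per activity, then count set positions."""
    overall = {}
    for merged in leaf_results:
        for activity, mask in merged.items():
            overall[activity] = overall.get(activity, 0) | mask
    return {activity: bin(mask).count("1") for activity, mask in overall.items()}


# Two leaf servers; user "u1" has matching sessions on both shards.
leaf_a = leaf_merged_keys([("u1", ["search", "buy"]), ("u2", ["search"])])
leaf_b = leaf_merged_keys([("u1", ["buy"])])
bounds = root_lower_bounds([leaf_a, leaf_b])
```

Because the merge is a bitwise OR, the same user appearing on several shards sets the same position each time and is counted once, which is why the leaf servers only need to send the root server a fixed-size key per activity rather than lists of user identifiers.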
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicting actions based on large-scale aggregations of data. One of the methods includes obtaining user activity data organized into sessions, the user activity data representing user activities, each session including one or more user activities for a particular user, the sessions including sessions for multiple users; receiving a session query, the session query including a query term representing a query activity; identifying matching sessions, the matching sessions each satisfying the session query; identifying likely activities in the matching sessions, likely activities being activities found in the matching sessions that satisfy the session query and occur in the matching sessions more frequently than in sessions in general; and identifying one or more of the likely activities in a response to the session query.
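The abstract's notion of a likely activity (one that occurs in the matching sessions more frequently than in sessions in general) can be sketched as a comparison of two frequencies. This is a minimal illustration under stated assumptions, not the patented implementation: sessions are modeled as plain lists of activity names, the comparison is taken to be a simple ratio, and the `min_ratio` cutoff of 2.0 and both function names are hypothetical.

```python
from collections import Counter


def session_frequency(sessions):
    """Fraction of sessions in which each activity appears at least once."""
    counts = Counter()
    for activities in sessions:
        counts.update(set(activities))
    total = len(sessions)
    return {a: c / total for a, c in counts.items()} if total else {}


def likely_activities(matching_sessions, all_sessions, min_ratio=2.0):
    """Return (activity, ratio) pairs whose ratio meets the assumed cutoff."""
    first = session_frequency(matching_sessions)   # frequency in matching sessions
    second = session_frequency(all_sessions)       # frequency in sessions in general
    results = []
    for activity, p_match in first.items():
        p_all = second.get(activity)
        if not p_all:
            continue  # activity absent from the general data; ratio undefined
        ratio = p_match / p_all                    # assumed form of the comparison
        if ratio >= min_ratio:
            results.append((activity, ratio))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


all_sessions = [["search", "buy"], ["search"], ["search"], ["browse"]]
matching = [["search", "buy"]]
predicted = likely_activities(matching, all_sessions)
```

In this toy data "search" is common everywhere, so its ratio stays near 1 and it is filtered out, while "buy" is rare in general but present in every matching session and is kept.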
18 Claims
1. A prediction computer system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations as set out in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6
7. A computer-implemented method comprising:
receiving, by a prediction computer system, a query requesting predicted activities that are likely to be performed by users having a particular attribute;
determining, by the prediction computer system, a plurality of matching sessions from user session data of a plurality of users, each matching session of the plurality of matching sessions being a user session for a user having the particular attribute of the query, wherein each user session for each user of the plurality of users includes data representing one or more user activities performed by the user during a particular time period, wherein the user session data is partitioned into a plurality of shards, each shard of the plurality of shards being stored on one of multiple index servers of the prediction computer system, wherein the index servers include a root server and a plurality of leaf servers;
computing, by the prediction computer system for each activity of a plurality of activities represented in the matching sessions, a respective lower bound of a number of distinct users having data contributing to the activity in the matching sessions for the activity, wherein computing a lower bound for an activity comprises:
hashing, by each leaf server for each matching session, a portion of a respective user identifier associated with the matching session to generate one or more user key positions for the matching session,
generating, by each leaf server for each activity of one or more activities occurring in the matching sessions stored in the shard assigned to the leaf server, a respective merged user key for the activity including setting each position in the merged user key indicated by any user key position generated for matching sessions having the activity,
generating, by the root server for each activity of one or more activities occurring in the matching sessions, an overall merged user key for the activity including setting each position in the overall merged user key indicated by any set position in any merged user key received from the leaf servers for matching sessions having the activity, and
computing by the root server for each activity of one or more activities occurring in the matching sessions, a count of positions that are set in the overall merged user key for the activity;
computing, by the prediction computer system, for each user activity of one or more activities having a lower bound that satisfies a threshold:
a respective first score representing a likelihood that the user activity occurs in the plurality of matching sessions;
a respective second score representing a likelihood that the user activity occurs in any of the user sessions of the user session data;
a respective third score that measures a relative magnitude of the first score compared to the second score;
designating, as the one or more predicted activities, one or more of the user activities having a third score that satisfies a threshold; and
providing, by the prediction computer system and based on the designation of the one or more user activities as the one or more predicted activities, data representing a respective predicted activity for each of the one or more predicted activities.
Dependent claims: 8, 9, 10, 11, 12
13. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by one or more computers of a prediction computer system cause the one or more computers to perform operations comprising:
receiving, by the prediction computer system, a query requesting predicted activities that are likely to be performed by users having a particular attribute;
determining, by the prediction computer system, a plurality of matching sessions from user session data of a plurality of users, each matching session of the plurality of matching sessions being a user session for a user having the particular attribute of the query, wherein each user session for each user of the plurality of users includes data representing one or more user activities performed by the user during a particular time period, wherein the user session data is partitioned into a plurality of shards, each shard of the plurality of shards being stored on one of multiple index servers of the prediction computer system, wherein the index servers include a root server and a plurality of leaf servers;
computing, by the prediction computer system for each activity of a plurality of activities represented in the matching sessions, a respective lower bound of a number of distinct users having data contributing to the activity in the matching sessions for the activity, wherein computing a lower bound for an activity comprises:
hashing, by each leaf server for each matching session, a portion of a respective user identifier associated with the matching session to generate one or more user key positions for the matching session,
generating, by each leaf server for each activity of one or more activities occurring in the matching sessions stored in the shard assigned to the leaf server, a respective merged user key for the activity including setting each position in the merged user key indicated by any user key position generated for matching sessions having the activity,
generating, by the root server for each activity of one or more activities occurring in the matching sessions, an overall merged user key for the activity including setting each position in the overall merged user key indicated by any set position in any merged user key received from the leaf servers for matching sessions having the activity, and
computing by the root server for each activity of one or more activities occurring in the matching sessions, a count of positions that are set in the overall merged user key for the activity;
computing, by the prediction computer system, for each user activity of one or more activities having a lower bound that satisfies a threshold:
a respective first score representing a likelihood that the user activity occurs in the plurality of matching sessions;
a respective second score representing a likelihood that the user activity occurs in any of the user sessions of the user session data;
a respective third score that measures a relative magnitude of the first score compared to the second score;
designating, as the one or more predicted activities, one or more of the user activities having a third score that satisfies a threshold; and
providing, by the prediction computer system and based on the designation of the one or more user activities as the one or more predicted activities, data representing a respective predicted activity for each of the one or more predicted activities.
Dependent claims: 14, 15, 16, 17, 18
Specification