Search results based on user biases on online social networks
First Claim
1. A method comprising, by a computing device:
- accessing, by the computing device, a social graph comprising a plurality of nodes and a plurality of edges connecting the nodes, each of the edges between two of the nodes representing a single degree of separation between them;
receiving, at the computing device from a client device of a first user, a structured query comprising references to one or more selected nodes from the plurality of nodes and one or more selected edges from the plurality of edges;
identifying, by the computing device, one or more nodes of a plurality of second nodes corresponding to the structured query, wherein the plurality of second nodes correspond to an overall population of users associated with an online social network;
calculating, by the computing device, a score for each of the identified nodes, wherein the score is calculated using a probabilistic ranking model that scores each identified node based at least in part on (1) a number of edges connecting the identified node to one or more nodes within a first set of user nodes, the first set of user nodes comprising a first node corresponding to the first user and a plurality of user nodes corresponding to a plurality of second users, respectively, sharing one or more user attributes with the first user, and (2) a number of edges connecting the identified node to one or more nodes within the plurality of second nodes;
generating, by the computing device, one or more search results corresponding to one or more of the identified nodes, respectively, each search result comprising a reference to the corresponding identified node, wherein the one or more search results are biased toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes; and
sending, from the computing device to the client device, responsive to the query, one or more search results for display to the first user, each search result corresponding to an identified node having a calculated score greater than a threshold score.
2 Assignments
0 Petitions
Accused Products
Abstract
In one embodiment, a method includes receiving a query, identifying one or more nodes of a plurality of second nodes corresponding to the query, calculating a score for each of the identified nodes using a probabilistic ranking model that scores each node based at least in part on a number of edges connecting the node to one or more nodes within a first set of user nodes that includes the first node and user nodes corresponding to second users sharing one or more user attributes with the first user, and generating corresponding search results. The score calculated for each of the identified nodes may bias the search results toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes that correspond to an overall population of users of the online social network.
86 Citations
21 Claims
-
1. A method comprising, by a computing device:
-
accessing, by the computing device, a social graph comprising a plurality of nodes and a plurality of edges connecting the nodes, each of the edges between two of the nodes representing a single degree of separation between them; receiving, at the computing device from a client device of a first user, a structured query comprising references to one or more selected nodes from the plurality of nodes and one or more selected edges from the plurality of edges; identifying, by the computing device, one or more nodes of a plurality of second nodes corresponding to the structured query, wherein the plurality of second nodes correspond to an overall population of users associated with an online social network; calculating, by the computing device, a score for each of the identified nodes, wherein the score is calculated using a probabilistic ranking model that scores each identified node based at least in part on (1) a number of edges connecting the identified node to one or more nodes within a first set of user nodes, the first set of user nodes comprising a first node corresponding to the first user and a plurality of user nodes corresponding to a plurality of second users, respectively, sharing one or more user attributes with the first user, and (2) a number of edges connecting the identified node to one or more nodes within the plurality of second nodes; generating, by the computing device, one or more search results corresponding to one or more of the identified nodes, respectively, each search result comprising a reference to the corresponding identified node, wherein the one or more search results are biased toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes; and sending, from the computing device to the client device, responsive to the query, one or more search results for display to the first user, each search result corresponding to an identified node having a calculated score greater than a threshold score.
-
-
2. The method of claim 1, wherein the score calculated for each of the identified nodes biases the search results toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes that correspond to the overall population of users associated with the online social network.
-
3. The method of claim 2, wherein biasing the search results toward nodes connected to disproportionately more nodes in the first set of user nodes ranks the nodes connected to disproportionately more nodes higher than the nodes in the plurality of second nodes that correspond to the overall population of users.
-
4. The method of claim 1, wherein calculating a score for each of the identified nodes comprises, for each node:
-
determining a proportion of users in the first set of user nodes who are connected to the identified node by an edge of a particular type; determining a proportion of users in the overall population of users associated with the online social network who are connected to the identified node by an edge of the particular type; determining a ratio of the proportion of users in the first set of user nodes to the proportion of users in the overall population; and calculating the score based on the ratio.
-
-
5. The method of claim 1, further comprising dampening the calculated scores for each of the identified nodes by subtracting a dampening value from the score of each of the identified nodes.
-
6. The method of claim 5, wherein the dampening value is a percentage of the total number of user nodes in the first set of user nodes.
-
7. The method of claim 1, further comprising identifying the plurality of second users by:
-
comparing one or more user attributes of the first user with one or more user attributes of each user of the overall population of users associated with the online social network; and including in the plurality of second users each user of the overall population of users having at least one user attribute that matches a user attribute of the first user.
-
-
8. The method of claim 1, wherein the one or more user attributes include user age, sex, gender, ethnicity, religion, current location, town lived in, home town, likes, friends, school attended, game played, music listened to, video watched, organization worked at, or a combination thereof.
-
9. The method of claim 1, wherein the one or more user attributes comprise one or more attribute data fields associated with the user node.
-
10. The method of claim 1, wherein the one or more user attributes comprise an attribute edge of an attribute edge type and an attribute node of an attribute node type, wherein the attribute node is connected to the user node by the attribute edge.
-
11. The method of claim 1, further comprising:
-
determining a sub-population proportion based on the number of user nodes in the first set of user nodes that are connected to the particular node by the particular type of edge; and determining an overall population proportion based on the number of user nodes corresponding to the overall population of users associated with the online social network that are connected to the particular node by the particular type of edge, wherein the user nodes corresponding to the plurality of second users are connected to the particular node by the particular type of edge in greater proportion than are user nodes corresponding to an overall population of users when the sub-population proportion is greater than the overall population proportion.
-
-
12. The method of claim 1, wherein the nodes comprise:
-
the first node corresponding to the first user associated with the online social network; and the plurality of second nodes that each correspond to a concept or a second user of the overall population of users associated with the online social network.
-
-
13. The method of claim 1, wherein the probabilistic ranking model comprises a probabilistic TF-IDF ranking model.
-
14. The method of claim 1, further comprising:
-
generating, by the computing device, a refined query based on the structured query, wherein the refined query comprises references to the one or more selected edges from the plurality of edges and the first set of user nodes, the first set of user nodes comprising the first node corresponding to the first user and a plurality of user nodes corresponding to a plurality of second users, respectively, sharing one or more user attributes with the first user; and identifying, by the computing device, one or more nodes of a plurality of second nodes corresponding to the refined query, wherein the plurality of second nodes correspond to the overall population of users associated with the online social network.
-
-
15. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
-
access a social graph comprising a plurality of nodes and a plurality of edges connecting the nodes, each of the edges between two of the nodes representing a single degree of separation between them; receive, from a client device of a first user, a structured query comprising references to one or more selected nodes from the plurality of nodes and one or more selected edges from the plurality of edges; identify one or more nodes of a plurality of second nodes corresponding to the structured query, wherein the plurality of second nodes correspond to an overall population of users associated with an online social network; calculate a score for each of the identified nodes, wherein the score is calculated using a probabilistic ranking model that scores each identified node based at least in part on (1) a number of edges connecting the identified node to one or more nodes within a first set of user nodes, the first set of user nodes comprising a first node corresponding to the first user and a plurality of user nodes corresponding to a plurality of second users, respectively, sharing one or more user attributes with the first user, and (2) a number of edges connecting the identified node to one or more nodes within the plurality of second nodes; generate one or more search results corresponding to one or more of the identified nodes, respectively, each search result comprising a reference to the corresponding identified node, wherein the one or more search results are biased toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes; and send, to the client device, responsive to the query, one or more search results for display to the first user, each search result corresponding to an identified node having a calculated score greater than a threshold score.
-
-
16. The media of claim 15, wherein the software is further operable when executed to calculate a score for each of the identified nodes by scoring nodes connected to disproportionately more nodes higher than nodes in the plurality of second nodes that correspond to the overall population of users associated with the online social network.
-
17. The media of claim 15, wherein the software is further operable when executed to calculate a score for each of the identified nodes by, for each node:
-
determining a proportion of users in the first set of user nodes who are connected to the identified node by an edge of a particular type; determining a proportion of users in the overall population of users associated with the online social network who are connected to the identified node by an edge of the particular type; determining a ratio of the proportion of users in the first set of user nodes to the proportion of users in the overall population; and calculating the score based on the ratio.
-
-
18. The media of claim 15, wherein the software is further operable when executed to identify the plurality of second users by:
-
comparing one or more user attributes of the first user with one or more user attributes of each user of an overall population of users associated with the online social network; and including in the plurality of second users each user of the overall population of users having at least one user attribute that matches a user attribute of the first user.
-
-
19. The media of claim 15, wherein the one or more user attributes include user age, sex, gender, ethnicity, religion, current location, town lived in, home town, likes, friends, school attended, game played, music listened to, video watched, organization worked at, or a combination thereof.
-
20. The media of claim 15, wherein the software is further operable when executed to:
-
determine a sub-population proportion based on the number of user nodes in the first set of user nodes that are connected to the particular node by the particular type of edge; and determine an overall population proportion based on the number of user nodes corresponding to the overall population of users associated with the online social network that are connected to the particular node by the particular type of edge, wherein the user nodes corresponding to the plurality of second users are connected to the particular node by the particular type of edge in greater proportion than are user nodes corresponding to an overall population of users when the sub-population proportion is greater than the overall population proportion.
-
-
21. A system comprising:
- one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to;access a social graph comprising a plurality of nodes and a plurality of edges connecting the nodes, each of the edges between two of the nodes representing a single degree of separation between them; receive, from a client device of a first user, a structured query comprising references to one or more selected nodes from the plurality of nodes and one or more selected edges from the plurality of edges; identify one or more nodes of a plurality of second nodes corresponding to the structured query, wherein the plurality of second nodes correspond to an overall population of users associated with an online social network; calculate a score for each of the identified nodes, wherein the score is calculated using a probabilistic ranking model that scores each identified node based at least in part on (1) a number of edges connecting the identified node to one or more nodes within a first set of user nodes, the first set of user nodes comprising a first node corresponding to the first user and a plurality of user nodes corresponding to a plurality of second users, respectively, sharing one or more user attributes with the first user, and (2) a number of edges connecting the identified node to one or more nodes within the plurality of second nodes; generate one or more search results corresponding to one or more of the identified nodes, respectively, each search result comprising a reference to the corresponding identified node, wherein the one or more search results are biased toward nodes connected to disproportionately more nodes in the first set of user nodes than nodes in the plurality of second nodes; and send, to the client device, responsive to the query, one or more search results for display to the first user, each search result corresponding to an identified node having a calculated score greater than a threshold score.
- one or more processors; and
Specification