Granular Forward Indexes on Online Social Networks
First Claim
1. A method comprising, by one or more computing devices of an online social network:
- receiving, from a client device of a user of the online social network, a search query comprising one or more n-grams;
- accessing a search index comprising a forward index and an inverted index each having one or more records, wherein each record of the forward index comprises:
  - one or more first fields corresponding to one or more tokens of user-inputted content of an object; and
  - one or more second fields corresponding to one or more tokens of third-party content linked to the object;
- searching the inverted index to identify one or more objects having one or more tokens that match one or more of the n-grams of the search query;
- scoring each identified object based at least in part on whether the tokens of the object that match the n-grams of the search query correspond to one of the first fields or one of the second fields; and
- sending, to the client device of the user in response to the received search query, a search-results page comprising one or more search results for display to the user, wherein each search result references an identified object having a score greater than a threshold score.
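The steps of the first claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ForwardRecord` class, the `build_inverted_index` and `search` helpers, the two weights, and the threshold value are all hypothetical names and numbers chosen for illustration, and the claim itself does not say which field should count for more.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ForwardRecord:
    """One forward-index record: tokens kept in separate fields by origin."""
    object_id: str
    user_tokens: list = field(default_factory=list)         # "first fields"
    third_party_tokens: list = field(default_factory=list)  # "second fields"


def build_inverted_index(forward_index):
    """Map each token to the ids of the objects that contain it."""
    inverted = defaultdict(set)
    for record in forward_index.values():
        for token in record.user_tokens + record.third_party_tokens:
            inverted[token].add(record.object_id)
    return inverted


def search(query_ngrams, forward_index, inverted_index,
           user_weight=2.0, third_party_weight=1.0, threshold=1.5):
    """Return (object_id, score) pairs whose score exceeds the threshold.

    The weights and threshold are assumed values, not taken from the claim.
    """
    # Step 1: candidate objects come from the inverted index.
    candidates = set()
    for ngram in query_ngrams:
        candidates |= inverted_index.get(ngram, set())

    # Step 2: score each candidate by the field each matching token sits in.
    results = []
    for object_id in candidates:
        record = forward_index[object_id]
        score = 0.0
        for ngram in query_ngrams:
            if ngram in record.user_tokens:
                score += user_weight
            elif ngram in record.third_party_tokens:
                score += third_party_weight
        # Step 3: keep only objects scoring above the threshold.
        if score > threshold:
            results.append((object_id, score))
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


forward_index = {
    "post1": ForwardRecord("post1", ["election", "results"], ["senate", "vote"]),
    "post2": ForwardRecord("post2", ["lunch"], ["election", "coverage"]),
}
inverted_index = build_inverted_index(forward_index)
hits = search(["election", "results"], forward_index, inverted_index)
```

In this toy data, `post1` matches both n-grams in its user-inputted fields, while `post2` matches only one n-gram in its third-party field and falls below the assumed threshold. Keeping the two token sources in separate fields of the forward record is what lets the scorer tell those cases apart.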
Abstract
In one embodiment, a social-networking system may access an enhanced search index of an online social network. The enhanced search index may include data from a social graph having a plurality of nodes and a plurality of edges connecting the nodes, where the nodes comprise a plurality of internal nodes corresponding to entities associated with the online social network, and a plurality of external nodes corresponding to objects associated with a third-party system. The social-networking system may then search the enhanced search index in response to a query received from a user to identify objects that substantially match the query. Each identified object may be scored by the social-networking system based at least in part on a connectivity of the corresponding external node to the one or more internal nodes. In response to the query, the social-networking system may send a search-results page referencing objects based on their scores.
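As a rough illustration of the connectivity-based scoring the abstract mentions, the sketch below counts how many internal nodes are directly linked to each external node. The abstract does not define "connectivity"; treating it as a direct-neighbor count is an assumption, and `connectivity_score` and the sample node names are hypothetical.

```python
def connectivity_score(external_node, edges, internal_nodes):
    """Count distinct internal nodes directly connected to an external node.

    Assumption: connectivity is measured as a simple neighbor count.
    """
    neighbors = set()
    for a, b in edges:
        if a == external_node and b in internal_nodes:
            neighbors.add(b)
        elif b == external_node and a in internal_nodes:
            neighbors.add(a)
    return len(neighbors)


# Internal nodes stand for entities of the social network;
# "article_1" and "article_2" stand for third-party objects.
internal_nodes = {"alice", "bob", "carol"}
edges = [
    ("alice", "article_1"),
    ("bob", "article_1"),
    ("carol", "article_2"),
    ("alice", "bob"),  # internal-to-internal edge, ignored by the score
]

scores = {
    node: connectivity_score(node, edges, internal_nodes)
    for node in ("article_1", "article_2")
}
# Rank external objects from most to least connected.
ranked = sorted(scores, key=lambda node: (-scores[node], node))
```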
18 Claims
1. A method comprising, by one or more computing devices of an online social network:
- receiving, from a client device of a user of the online social network, a search query comprising one or more n-grams;
- accessing a search index comprising a forward index and an inverted index each having one or more records, wherein each record of the forward index comprises:
  - one or more first fields corresponding to one or more tokens of user-inputted content of an object; and
  - one or more second fields corresponding to one or more tokens of third-party content linked to the object;
- searching the inverted index to identify one or more objects having one or more tokens that match one or more of the n-grams of the search query;
- scoring each identified object based at least in part on whether the tokens of the object that match the n-grams of the search query correspond to one of the first fields or one of the second fields; and
- sending, to the client device of the user in response to the received search query, a search-results page comprising one or more search results for display to the user, wherein each search result references an identified object having a score greater than a threshold score.
2. The method of claim 1, further comprising:
- determining, for each identified object, whether the user has permission to view the user-inputted content of the identified object based on the privacy setting associated with the user-inputted content; and
- filtering identified objects by removing a reference of identified objects from the search-results page based on determining the user is denied permission to view the user-inputted content.
3. The method of claim 2, wherein determining whether the user has permission to view the user-inputted content of each identified object comprises:
- determining whether one or more tokens matching the n-grams of the search query correspond to one or more of the tokens of the first fields; and
- accessing a privacy setting of one or more users associated with the user-inputted content.
4. The method of claim 1, wherein scoring each identified object is further based on a percentage of tokens of the record corresponding to the identified object that match the n-grams of the search query.
5. The method of claim 1, wherein scoring each identified object comprises:
- accessing the forward index for each identified object; and
- determining, for each identified object, a number of the tokens of the record corresponding to the identified object that match the n-grams of the search query relative to a total number of n-grams of the search query.
6. The method of claim 1, wherein the scoring further comprises determining a weighting for matching first-field tokens relative to matching second-field tokens of the record corresponding to the identified object.
7. The method of claim 6, wherein the determination of the weighting is performed using a machine learning algorithm.
8. The method of claim 6, wherein the determination of the weighting is based on a clickthrough rate (CTR) of the references of the search-results page.
9. The method of claim 1, wherein fields of each record of the inverted index comprise:
- a token-field corresponding to a token of the user-inputted content or a token of the third-party content; and
- an identifier-field corresponding to identifying information of one or more objects with a portion of its content matching the token of the inverted index record.
10. The method of claim 9, further comprising:
- identifying one or more of the inverted index records based on matching one or more of the n-grams to the token of one or more of the inverted index records; and
- generating one or more of the search results based at least in part on identifying information of the identified inverted index records.
11. The method of claim 1, wherein:
- the object is a post on the online social network;
- the third-party content comprises an embedded article;
- one or more of the tokens of the first fields correspond to user content of the post, a reshared post, identifying information of one or more other users, social signals or any combination thereof; and
- one or more of the tokens of the second fields correspond to textual content, title, description, position offset, author identifier, or any combination thereof.
12. The method of claim 11, wherein the social signals comprise one or more likes, reshares, comments, or any combination thereof.
13. The method of claim 11, wherein the position offset indicates a location of a secondary title of the third-party embedded article.
14. The method of claim 1, wherein the third-party content is stored in a structure that is separate from the object.
15. The method of claim 1, wherein the user-inputted content is associated with a post, a comment, or a social-graph tag.
16. The method of claim 1, further comprising generating a query command comprising one or more query constraints, wherein one or more of the query constraints comprise matching the n-grams of the search query to the tokens of the first fields and the tokens of the second fields.
17. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
- receive, from a client device of a user of the online social network, a search query comprising one or more n-grams;
- access a search index comprising a forward index and an inverted index each having one or more records, wherein each record of the forward index comprises:
  - one or more first fields corresponding to one or more tokens of user-inputted content of an object; and
  - one or more second fields corresponding to one or more tokens of third-party content linked to the object;
- search the inverted index to identify one or more objects having one or more tokens that match one or more of the n-grams of the search query;
- score each identified object based at least in part on whether the tokens of the object that match the n-grams of the search query correspond to one of the first fields or one of the second fields; and
- send, to the client device of the user in response to the received search query, a search-results page comprising one or more search results for display to the user, wherein each search result references an identified object having a score greater than a threshold score.
18. A system comprising:
- one or more processors; and
- a memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to:
  - receive, from a client device of a user of the online social network, a search query comprising one or more n-grams;
  - access a search index comprising a forward index and an inverted index each having one or more records, wherein each record of the forward index comprises:
    - one or more first fields corresponding to one or more tokens of user-inputted content of an object; and
    - one or more second fields corresponding to one or more tokens of third-party content linked to the object;
  - search the inverted index to identify one or more objects having one or more tokens that match one or more of the n-grams of the search query;
  - score each identified object based at least in part on whether the tokens of the object that match the n-grams of the search query correspond to one of the first fields or one of the second fields; and
  - send, to the client device of the user in response to the received search query, a search-results page comprising one or more search results for display to the user, wherein each search result references an identified object having a score greater than a threshold score.
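Several dependent claims refine the scoring and filtering steps. The Python sketch below illustrates three of them under stated assumptions: the privacy filter of claim 2, the match ratio of claim 5, and the field weighting of claim 6. All function names, the `allowed_viewers` representation of a privacy setting, and the weight values are hypothetical. The sketch does not model the first-field check of claim 3 or how the weights of claims 7 and 8 would be learned.

```python
def query_coverage(record_tokens, query_ngrams):
    """Claim 5 style ratio: matched query n-grams over total query n-grams.

    Assumption: each n-gram counts once, however often it occurs in the record.
    """
    if not query_ngrams:
        return 0.0
    token_set = set(record_tokens)
    matched = sum(1 for ngram in query_ngrams if ngram in token_set)
    return matched / len(query_ngrams)


def weighted_score(user_tokens, third_party_tokens, query_ngrams,
                   w_user, w_third_party):
    """Claim 6 style: weight first-field matches relative to second-field ones."""
    query = set(query_ngrams)
    user_hits = len(set(user_tokens) & query)
    third_party_hits = len(set(third_party_tokens) & query)
    return w_user * user_hits + w_third_party * third_party_hits


def filter_by_privacy(candidates, viewer):
    """Claim 2 style: drop objects whose user-inputted content the viewer may not see.

    Assumption: a privacy setting is either None (public) or a set of user ids.
    """
    visible = []
    for obj in candidates:
        allowed = obj["allowed_viewers"]
        if allowed is None or viewer in allowed:
            visible.append(obj["id"])
    return visible


user_tokens = ["election", "night"]
third_party_tokens = ["results", "senate", "vote"]

full = query_coverage(user_tokens + third_party_tokens, ["election", "results"])
half = query_coverage(user_tokens + third_party_tokens, ["election", "recount"])
score = weighted_score(user_tokens, third_party_tokens,
                       ["election", "results"], w_user=2.0, w_third_party=1.0)

candidates = [
    {"id": "post1", "allowed_viewers": None},
    {"id": "post2", "allowed_viewers": {"bob"}},
]
seen_by_alice = filter_by_privacy(candidates, "alice")
seen_by_bob = filter_by_privacy(candidates, "bob")
```

Filtering on the privacy setting after candidate retrieval means the index itself can stay shared across viewers, with permission applied per query.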
Specification