REAL TIME CONTENT SEARCHING IN SOCIAL NETWORK
Abstract
Indexing and retrieving real time content in a social networking system is disclosed. A user-term index includes user-term partitions, each user-term partition comprising temporal databases. As a post is received from a user, a user identifier, a post identifier, and a post are extracted. An object store communicatively coupled to a temporal database for recently received content is queried to determine whether terms in the post have already been stored. A term identifier is stored in the user-term index with the user and post identifiers. A forward index stores the post by post identifier. Responsive to a search query, the user-term index is searched by the user's connections and the terms. A real time search engine compiles the results of the user-term index query and retrieves the stored posts from the forward index. The search results may then be ranked and cached before presentation to the searching user.
24 Claims
1. A method for indexing information received by a social networking system, the method comprising:
receiving a post from a user;
determining a user identifier associated with the user, a post identifier associated with the post, and a term identifier associated with a term in the post;
selecting a partition of a user-term index that is associated with the user identifier from among a plurality of partitions of the user-term index; and
indexing the term of the post and the post identifier into the selected partition of the user-term index based upon the user identifier and term identifier.
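By way of non-limiting illustration, the indexing steps recited in claim 1 might be sketched in Python as follows. The class `UserTermIndex`, the helper `stable_hash`, the modulo partitioning rule, the partition count, and the whitespace tokenization are all assumptions made for this example; the claim does not prescribe any of them.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count; the claim does not fix one


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash of a string (assumed choice: SHA-1 prefix)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class UserTermIndex:
    """Hypothetical in-memory user-term index split into partitions."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS):
        # Each partition maps (user_id, term_id) -> list of post identifiers.
        self.partitions = [defaultdict(list) for _ in range(num_partitions)]

    def select_partition(self, user_id: int) -> int:
        # Assumed rule tying a user identifier to exactly one partition.
        return user_id % len(self.partitions)

    def index_post(self, user_id: int, post_id: int, text: str) -> None:
        partition = self.partitions[self.select_partition(user_id)]
        # Assumed tokenization: lowercase, split on whitespace, one entry per term.
        for term in set(text.lower().split()):
            term_id = stable_hash(term)
            partition[(user_id, term_id)].append(post_id)
```

Under these assumptions, every record for a given user lands in a single partition, so a later query scoped to particular users only needs to consult the partitions those users map to.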
2. The method of claim 1, wherein indexing the term of the post and the post identifier into the selected partition of the user-term index comprises:
selecting a record in a most recent database shard of the selected partition of the user-term index, the record comprising the user identifier, the term identifier, and a list of post identifiers; and
adding the post identifier into the list of post identifiers of the selected record in the most recent database shard.
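The record update of claim 2 might look like the following minimal sketch. The `Partition` class, the `rotate` method, the `max_shards` limit, the use of a dictionary per shard, and the creation of a record when none exists are assumptions for illustration only.

```python
from collections import deque


class Partition:
    """Hypothetical partition holding time-ordered shards, oldest first."""

    def __init__(self, max_shards: int = 3):
        # Each shard maps (user_id, term_id) -> list of post identifiers.
        # A bounded deque discards the oldest shard once max_shards is exceeded.
        self.shards = deque([{}], maxlen=max_shards)

    def rotate(self) -> None:
        # Open a new, empty shard that becomes the most recent one.
        self.shards.append({})

    def add_post(self, user_id: int, term_id: int, post_id: int) -> None:
        newest = self.shards[-1]
        # Select the record for (user, term), creating it if absent (assumed).
        record = newest.setdefault((user_id, term_id), [])
        record.append(post_id)
```

Writing only to the newest shard keeps recent content together, which is consistent with the claims that read from "a most recent database shard" at query time.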
3. The method of claim 2, wherein determining the term identifier comprises:
performing a hash function on the term to generate the term identifier;
querying for the term identifier in an object store associated with the most recent database shard in the selected partition of the user-term index; and
allocating memory in the object store for the term identifier responsive to not finding the term identifier in the object store.
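One possible reading of claim 3 is sketched below. The `ObjectStore` class and its `get_or_allocate` method are hypothetical, CRC-32 is merely one convenient hash function, and a dictionary entry stands in for "allocating memory"; hash collisions are not handled in this sketch.

```python
import zlib
from typing import Dict, Tuple


class ObjectStore:
    """Hypothetical object store associated with the most recent shard."""

    def __init__(self) -> None:
        self._terms: Dict[int, str] = {}  # term_id -> term text

    def get_or_allocate(self, term: str) -> Tuple[int, bool]:
        # Step 1: hash the term to generate the term identifier.
        term_id = zlib.crc32(term.encode("utf-8"))
        # Step 2: query the store for that identifier.
        if term_id in self._terms:
            return term_id, False
        # Step 3: not found, so allocate an entry for it.
        self._terms[term_id] = term
        return term_id, True
```

The boolean in the return value reports whether a new entry was allocated, which lets a caller distinguish a first-seen term from one already stored.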
4. The method of claim 1, wherein selecting the partition of the user-term index that is associated with the user identifier comprises:
performing a hash function to associate the user identifier to a particular partition of the user-term index; and
selecting the particular partition of the user-term index.
5. A method for retrieving information stored in a social networking system, the method comprising:
receiving a query comprising a term from a user;
determining a user identifier of the user and a term identifier associated with the term;
gathering post identifiers that are associated with the term identifier in a user-term index, the user-term index comprising time-ordered database shards of records; and
retrieving posts from a forward index based upon the gathered post identifiers for presentation to the user.
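The retrieval path of claim 5 might be sketched as two small functions. The function names, the representation of each shard as a dictionary keyed by term identifier, the newest-first ordering of results, and the silent skipping of identifiers missing from the forward index are assumptions of this example.

```python
from typing import Dict, List


def gather_post_ids(shards: List[Dict[int, List[int]]], term_id: int) -> List[int]:
    """Collect post identifiers for a term from shards ordered oldest -> newest."""
    gathered: List[int] = []
    # Walk shards from newest to oldest so recent posts come first (assumed).
    for shard in reversed(shards):
        gathered.extend(reversed(shard.get(term_id, [])))
    return gathered


def retrieve_posts(forward_index: Dict[int, str], post_ids: List[int]) -> List[str]:
    """Look up stored posts by post identifier, skipping unknown identifiers."""
    return [forward_index[pid] for pid in post_ids if pid in forward_index]
```

For example, with `shards = [{5: [1, 2]}, {5: [3]}]`, gathering for term identifier `5` yields the identifiers in the order 3, 2, 1, which are then resolved against the forward index.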
6. The method of claim 5, further comprising:
determining user identifiers of connections of the user; and
selecting partitions that are associated with the user identifiers of the connections of the user from among a plurality of partitions of the user-term index, wherein gathering post identifiers comprises gathering post identifiers that are associated with the term identifier in the selected partitions of the user-term index.
7. The method of claim 5, wherein gathering the post identifiers that are associated with the term identifier in the user-term index comprises:
selecting a record in a most recent database shard of the user-term index, the record comprising the term identifier and a list of post identifiers; and
compiling the lists of post identifiers.
8. The method of claim 6, wherein gathering the post identifiers that are associated with the term identifier in the selected partitions of the user-term index comprises:
for each of the selected partitions, selecting a record in a most recent database shard of the selected partition, the record comprising a user identifier matching a user identifier for at least one of the connections, the term identifier, and a list of post identifiers; and
compiling the lists of post identifiers.
9. The method of claim 6, wherein selecting partitions that are associated with the user identifiers of the connections of the user comprises:
performing a hash function to associate each of the connections' user identifiers to respective partitions of the user-term index; and
selecting the respective partitions of the user-term index.
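Claims 6, 8, and 9 together describe a search restricted to the partitions of the user's connections. A minimal sketch under stated assumptions follows: `partition_for` uses a modulo rule as a stand-in for the recited hash function, each partition is a list of shards ordered oldest to newest, and each shard is a dictionary keyed by `(user_id, term_id)`. None of these names or layouts come from the claims.

```python
from typing import Dict, List, Tuple

Shard = Dict[Tuple[int, int], List[int]]  # (user_id, term_id) -> post ids


def partition_for(user_id: int, num_partitions: int) -> int:
    # Assumed stand-in for the hash function that maps a user to a partition.
    return user_id % num_partitions


def search_connections(
    partitions: List[List[Shard]], connection_ids: List[int], term_id: int
) -> List[int]:
    # Group connections by the partition each one maps to.
    by_partition: Dict[int, List[int]] = {}
    for uid in connection_ids:
        idx = partition_for(uid, len(partitions))
        by_partition.setdefault(idx, []).append(uid)

    compiled: List[int] = []
    for idx, uids in by_partition.items():
        newest = partitions[idx][-1]  # most recent shard of this partition
        for uid in uids:
            # Only records whose user identifier matches a connection are read.
            compiled.extend(newest.get((uid, term_id), []))
    return compiled
```

Partitions that no connection maps to are never touched, and records belonging to users outside the connection list are ignored even when they share a partition with a connection.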
10. The method of claim 5, wherein the presentation of the retrieved posts to the user comprises applying a ranking to the retrieved posts.
11. The method of claim 10, wherein the ranking comprises ranking the retrieved posts based at least in part on reputations of users.
12. The method of claim 10, wherein the ranking comprises ranking the retrieved posts based at least in part on popularity of users.
13. The method of claim 10, wherein the ranking comprises ranking the retrieved posts based at least in part on similarity of users.
14. The method of claim 10, wherein the ranking comprises ranking the retrieved posts based at least in part on proximity of users.
15. The method of claim 10, wherein the ranking comprises ranking the retrieved posts based at least in part on affinities.
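Claims 10 through 15 each name a signal on which ranking may be based, without saying how signals are computed or combined. Purely as an illustration, the sketch below combines them in a weighted sum; the `ScoredPost` type, the `WEIGHTS` values, the assumption that every signal is already normalized to the range 0 to 1, and the linear combination itself are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredPost:
    post_id: int
    reputation: float  # each signal assumed pre-normalized to [0, 1]
    popularity: float
    similarity: float
    proximity: float
    affinity: float


# Assumed equal weights; the claims do not specify any weighting.
WEIGHTS: Dict[str, float] = {
    "reputation": 0.2,
    "popularity": 0.2,
    "similarity": 0.2,
    "proximity": 0.2,
    "affinity": 0.2,
}


def rank(posts: List[ScoredPost], weights: Dict[str, float] = WEIGHTS) -> List[ScoredPost]:
    def score(post: ScoredPost) -> float:
        # Weighted sum over whichever signals appear in `weights`.
        return sum(w * getattr(post, name) for name, w in weights.items())

    # Highest score first.
    return sorted(posts, key=score, reverse=True)
```

Passing a `weights` mapping with a single key, such as `{"reputation": 1.0}`, ranks on that one signal alone, which corresponds to any one of the dependent claims taken by itself.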
16. The method of claim 5, further comprising:
storing the retrieved posts in a global cache; and
responsive to receiving subsequent queries matching the query, retrieving the posts from the global cache.
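The caching behavior of claim 16 might be sketched as a thin wrapper around a search function. The `GlobalCache` class, the choice of `(user_id, normalized term)` as the cache key, and the case-insensitive notion of a "matching" query are assumptions; expiry of stale entries, which a real time system would likely need, is omitted here.

```python
from typing import Callable, Dict, List, Tuple


class GlobalCache:
    """Hypothetical cache of search results keyed by user and query term."""

    def __init__(self, search_fn: Callable[[int, str], List[str]]):
        self._search_fn = search_fn
        self._entries: Dict[Tuple[int, str], List[str]] = {}
        self.misses = 0

    def query(self, user_id: int, term: str) -> List[str]:
        # Assumed matching rule: same user, same term ignoring case and padding.
        key = (user_id, term.strip().lower())
        if key not in self._entries:
            # Cache miss: run the underlying search and store the result.
            self.misses += 1
            self._entries[key] = self._search_fn(user_id, term)
        return self._entries[key]
```

The user identifier is part of the key in this sketch because, under claim 6, results depend on the searching user's connections; a cache keyed on the term alone would be a different design choice.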
17. The method of claim 6, wherein the connections comprise other users of the social networking system that are connected to the user.
18. The method of claim 5, wherein the presentation of the retrieved posts to the user is limited by privacy settings.
19. A system for indexing information received from users of a social networking system, the system comprising:
a server configured to receive a post from a user;
a real time search engine comprising an indexing module configured to determine a post identifier of the post and a term identifier of a term in the post and an aggregator module configured to determine a user identifier of the user;
a forward index configured to store the received post based upon the post identifier; and
a user-term index comprising a plurality of partitions, wherein the aggregator module is further configured to select a partition of the user-term index that is associated with the user identifier from among the plurality of partitions of the user-term index and the indexing module is further configured to index the term of the post and the post identifier based upon the user identifier and the term identifier.
20. The system of claim 19, wherein the indexing module is further configured to:
select a record in a most recent database shard of the selected partition of the user-term index, the record comprising the user identifier, the term identifier, and a list of post identifiers; and
add the post identifier into the list of post identifiers of the selected record in the most recent database shard.
21. A system for retrieving information in a social networking system, the system comprising:
a server configured to receive a query comprising a term from a user;
a real time search engine, communicatively coupled to the server, comprising an aggregator module configured to determine a user identifier of the user and a term identifier associated with the term;
a forward index configured to store posts based upon post identifiers; and
a user-term index comprising time-ordered database shards of records, wherein the aggregator module is further configured to gather post identifiers that are associated with the term identifier in the user-term index, and retrieve posts from the forward index based upon the gathered post identifiers for presentation to the user.
22. The system of claim 21, wherein the aggregator module is further configured to determine user identifiers of connections of the user and select partitions that are associated with the user identifiers of the connections of the user from among a plurality of partitions of the user-term index, wherein the aggregator module gathers post identifiers that are associated with the term identifier in the selected partitions of the user-term index.
23. The system of claim 21, wherein the aggregator module is further configured to:
select a record in a most recent database shard of the user-term index, the record comprising the term identifier and a list of post identifiers; and
compile the lists of post identifiers.
24. The system of claim 22, wherein the aggregator module is further configured to:
for each of the selected partitions, select a record in a most recent database shard of the selected partition, the record comprising a user identifier matching one of the user identifiers of the user's connections, the term identifier, and a list of post identifiers; and
compile the lists of post identifiers.
Specification