Real time content searching in social network
Abstract
Indexing and retrieving real time content in a social networking system is disclosed. A user-term index includes user-term partitions, each user-term partition comprising temporal databases. As a post is received from a user, a user identifier, a post identifier, and a post are extracted. An object store communicatively coupled to a temporal database for recently received content is queried to determine whether terms in the post have already been stored. A term identifier is stored in the user-term index with the user and post identifiers. A forward index stores the post by post identifier. Responsive to a search query, the user-term index is searched by the user's connections and the terms. A real time search engine compiles the results of the user-term index query and retrieves the stored posts from the forward index. The search results may then be ranked and cached before presentation to the searching user.
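The indexing flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Indexer`, `term_id`, `partition_for`, and `NUM_PARTITIONS`, the choice of SHA-1 as the term hash, and whitespace tokenization are all assumptions made for illustration, and the time-ordered shards inside each partition are omitted for brevity.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for the example


def term_id(term: str) -> int:
    """Hash a term to a stable integer identifier (SHA-1 is an assumption)."""
    digest = hashlib.sha1(term.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def partition_for(user_id: int) -> int:
    """Map a user identifier to a partition by modulo of the partition count."""
    return user_id % NUM_PARTITIONS


class Indexer:
    """Toy write path: object store, user-term index, and forward index."""

    def __init__(self):
        # One mapping per partition: user_id -> term_id -> [post_id, ...]
        self.user_term_index = [
            defaultdict(lambda: defaultdict(list)) for _ in range(NUM_PARTITIONS)
        ]
        # One object store per partition: term_id -> term
        self.object_store = [dict() for _ in range(NUM_PARTITIONS)]
        # Forward index: post_id -> post text
        self.forward_index = {}

    def index_post(self, user_id: int, post_id: int, text: str) -> None:
        self.forward_index[post_id] = text
        p = partition_for(user_id)
        for term in set(text.lower().split()):
            tid = term_id(term)
            # Store the term only if the object store has not seen it yet.
            if tid not in self.object_store[p]:
                self.object_store[p][tid] = term
            # Record the term identifier with the user and post identifiers.
            self.user_term_index[p][user_id][tid].append(post_id)
```

For example, after `Indexer().index_post(7, 100, "hello world")`, the post text is stored under post identifier 100 in the forward index, and the entries for both terms are stored in partition `7 % 4 = 3` under user identifier 7.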
20 Claims
1. A computer implemented method comprising:
- receiving a query comprising a term from a user;
- selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of the user in a social networking system, the user-term index comprising time-ordered database shards of records, where data in the user-term index is arranged by user identifier and includes a plurality of user identifiers associated with a plurality of users of a social networking system and a plurality of term identifiers associated with a plurality of terms used by the plurality of users;
- matching the term identifier to corresponding term identifiers in the selected partitions, the selected partitions including user identifiers associated with connections of the user, and wherein the matching identifies post identifiers for posts that include the term and are associated with a connection of the user; and
- retrieving posts from an index using the identified post identifiers, the retrieved posts for presentation to the user.
2. The computer implemented method of claim 1, further comprising:
performing a hash function on the term to generate a term identifier; querying for the term identifier in an object store associated with the most recent database shard in the selected partition of the user-term index; and allocating memory in the object store for the term identifier responsive to not finding the term identifier in the object store.
3. The computer implemented method of claim 1, wherein selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of the user in a social networking system comprises:
performing a hash function to associate the user identifier to a particular partition of the user-term index; and selecting the particular partition of the user-term index.
4. The computer implemented method of claim 3, wherein the hash function takes a modulo of the user identifier by the number of partitions.
5. The computer implemented method of claim 1, wherein the time-ordered database shards of records comprises a shard for each day of the month, a shard for each month of the year, or a shard for each hour of the day.
6. The computer implemented method of claim 1, wherein the number of shards fluctuates over time.
7. The computer implemented method of claim 1, further comprising:
determining that all of the shards are filled to capacity; and creating a new shard, and setting the new shard as the most recent database shard.
8. The computer implemented method of claim 1, further comprising:
deleting the oldest shard, of the time-ordered database shards of records; and deleting an object store associated with the oldest shard.
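The query path recited in claims 1, 3, and 4 can be sketched as follows. This is an illustrative reading only: the function name `search`, the nested-dictionary layout of each partition, and the sample data are assumptions, the term is taken to be already hashed to a term identifier, and shards within a partition are collapsed into a single mapping for brevity.

```python
from collections import defaultdict


def search(term_id, connection_ids, partitions, forward_index):
    """Return posts by the user's connections that contain the given term.

    partitions: list of dicts, each mapping user_id -> term_id -> [post_id, ...]
    forward_index: dict mapping post_id -> post text
    """
    # Select only the partitions that hold at least one connection,
    # using modulo of the user identifier by the number of partitions.
    by_partition = defaultdict(list)
    for uid in connection_ids:
        by_partition[uid % len(partitions)].append(uid)

    # Match the term identifier within the selected partitions.
    post_ids = []
    for p, uids in by_partition.items():
        records = partitions[p]
        for uid in uids:
            post_ids.extend(records.get(uid, {}).get(term_id, []))

    # Retrieve the posts from the forward index by post identifier.
    return [forward_index[pid] for pid in post_ids if pid in forward_index]


# Hypothetical sample data: four partitions, users 4, 5, 6, and 9.
example_partitions = [
    {4: {11: [100]}},                   # partition 0: user 4
    {5: {11: [101]}, 9: {11: [103]}},   # partition 1: users 5 and 9
    {6: {12: [102]}},                   # partition 2: user 6
    {},                                 # partition 3: empty
]
example_forward_index = {100: "post a", 101: "post b", 102: "post c", 103: "post d"}
example_results = search(11, [4, 5, 6], example_partitions, example_forward_index)
```

In this sample, user 9 also used term 11, but post 103 is not returned because user 9 is not among the searching user's connections.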
9. A computer implemented method comprising:
- selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of a user in a social networking system, the user-term index comprising time-ordered database shards of records, where data in the user-term index is arranged by user identifier and includes a plurality of user identifiers associated with a plurality of users of a social networking system and a plurality of term identifiers associated with a plurality of terms used by the plurality of users;
- matching the term identifier to corresponding term identifiers in the selected partitions, the selected partitions including user identifiers associated with connections of the user, and wherein the matching identifies post identifiers for posts that include the term and are associated with a connection of the user; and
- retrieving posts from an index using the identified post identifiers, the retrieved posts for presentation to the user.
10. The computer implemented method of claim 9, further comprising:
performing a hash function on the term to generate a term identifier; querying for the term identifier in an object store associated with the most recent database shard in the selected partition of the user-term index; and allocating memory in the object store for the term identifier responsive to not finding the term identifier in the object store.
11. The computer implemented method of claim 9, wherein selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of the user in a social networking system comprises:
performing a hash function to associate the user identifier to a particular partition of the user-term index; and selecting the particular partition of the user-term index.
12. The computer implemented method of claim 11, wherein the hash function takes a modulo of the user identifier by the number of partitions.
13. The computer implemented method of claim 9, wherein the time-ordered database shards of records comprises a shard for each day of the month, a shard for each month of the year, or a shard for each hour of the day.
14. The computer implemented method of claim 9, further comprising:
determining that all of the shards are filled to capacity; and creating a new shard, and setting the new shard as the most recent database shard.
15. The computer implemented method of claim 14, further comprising:
deleting the oldest shard, of the time-ordered database shards of records; and deleting an object store associated with the oldest shard.
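The shard lifecycle in claims 14 and 15 (creating a new most recent shard when all shards are full, and deleting the oldest shard together with its object store) might look like the sketch below. The classes `Shard` and `Partition`, the record-count capacity, and the `max_shards` retention limit that triggers deletion are assumptions; the claims do not state what triggers deletion of the oldest shard.

```python
from collections import deque


class Shard:
    """One time-ordered shard with its associated object store."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []       # (user_id, term_id, post_id) tuples
        self.object_store = {}  # term_id -> term

    def is_full(self):
        return len(self.records) >= self.capacity


class Partition:
    """Holds shards ordered oldest (left) to most recent (right)."""

    def __init__(self, shard_capacity, max_shards):
        self.shard_capacity = shard_capacity
        self.max_shards = max_shards  # assumed retention limit
        self.shards = deque([Shard(shard_capacity)])

    def add(self, user_id, term_id, post_id, term):
        # When every shard is filled to capacity, create a new shard
        # and treat it as the most recent one.
        if all(shard.is_full() for shard in self.shards):
            self.shards.append(Shard(self.shard_capacity))
            # Dropping the oldest shard also drops its object store,
            # because the store lives on the shard object.
            if len(self.shards) > self.max_shards:
                self.shards.popleft()
        latest = self.shards[-1]
        latest.object_store.setdefault(term_id, term)
        latest.records.append((user_id, term_id, post_id))


# Hypothetical usage: capacity of 2 records per shard, at most 2 shards kept.
example_partition = Partition(shard_capacity=2, max_shards=2)
for i in range(5):
    example_partition.add(1, 10 + i, 100 + i, f"t{i}")
```

After the five writes above, the first shard has been deleted, the remaining older shard holds the third and fourth records, and the most recent shard holds the fifth.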
16. A computer program product comprising a non-transitory computer-readable storage medium having instructions encoded thereon that, when executed by a processor, cause the processor to perform steps comprising:
- receiving a query comprising a term from a user;
- selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of the user in a social networking system, the user-term index comprising time-ordered database shards of records, where data in the user-term index is arranged by user identifier and includes a plurality of user identifiers associated with a plurality of users of a social networking system and a plurality of term identifiers associated with a plurality of terms used by the plurality of users;
- matching the term identifier to corresponding term identifiers in the selected partitions, the selected partitions including user identifiers associated with connections of the user, and wherein the matching identifies post identifiers for posts that include the term and are associated with a connection of the user; and
- retrieving posts from an index using the identified post identifiers, the retrieved posts for presentation to the user.
17. The computer program product of claim 16, wherein the instructions further cause the processor to perform the steps comprising:
performing a hash function on the term to generate the term identifier; querying for the term identifier in an object store associated with the most recent database shard in the selected partition of the user-term index; and allocating memory in the object store for the term identifier responsive to not finding the term identifier in the object store.
18. The computer program product of claim 16, wherein selecting from a plurality of partitions of a user-term index, partitions of the user-term index that are associated with connections of the user in a social networking system comprises:
performing a hash function to associate the user identifier to a particular partition of the user-term index; and selecting the particular partition of the user-term index.
19. The computer program product of claim 18, wherein the hash function takes a modulo of the user identifier by the number of partitions.
20. The computer program product of claim 16, wherein the time-ordered database shards of records comprises a shard for each day of the month, a shard for each month of the year, or a shard for each hour of the day.
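The time-based shard layouts recited in claims 5, 13, and 20 (a shard per hour of the day, per day of the month, or per month of the year) could be keyed as in the sketch below. The function name `shard_key` and the granularity labels are assumptions for illustration; the claims do not specify how a timestamp is mapped to a shard.

```python
from datetime import datetime, timezone


def shard_key(timestamp: datetime, granularity: str) -> int:
    """Return the shard slot a record would fall into for a given layout."""
    if granularity == "hour":
        return timestamp.hour    # 0..23, one shard per hour of the day
    if granularity == "day":
        return timestamp.day     # 1..31, one shard per day of the month
    if granularity == "month":
        return timestamp.month   # 1..12, one shard per month of the year
    raise ValueError(f"unknown granularity: {granularity!r}")


# Hypothetical post timestamp used for the example.
example_ts = datetime(2011, 3, 14, 15, 9, tzinfo=timezone.utc)
example_keys = {g: shard_key(example_ts, g) for g in ("hour", "day", "month")}
```

With the sample timestamp, the record would land in hour slot 15, day slot 14, or month slot 3, depending on the layout chosen.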
Specification