Social-based spelling correction for online social networks
First Claim
1. A method comprising, by one or more computing devices:
- receiving, from a client device of a user of an online social network, a search query comprising one or more n-grams;
determining, by a misspelled classifier component, for each n-gram, if a first bloom filter indicates the n-gram does exist or does not exist in a first set of object names associated with a first vertical;
identifying, by a variant-tokens generation component, for each n-gram that does not exist in the first set of object names, one or more variant-tokens based at least on the first bloom filter and the first set of object names;
generating, by a phrase selection component, one or more unique combinations of the n-grams and variant-tokens, where each unique combination includes;
a token corresponding to each n-gram that does exist in the first set of object names; and
a variant-token corresponding to each n-gram that does not exist in the first set of object names for the n-gram;
calculating, by a phrase confidence scoring component, a confidence score for each unique combination based at least in part on the search query and whether the unique combination exists in the first set of object names;
identifying objects matching each unique combination having a confidence score greater than a threshold confidence score; and
sending, to the client device of the user, a search-results page responsive to the search query, the search-results page comprising one or more references to one or more of the identified objects, respectively.
2 Assignments
0 Petitions
Accused Products
Abstract
In one embodiment, a method includes receiving a search query, determining if a bloom filter indicates an n-gram of the query does not exist in a set of object names associated with a vertical, identifying variant-tokens for each n-gram that does not exist in the set of object names, generating unique combinations of the n-grams and variant-tokens, where each unique combination includes a variant-token corresponding to each n-gram that does not exist in the set of object names for the n-gram, calculating a confidence score for each unique combination based at least in part on the search query and whether the unique combination exists in the set of object names, identifying objects matching each unique combination, where the unique combination has a confidence score greater than a threshold confidence score, and sending a search-results page responsive to the search query to the client device of the user.
-
Citations
20 Claims
-
1. A method comprising, by one or more computing devices:
-
receiving, from a client device of a user of an online social network, a search query comprising one or more n-grams; determining, by a misspelled classifier component, for each n-gram, if a first bloom filter indicates the n-gram does exist or does not exist in a first set of object names associated with a first vertical; identifying, by a variant-tokens generation component, for each n-gram that does not exist in the first set of object names, one or more variant-tokens based at least on the first bloom filter and the first set of object names; generating, by a phrase selection component, one or more unique combinations of the n-grams and variant-tokens, where each unique combination includes; a token corresponding to each n-gram that does exist in the first set of object names; and a variant-token corresponding to each n-gram that does not exist in the first set of object names for the n-gram; calculating, by a phrase confidence scoring component, a confidence score for each unique combination based at least in part on the search query and whether the unique combination exists in the first set of object names; identifying objects matching each unique combination having a confidence score greater than a threshold confidence score; and sending, to the client device of the user, a search-results page responsive to the search query, the search-results page comprising one or more references to one or more of the identified objects, respectively. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
-
-
19. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
-
receive, from a client device of a user of an online social network, a search query comprising one or more n-grams; determine, by a misspelled classifier component, for each n-gram, if a first bloom filter indicates the n-gram does exist or does not exist in a first set of object names associated with a first vertical; identify, by a variant-tokens generation component, for each n-gram that does not exist in the first set of object names, one or more variant-tokens based at least on the first bloom filter and the first set of object names; generate, by a phrase selection component, one or more unique combinations of the n-grams and variant-tokens, where each unique combination includes; a token corresponding to each n-gram that does exist in the first set of object names; and a variant-token corresponding to each n-gram that does not exist in the first set of object names for the n-gram; calculate, by a phrase confidence scoring component, a confidence score for each unique combination based at least in part on the search query and whether the unique combination exists in the first set of object names; identify objects matching each unique combination having a confidence score greater than a threshold confidence score; and send, to the client device of the user, a search-results page responsive to the search query, the search-results page comprising one or more references to one or more of the identified objects, respectively.
-
-
20. A system comprising:
- one or more processors; and
a memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to;receive, from a client device of a user of an online social network, a search query comprising one or more n-grams; determine, by a misspelled classifier component, for each n-gram, if a first bloom filter indicates the n-gram does exist or does not exist in a first set of object names associated with a first vertical; identify, by a variant-tokens generation component, for each n-gram that does not exist in the first set of object names, one or more variant-tokens based at least on the first bloom filter and the first set of object names; generate, by a phrase selection component, one or more unique combinations of the n-grams and variant-tokens, where each unique combination includes; a token corresponding to each n-gram that does exist in the first set of object names; and a variant-token corresponding to each n-gram that does not exist in the first set of object names for the n-gram; calculate, by a phrase confidence scoring component, a confidence score for each unique combination based at least in part on the search query and whether the unique combination exists in the first set of object names; identify objects matching each unique combination having a confidence score greater than a threshold confidence score; and send, to the client device of the user, a search-results page responsive to the search query, the search-results page comprising one or more references to one or more of the identified objects, respectively.
- one or more processors; and
Specification