Tensor-based deep relevance model for search on online social networks
First Claim
1. A method comprising, by a computing system:
- receiving, from a client system, a search query comprising a plurality of query terms;
generating a query match-matrix for the search query, wherein a first dimension of the query match-matrix corresponds to the query terms in the search query and a second dimension of the query match-matrix corresponds to n-dimensional embeddings representing the query terms in the search query, respectively, in an n-dimensional embedding space;
identifying a plurality of objects matching the search query;
retrieving, for each identified object, an object match-matrix for the identified object, wherein a first dimension of the object match-matrix corresponds to terms appearing in a text content of the object and a second dimension of the object match-matrix corresponds to n-dimensional embeddings representing the terms in the text content of the object, respectively, in the n-dimensional embedding space;
constructing, for each identified object, a three-dimensional tensor for the identified object by taking an element-wise product of the query match-matrix for the search query and the object match-matrix for the identified object, wherein a first dimension of the tensor corresponds to the query terms in the search query, a second dimension of the tensor corresponds to terms appearing in the text content of the object, and a third dimension of the tensor corresponds to the predetermined number of match channels, wherein each match channel calculates a weighted match similarity between the query and the object text, wherein the weighting for each channel is based on state-specific signals of the query and object text;
computing, for each identified object, a relevance score based on the tensor for the identified object, wherein the relevance score represents a degree of relevance between the search query and the object;
ranking the identified objects based on their respective relevance scores; and
sending, to the first client system in response to the search query, instructions for generating a search-results interface for presentation to the first user, the search-results interface comprising references to one or more of the identified objects presented in ranked order.
4 Assignments
0 Petitions
Accused Products
Abstract
In one embodiment, a method includes receiving, from a client system associated with a user, a search query comprising a number of query terms, generating a query match-matrix for the search query, identifying a number of objects matching the search query, retrieving, for each identified object, an object match-matrix for the identified object, constructing, for each identified object, a three-dimensional tensor for the identified object, computing, for each identified object, a relevance score based on the tensor for the identified object, ranking the identified objects based on their respective relevance scores, and sending, to the first client system in response to the search query, instructions for generating a search-results interface for presentation to the user.
198 Citations
20 Claims
-
1. A method comprising, by a computing system:
-
receiving, from a client system, a search query comprising a plurality of query terms; generating a query match-matrix for the search query, wherein a first dimension of the query match-matrix corresponds to the query terms in the search query and a second dimension of the query match-matrix corresponds to n-dimensional embeddings representing the query terms in the search query, respectively, in an n-dimensional embedding space; identifying a plurality of objects matching the search query; retrieving, for each identified object, an object match-matrix for the identified object, wherein a first dimension of the object match-matrix corresponds to terms appearing in a text content of the object and a second dimension of the object match-matrix corresponds to n-dimensional embeddings representing the terms in the text content of the object, respectively, in the n-dimensional embedding space; constructing, for each identified object, a three-dimensional tensor for the identified object by taking an element-wise product of the query match-matrix for the search query and the object match-matrix for the identified object, wherein a first dimension of the tensor corresponds to the query terms in the search query, a second dimension of the tensor corresponds to terms appearing in the text content of the object, and a third dimension of the tensor corresponds to the predetermined number of match channels, wherein each match channel calculates a weighted match similarity between the query and the object text, wherein the weighting for each channel is based on state-specific signals of the query and object text; computing, for each identified object, a relevance score based on the tensor for the identified object, wherein the relevance score represents a degree of relevance between the search query and the object; ranking the identified objects based on their respective relevance scores; and sending, to the first client system in response to the search query, instructions for generating a search-results interface for presentation to the first user, the search-results interface comprising references to one or more of the identified objects presented in ranked order.
-
-
2. The method of claim 1, wherein the generating the query match-matrix for the search query comprises:
-
generating a plurality of term-embeddings associated with the plurality of query terms, respectively, based on a prepared word-embedding table, wherein each of the term-embeddings corresponds to a point in a d-dimensional embedding space; and producing a query match-matrix for the search query by encoding the generated term-embeddings with a neural network, wherein the query match-matrix represents contextual meanings of the terms in the query, respectively, based on neighboring words and words located far behind or far ahead of the terms.
-
-
3. The method of claim 2, further comprises adjusting a size of the second dimension of the query match-matrix by performing a linear projection of the query match-matrix.
-
4. The method of claim 2, wherein the neural network is a bi-directional Long Short-Term Memory (LSTM) network comprising a series of states connected in forward and backward directions, wherein each state takes a term embedding for a respective term in the search query as an input and produces an encoded term embedding reflecting the contextual meaning of the corresponding term in the search query as an output by processing input term embedding and signals from both neighboring states.
-
5. The method of claim 2, wherein the prepared word-embedding table is created using a word-embedding model based on text contents of a plurality of objects created during a predetermined period of time.
-
6. The method of claim 2, wherein the prepared word-embedding table comprises unigrams and a plurality of selected bigrams.
-
7. The method of claim 2, wherein the word embedding model is word2vec model.
-
8. The method of claim 1, wherein identifying the plurality of objects comprises identifying objects containing text in their respective text content that matches one or more of the query terms.
-
9. The method of claim 1, wherein identifying the plurality of objects comprises:
-
identifying a set of candidate objects stored in one or more data stores; retrieving, for each candidate object, an object match-matrix associated with the candidate object; computing, for each candidate object, a similarity score representing a degree of similarity between the retrieved object match-matrix for the candidate object and the query match-matrix for the search query by comparing the object match-matrix and the query match-matrix; and identifying objects that have the similarity score higher than a threshold.
-
-
10. The method of claim 9, further comprising:
-
receiving a request to post a first object to the computing system; constructing an object match-matrix for the first object; and storing the object match-matrix in the one or more data stores, wherein the object with a link to the object match-matrix is stored in the one or more data stores.
-
-
11. The method of claim 10, wherein constructing an object match-matrix for the first object comprises:
-
generating a plurality of term-embeddings associated with a plurality of terms in the text content of the first object, respectively, based on a prepared word-embedding table, wherein each of the term-embeddings corresponds to a point in a d-dimensional embedding space; and producing the object match-matrix for the first object by encoding the generated term-embeddings with a neural network, wherein the object match-matrix represents contextual meanings of the terms in the text content of the first object, respectively, based on neighboring words and words located far behind or far ahead of the terms.
-
-
12. The method of claim 1, further comprising:
appending, to each tensor, an exact-match channel, wherein an entry at position (i,j) of the exact-match channel is set to a non-zero value if an i-th term in the search query is an exact match to a j-th term in the text of the object and set to a zero value otherwise.
-
13. The method of claim 12, wherein the non-zero value is determined through a backpropagation process.
-
14. The method of claim 1, wherein computing the relevance score for each identified object based on the tensor for the identified object comprises:
-
generating a first three-dimensional matrix by performing a first series of convolutions on the tensor with one or more sets of first-convolution filters; applying a Rectified Linear Unit (ReLU) activation function to the first three-dimensional matrix; generating a second three-dimensional matrix by performing a second series of convolutions with a plurality of second-convolution filters on the first three-dimensional matrix; constructing a predetermined size vector by performing a max-pooling procedure on the second three-dimensional matrix; and calculating a relevance score by performing a sigmoid activation on the vector.
-
-
15. The method of claim 1, wherein each of the one or more sets of the first-convolution filters comprises a plurality of n-by-m-by-k first-convolution filters, wherein n is a first dimension size of the filter, the first dimension corresponding to the query terms, m is a second dimension size of the filter, the second dimension corresponding to the terms in the text content of the object, and k is a third dimension size of the filter, the third dimension corresponding to the match channels, wherein k is equal to the number of match channels of the tensor.
-
16. The method of claim 15, wherein a size of the second-convolution filters is 1-by-1-by-k′
- , where k′
is equal to a size of a third dimension of the first three-dimensional matrix, wherein the third dimension of the first three-dimensional matrix corresponds to convolution layers, wherein each convolution layer comprises output of convolutions with a first-convolution filter.
- , where k′
-
17. The method of claim 15, wherein a third dimension of the second three-dimensional matrix corresponds to convolution layers, wherein each convolution layer comprises output of convolutions with a second-convolution filter, wherein constructing the predetermined size vector by performing a max-pooling procedure comprises:
-
choosing, for each convolution layer of the second three-dimensional matrix, a maximum value; and filling a corresponding element of the vector with the chosen value.
-
-
18. The method of claim 15, wherein the sigmoid activation on the vector produces a real-number score between 0 and 1.
-
19. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
-
receive, from a client system, a search query comprising a plurality of query terms; generate a query match-matrix for the search query, wherein a first dimension of the query match-matrix corresponds to the query terms in the search query and a second dimension of the query match-matrix corresponds to n-dimensional embeddings representing the query terms in the search query, respectively, in an n-dimensional embedding space; identify a plurality of objects matching the search query; retrieve, for each identified object, an object match-matrix for the identified object, wherein a first dimension of the object match-matrix corresponds to terms appearing in a text content of the object and a second dimension of the object match-matrix corresponds to n-dimensional embeddings representing the terms in the text content of the object, respectively, in the n-dimensional embedding space; construct, for each identified object, a three-dimensional tensor for the identified object by taking an element-wise product of the query match-matrix for the search query and the object match-matrix for the identified object, wherein a first dimension of the tensor corresponds to the query terms in the search query, a second dimension of the tensor corresponds to terms appearing in the text content of the object, and a third dimension of the tensor corresponds to the predetermined number of match channels, wherein each match channel calculates a weighted match similarity between the query and the object text, wherein the weighting for each channel is based on state-specific signals of the query and object text; compute, for each identified object, a relevance score based on the tensor for the identified object, wherein the relevance score represents a degree of relevance between the search query and the object; rank the identified objects based on their respective relevance scores; and send, to the first client system in response to the search query, instructions for generating a search-results interface for presentation to the first user, the search-results interface comprising references to one or more of the identified objects presented in ranked order.
-
-
20. A system comprising:
- one or more processors; and
a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to;receive, from a client system, a search query comprising a plurality of query terms; generate a query match-matrix for the search query, wherein a first dimension of the query match-matrix corresponds to the query terms in the search query and a second dimension of the query match-matrix corresponds to n-dimensional embeddings representing the query terms in the search query, respectively, in an n-dimensional embedding space; identify a plurality of objects matching the search query; retrieve, for each identified object, an object match-matrix for the identified object, wherein a first dimension of the object match-matrix corresponds to terms appearing in a text content of the object and a second dimension of the object match-matrix corresponds to n-dimensional embeddings representing the terms in the text content of the object, respectively, in the n-dimensional embedding space; construct, for each identified object, a three-dimensional tensor for the identified object by taking an element-wise product of the query match-matrix for the search query and the object match-matrix for the identified object, wherein a first dimension of the tensor corresponds to the query terms in the search query, a second dimension of the tensor corresponds to terms appearing in the text content of the object, and a third dimension of the tensor corresponds to the predetermined number of match channels, wherein each match channel calculates a weighted match similarity between the query and the object text, wherein the weighting for each channel is based on state-specific signals of the query and object text; compute, for each identified object, a relevance score based on the tensor for the identified object, wherein the relevance score represents a degree of relevance between the search query and the object; rank the identified objects based on their respective relevance scores; and send, to the first client system in response to the search query, instructions for generating a search-results interface for presentation to the first user, the search-results interface comprising references to one or more of the identified objects presented in ranked order.
- one or more processors; and
Specification