Iteratively learning coreference embeddings of noun phrases using feature representations that include distributed word representations of the noun phrases
First Claim
1. A computer implemented method useful for modifying a search query issued by a client device, comprising:
identifying, by one or more computer systems, distributed word representations for a plurality of noun phrases, the distributed word representations indicative of syntactic and semantic features of the noun phrases;
determining, by one or more of the computer systems for each of one or more of the noun phrases and based on labeled data, at least one training pair of a referring feature representation and an antecedent feature representation, wherein:
the referring feature representation for the at least one training pair for a given noun phrase of the one or more noun phrases includes the distributed word representation for the given noun phrase, and
the antecedent feature representation for the at least one training pair for the given noun phrase includes the distributed word representation for the given noun phrase augmented by one or more antecedent features, wherein the one or more antecedent features include a parse tree distance for the given noun phrase as a candidate antecedent noun phrase in the labeled data, the parse tree distance being a parse tree based distance between the given noun phrase as the candidate antecedent noun phrase and a corresponding referring noun phrase;
wherein the referring feature representations are m-dimensional space vectors, the antecedent feature representations are n-dimensional space vectors, and wherein the m-dimensional space vectors vary in length from the n-dimensional space vectors;
learning, by one or more of the computer systems, coreference embeddings of the referring and antecedent feature representations of the noun phrases, the learning comprising iteratively embedding the m-dimensional space vectors and the n-dimensional space vectors into a common k-dimensional space;
identifying, by one or more of the computer systems after the learning of the coreference embeddings, a first text segment and a second text segment associated with the first text segment, wherein the second text segment is a search query issued by a client device of a user;
identifying, by one or more of the computer systems in the first text segment, an occurrence of one or more candidate antecedent noun phrases;
identifying, by one or more of the computer systems in the second text segment, an occurrence of the given noun phrase;
determining, by one or more of the computer systems for the given noun phrase, distance measures, in the common k-dimensional space, between the given noun phrase and the one or more candidate antecedent noun phrases based on inner products of the coreference embeddings in the common k-dimensional space;
determining, by one or more of the computer systems, for a candidate noun phrase of the candidate antecedent noun phrases, a score for the candidate noun phrase as an antecedent for the given noun phrase based on the distance measure between the given noun phrase and the candidate noun phrase;
selecting, by one or more of the computer systems, the candidate noun phrase as the antecedent for the given noun phrase based on the determined score;
modifying, by one or more of the computer systems, the search query issued by the client device, wherein modifying the search query comprises replacing the given noun phrase with the selected candidate noun phrase in response to selecting the candidate noun phrase as the antecedent for the given noun phrase; and
providing, by one or more of the computer systems in response to the search query issued by the client device, search results that are responsive to the modified query that replaces the given noun phrase with the selected candidate noun phrase.
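The resolution and query-rewriting steps recited above can be illustrated with a minimal sketch. It assumes that the learned coreference embeddings take the form of two linear maps, here called phi (k x m, for referring feature vectors) and gamma (k x n, for antecedent feature vectors), and that the inner product itself serves as the score, with a larger value meaning a better antecedent. The names embed, resolve, rewrite_query, phi and gamma, and all numeric values, are hypothetical and do not come from the claim.

```python
import re
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]  # k rows; each row has m (or n) entries


def embed(matrix: Matrix, features: Sequence[float]) -> Vector:
    """Map a feature vector into the common k-dimensional space (assumed linear)."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]


def inner(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two vectors in the common space."""
    return sum(a * b for a, b in zip(u, v))


def resolve(
    referring_features: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    phi: Matrix,
    gamma: Matrix,
) -> Tuple[str, float]:
    """Score every candidate antecedent and return the best (phrase, score)."""
    r = embed(phi, referring_features)
    scored = [(phrase, inner(r, embed(gamma, feats))) for phrase, feats in candidates]
    return max(scored, key=lambda item: item[1])


def rewrite_query(query: str, referring_phrase: str, antecedent: str) -> str:
    """Replace the first whole-word occurrence of the referring phrase."""
    pattern = r"\b" + re.escape(referring_phrase) + r"\b"
    return re.sub(pattern, lambda _match: antecedent, query, count=1)


# Hypothetical toy values: m = 2, n = 3, k = 2.
phi = [[1.0, 0.0], [0.0, 1.0]]
# The third antecedent feature is a parse tree distance, penalized here.
gamma = [[1.0, 0.0, -0.1], [0.0, 1.0, -0.1]]

first_segment_candidates = [
    ("Barack Obama", [0.9, 0.1, 2.0]),
    ("the White House", [0.1, 0.9, 1.0]),
]
referring = [1.0, 0.0]  # feature representation of "he" in the query

best_phrase, best_score = resolve(referring, first_segment_candidates, phi, gamma)
modified_query = rewrite_query("how old is he", "he", best_phrase)
```

The sketch keeps the two feature spaces at different lengths (m = 2, n = 3) and compares them only after both are mapped into the common k-dimensional space, which mirrors the structure of the claim. The modified query would then be passed to the search backend in place of the original query.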
Abstract
Methods and apparatus related to determining coreference resolution using distributed word representations. Distributed word representations, indicative of syntactic and semantic features, may be identified for one or more noun phrases. For each of the one or more noun phrases, a referring feature representation and an antecedent feature representation may be determined, where the referring feature representation includes the distributed word representation, and the antecedent feature representation includes the distributed word representation augmented by one or more antecedent features. In some implementations the referring feature representation may be augmented by one or more referring features. Coreference embeddings of the referring and antecedent feature representations of the one or more noun phrases may be learned. Distance measures between two noun phrases may be determined based on the coreference embeddings.
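As a rough illustration of how such embeddings might be learned iteratively, the sketch below builds an antecedent feature representation by appending a parse tree distance to a word representation, then fits two linear maps with stochastic gradient descent on a margin ranking loss. The loss, the linear form of the maps, the hyperparameters, and every name (antecedent_features, score, train, phi, gamma) are assumptions made for illustration; the abstract does not specify them.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Triple = Tuple[Vector, Vector, Vector]  # (referring, true antecedent, wrong antecedent)


def antecedent_features(word_vec: Sequence[float], parse_tree_distance: float) -> Vector:
    """Word representation augmented by one antecedent feature."""
    return list(word_vec) + [parse_tree_distance]


def matvec(matrix: Matrix, x: Sequence[float]) -> Vector:
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def score(phi: Matrix, gamma: Matrix, r: Sequence[float], a: Sequence[float]) -> float:
    """Inner product of the two embeddings in the common k-dimensional space."""
    return dot(matvec(phi, r), matvec(gamma, a))


def train(triples: List[Triple], m: int, n: int, k: int,
          epochs: int = 300, lr: float = 0.05, margin: float = 1.0,
          seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Assumed objective: max(0, margin - score(r, pos) + score(r, neg))."""
    rng = random.Random(seed)
    phi = [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(k)]
    gamma = [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(k)]
    for _ in range(epochs):
        for r, pos, neg in triples:
            if margin - score(phi, gamma, r, pos) + score(phi, gamma, r, neg) <= 0:
                continue  # margin already satisfied, no update
            diff = [p - q for p, q in zip(pos, neg)]
            embedded_r = matvec(phi, r)          # k-dimensional
            embedded_diff = matvec(gamma, diff)  # k-dimensional
            # Gradient descent step on the hinge loss for both maps.
            for i in range(k):
                for j in range(m):
                    phi[i][j] += lr * embedded_diff[i] * r[j]
                for j in range(n):
                    gamma[i][j] += lr * embedded_r[i] * diff[j]
    return phi, gamma


# Hypothetical labeled data with m = 2 and n = 3.
triples = [
    ([1.0, 0.0], antecedent_features([1.0, 0.0], 0.1), antecedent_features([0.0, 1.0], 0.5)),
    ([0.0, 1.0], antecedent_features([0.0, 1.0], 0.1), antecedent_features([1.0, 0.0], 0.5)),
]
phi, gamma = train(triples, m=2, n=3, k=2)
```

After training on this toy data, the labeled antecedent of each triple should score higher than the incorrect candidate. A distance measure between two noun phrases can then be derived from the same inner product.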
18 Claims
9. A system useful for modifying a search query issued by a client device, the system including memory and one or more processors operable to execute instructions stored in the memory, comprising instructions to:
identify distributed word representations for one or more noun phrases, the distributed word representations indicative of syntactic and semantic features of the one or more noun phrases;
determine, for each of the one or more noun phrases and based on labeled data, at least one training pair of a referring feature representation and an antecedent feature representation, wherein:
the referring feature representation for the at least one training pair for a given noun phrase of the one or more noun phrases includes the distributed word representation for the given noun phrase, and
the antecedent feature representation for the at least one training pair for the given noun phrase includes the distributed word representation for the given noun phrase augmented by one or more antecedent features, wherein the one or more antecedent features include a parse tree distance for the given noun phrase as a candidate antecedent noun phrase in the labeled data, the parse tree distance being a parse tree based distance between the given noun phrase as the candidate antecedent noun phrase and a corresponding referring noun phrase;
wherein the referring feature representations are m-dimensional space vectors, the antecedent feature representations are n-dimensional space vectors, and wherein the m-dimensional space vectors vary in length from the n-dimensional space vectors;
learn coreference embeddings of the referring and antecedent feature representations of the one or more noun phrases based on iteratively embedding the m-dimensional space vectors and the n-dimensional space vectors into a common k-dimensional space;
identify, after the learning of the coreference embeddings, a first text segment and a second text segment associated with the first text segment, wherein the second text segment is a search query issued by a client device of a user;
identify, in the first text segment, an occurrence of one or more candidate antecedent noun phrases;
identify, in the second text segment, an occurrence of the given noun phrase;
determine, for the given noun phrase, distance measures, in the common k-dimensional space, between the given noun phrase and the one or more candidate antecedent noun phrases based on inner products of the coreference embeddings in the common k-dimensional space;
determine, for a candidate noun phrase of the candidate antecedent noun phrases, a score for the candidate noun phrase as an antecedent for the given noun phrase based on the distance measure between the given noun phrase and the candidate noun phrase;
select the candidate noun phrase as the antecedent for the given noun phrase based on the determined score;
modify the search query issued by the client device, wherein modifying the search query comprises replacing the given noun phrase with the selected candidate noun phrase in response to selecting the candidate noun phrase as the antecedent for the given noun phrase; and
provide, in response to the search query issued by the client device, search results that are responsive to a modified query that replaces the given noun phrase with the selected candidate noun phrase.
17. A non-transitory computer readable storage medium storing computer instructions executable by a processor, including instructions that are useful for modifying a search query issued by a client device and that are to:
identify distributed word representations for one or more noun phrases, the distributed word representations indicative of syntactic and semantic features of the one or more noun phrases;
determine, for each of the one or more noun phrases and based on labeled data, at least one training pair of a referring feature representation and an antecedent feature representation, wherein:
the referring feature representation for the at least one training pair for a given noun phrase of the one or more noun phrases includes the distributed word representation for the given noun phrase, and
the antecedent feature representation for the at least one training pair for the given noun phrase includes the distributed word representation for the given noun phrase augmented by one or more antecedent features, wherein the one or more antecedent features include a parse tree distance for the given noun phrase as a candidate antecedent noun phrase in the labeled data, the parse tree distance being a parse tree based distance between the given noun phrase as the candidate antecedent noun phrase and a corresponding referring noun phrase;
wherein the referring feature representations are m-dimensional space vectors, the antecedent feature representations are n-dimensional space vectors, and wherein the m-dimensional space vectors vary in length from the n-dimensional space vectors;
learn coreference embeddings of the referring and antecedent feature representations of the one or more noun phrases based on iteratively embedding the m-dimensional space vectors and the n-dimensional space vectors into a common k-dimensional space;
identify, after the learning of the coreference embeddings, a first text segment and a second text segment associated with the first text segment, wherein the second text segment is a search query issued by a client device of a user;
identify, in the first text segment, an occurrence of one or more candidate antecedent noun phrases;
identify, in the second text segment, an occurrence of the given noun phrase;
determine, for the given noun phrase, distance measures, in the common k-dimensional space, between the given noun phrase and the one or more candidate antecedent noun phrases based on inner products of the coreference embeddings in the common k-dimensional space;
determine, for a candidate noun phrase of the candidate antecedent noun phrases, a score for the candidate noun phrase as an antecedent for the given noun phrase based on the distance measure between the given noun phrase and the candidate noun phrase;
select the candidate noun phrase as the antecedent for the given noun phrase based on the determined score;
modify the search query issued by the client device, wherein modifying the search query comprises replacing the given noun phrase with the selected candidate noun phrase in response to selecting the candidate noun phrase as the antecedent for the given noun phrase; and
provide, in response to the search query issued by the client device, search results that are responsive to a modified query that replaces the given noun phrase with the selected candidate noun phrase.
Specification