LANGUAGE INDEPENDENT REPRESENTATIONS
Abstract
Snippets can be represented in a language-independent semantic manner. Each portion of a snippet can be represented by a combination of a semantic representation and a syntactic representation, each in its own dimensional space. A snippet can be divided into portions by constructing a dependency structure based on relationships between words and phrases. Leaf nodes of the dependency structure can be assigned: A) a semantic representation according to pre-defined word mappings and B) a syntactic representation according to the grammatical use of the word. A trained semantic model can assign to each non-leaf node of the dependency structure a semantic representation based on a combination of the semantic and syntactic representations of the corresponding lower-level nodes. A trained syntactic model can assign to each non-leaf node a syntactic representation based on a combination of the syntactic representations of the corresponding lower-level nodes and the semantic representation of that node.
20 Claims

1. A method for computing a language independent representation of a snippet comprising:
receiving the snippet;
building a dependency structure based on the received snippet, the dependency structure comprising multiple nodes;
obtaining semantic and syntactic representations of leaf nodes of the dependency structure; and
generating a semantic representation corresponding to a selected non-leaf node of the dependency structure by applying a semantic model to semantic and syntactic representations of parent nodes of the selected non-leaf node.

2. The method of claim 1 wherein:
the syntactic representations are syntactic vectors; and
the semantic representations are semantic vectors.

3. The method of claim 2 wherein the semantic model comprises:
a tensor function that generates a tensor based on two syntactic vectors;
a first matrix function that generates a first matrix based on two syntactic vectors;
a second matrix function that generates a second matrix based on two syntactic vectors; and
an offset function that generates an offset vector based on two syntactic vectors.

4. The method of claim 3 wherein generating the semantic representation corresponding to the selected non-leaf node of the dependency structure comprises:
obtaining a first syntactic vector corresponding to a first parent node of the selected non-leaf node;
obtaining a second syntactic vector corresponding to a second parent node of the selected non-leaf node;
generating the tensor by applying the tensor function to the first syntactic vector and the second syntactic vector;
generating the first matrix by applying the first matrix function to the first syntactic vector and the second syntactic vector;
generating the second matrix by applying the second matrix function to the first syntactic vector and the second syntactic vector; and
generating the offset vector by applying the offset function to the first syntactic vector and the second syntactic vector.

5. The method of claim 4 wherein generating the semantic representation corresponding to the selected non-leaf node of the dependency structure comprises:
obtaining a first semantic vector corresponding to a first parent node of the selected non-leaf node;
obtaining a second semantic vector corresponding to a second parent node of the selected non-leaf node;
computing a first result by multiplying together: the tensor, the first semantic vector, and the second semantic vector;
computing a second result by multiplying together: the first matrix with the first semantic vector;
computing a third result by multiplying together: the second matrix with the second semantic vector; and
computing the semantic representation corresponding to a selected non-leaf node as the sum of: the first result, the second result, the third result, and the offset vector.
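
Claims 3 through 5 can be read as a bilinear composition step. The sketch below is one hedged interpretation in plain Python, not the patented implementation. The name `compose_semantic`, the tensor index order `tensor[k][i][j]`, and the toy model functions in the example are assumptions introduced for illustration; the claims do not state dimensions or the internal form of the four functions.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Matrix = List[List[float]]
Tensor = List[List[List[float]]]  # assumed index order: tensor[k][i][j]
ModelFn = Callable[[Sequence[float], Sequence[float]], object]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def bilinear(t: Tensor, u: Sequence[float], v: Sequence[float]) -> Vector:
    """Contract tensor[k][i][j] with u[i] and v[j]; one output entry per k."""
    return [
        sum(u[i] * slice_k[i][j] * v[j]
            for i in range(len(u)) for j in range(len(v)))
        for slice_k in t
    ]


def compose_semantic(tensor_fn: ModelFn, matrix1_fn: ModelFn,
                     matrix2_fn: ModelFn, offset_fn: ModelFn,
                     sem1: Vector, syn1: Vector,
                     sem2: Vector, syn2: Vector) -> Vector:
    # Claim 4: each parameter is generated from the two syntactic vectors.
    tensor = tensor_fn(syn1, syn2)
    m1 = matrix1_fn(syn1, syn2)
    m2 = matrix2_fn(syn1, syn2)
    offset = offset_fn(syn1, syn2)
    # Claim 5: three products, then their sum with the offset vector.
    first = bilinear(tensor, sem1, sem2)
    second = matvec(m1, sem1)
    third = matvec(m2, sem2)
    return [a + b + c + d for a, b, c, d in zip(first, second, third, offset)]


# Toy model functions (hypothetical): identity matrices and an offset
# taken from the first entry of each syntactic vector.
identity = [[1.0, 0.0], [0.0, 1.0]]
result = compose_semantic(
    tensor_fn=lambda y1, y2: [identity, identity],
    matrix1_fn=lambda y1, y2: identity,
    matrix2_fn=lambda y1, y2: identity,
    offset_fn=lambda y1, y2: [y1[0], y2[0]],
    sem1=[1.0, 2.0], syn1=[0.5],
    sem2=[3.0, 4.0], syn2=[0.25],
)
print(result)  # [15.5, 17.25]
```

Passing the four functions as callables mirrors the claim language, in which the tensor, matrices, and offset are outputs of functions of the syntactic vectors rather than fixed parameters.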

6. The method of claim 2 further comprising:
generating a syntactic representation corresponding to the selected non-leaf node of the dependency structure by applying a syntactic model to syntactic representations of the parent nodes of the selected non-leaf node and to the semantic representation corresponding to the selected non-leaf node.

7. The method of claim 6 wherein the syntactic model comprises:
a tensor function that generates a tensor based on two syntactic vectors;
a first matrix function that generates a first matrix based on two syntactic vectors;
a second matrix function that generates a second matrix based on two syntactic vectors;
an offset function that generates an offset vector based on two syntactic vectors; and
a mapping matrix that is a linear mapping from semantic space to syntactic space.

8. The method of claim 7 wherein generating the syntactic representation corresponding to the selected non-leaf node comprises:
obtaining a first syntactic vector corresponding to a first parent node of the selected non-leaf node;
obtaining a second syntactic vector corresponding to a second parent node of the selected non-leaf node;
generating the tensor by applying the tensor function to the first syntactic vector and the second syntactic vector;
generating the first matrix by applying the first matrix function to the first syntactic vector and the second syntactic vector;
generating the second matrix by applying the second matrix function to the first syntactic vector and the second syntactic vector; and
generating the offset vector by applying the offset function to the first syntactic vector and the second syntactic vector.

9. The method of claim 8 wherein generating the syntactic representation corresponding to the selected non-leaf node of the dependency structure comprises:
computing a first result by multiplying together: the first matrix with the first semantic vector;
computing a second result by multiplying together: the second matrix with the second semantic vector;
computing a third result by multiplying together: the mapping matrix with the semantic representation corresponding to the selected non-leaf node; and
computing the syntactic representation corresponding to the selected non-leaf node as the sum of: the first result, the second result, the third result, the tensor, and the offset vector.
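
Written out, claims 7 through 9 describe the sum below. The symbols are assumptions introduced here rather than notation from the claims: $y_1, y_2$ are the syntactic vectors of the two parent nodes, $s_1, s_2$ are their semantic vectors, $s_p$ is the semantic representation of the selected non-leaf node, $W$ is the mapping matrix, and $f_T, f_{M_1}, f_{M_2}, f_b$ are the four functions of claim 7. The claim lists the tensor itself as a summand and does not state shapes, so this transcription simply assumes $T$ is conformable with the other terms.

```latex
\begin{aligned}
T &= f_T(y_1, y_2), \quad
M_1 = f_{M_1}(y_1, y_2), \quad
M_2 = f_{M_2}(y_1, y_2), \quad
b = f_b(y_1, y_2) \\
y_p &= M_1 s_1 + M_2 s_2 + W s_p + T + b
\end{aligned}
```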

10. The method of claim 1 wherein the dependency structure is a binary tree structure.

11. The method of claim 1 wherein the selected non-leaf node of the dependency structure is the root node of the dependency structure, and wherein the method further comprises:
generating, for a composition that includes the semantic representation corresponding to the selected non-leaf node of the dependency structure, a score; and
adjusting parameters of the semantic model based on the score.

12. The method of claim 11 wherein generating the score comprises:
applying, to the semantic representation corresponding to the root node and to a syntactic representation corresponding to the root node, a scoring neural network that is trained to receive a semantic vector and a syntactic vector and generate a score indicating how reliably the semantic vector maps into a language independent space.

13. The method of claim 12 wherein generating the score further comprises:
applying the scoring neural network to multiple nodes of the dependency structure to compute corresponding scores for the multiple nodes of the dependency structure; and
combining, as the score for the composition, the scores for the multiple nodes of the dependency structure.

14. The method of claim 13 wherein combining the scores for the multiple nodes of the dependency structure comprises:
summing the scores for the multiple nodes for the dependency structure; or
multiplying each selected score of the multiple nodes for the dependency structure by (1/2)^depth, wherein the depth is the maximum number of edges between the node corresponding to that selected score and the root node of the dependency structure, and summing the results of the multiplications.
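
The two combination options in claim 14 can be sketched as follows. This is a minimal illustration under assumptions: the function names and the `(score, depth)` pair layout are hypothetical, and the per-node scores are toy values rather than outputs of a trained scoring network.

```python
from typing import Iterable, Tuple


def combine_scores_sum(scores: Iterable[float]) -> float:
    """First option: the plain sum of the per-node scores."""
    return sum(scores)


def combine_scores_depth_weighted(
        scored_nodes: Iterable[Tuple[float, int]]) -> float:
    """Second option: weight each score by (1/2)**depth, then sum.

    Each item is (score, depth), where depth is the number of edges
    between that node and the root (0 for the root itself).
    """
    return sum(score * 0.5 ** depth for score, depth in scored_nodes)


# Toy values: a root, two nodes one edge below it, one node two edges below.
nodes = [(1.0, 0), (0.5, 1), (0.5, 1), (2.0, 2)]
plain_total = combine_scores_sum(score for score, _ in nodes)
weighted_total = combine_scores_depth_weighted(nodes)
print(plain_total, weighted_total)  # 4.0 2.0
```

With the depth weighting, nodes near the root dominate the combined score, while the plain sum treats every node equally.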

15. The method of claim 1 wherein:
at least two of the multiple nodes are connected by edges; and
each node of the dependency structure corresponds to one or more words of the received snippet and each set of edges of the dependency structure between a parent node and one or more child nodes corresponds to a relationship between the child nodes that are used to create the parent node.

16. A computer-readable storage medium storing instructions that, when executed by a computing system, cause the computing system to perform operations for generating a translation of a snippet, the operations comprising:
receiving the snippet with a source language;
receiving an indication of a desired output language to translate the snippet into;
building a dependency structure based on the received snippet, the dependency structure comprising multiple nodes;
obtaining semantic and syntactic representations of leaf nodes of the dependency structure;
generating a semantic representation corresponding to a selected non-leaf node of the dependency structure by applying a semantic model to semantic and syntactic representations of parent nodes of the selected non-leaf node;
generating a syntactic representation corresponding to the selected non-leaf node of the dependency structure by applying a syntactic model to syntactic representations of the parent nodes of the selected non-leaf node and to the semantic representation corresponding to the selected non-leaf node;
generating a semantic representation corresponding to a root node of the dependency structure by applying the semantic model to semantic and syntactic representations of parent nodes of the root node;
mapping the semantic representation corresponding to the root node into a language independent space that has had semantic representations corresponding to snippets in the output language mapped into it;
selecting an identified semantic representation corresponding to a snippet in the output language that was mapped into the language independent space and that has a smallest distance in the language independent space to the semantic representation corresponding to the root node; and
providing, as the translation of the snippet, the snippet in the output language corresponding to the identified semantic representation.

17. The computer-readable storage medium of claim 16 wherein the operations further comprise:
computing that the distance between the identified semantic representation and the semantic representation corresponding to the root node is less than a threshold distance.
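
The selection step of claim 16 and the threshold check of claim 17 amount to a nearest-neighbour lookup. The sketch below is illustrative only. Euclidean distance, the name `nearest_snippet`, the placeholder snippet labels, and returning `None` when the threshold is not met are all assumptions; the claims say only "smallest distance" and "less than a threshold distance".

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def nearest_snippet(
        query: Sequence[float],
        candidates: Dict[str, Sequence[float]],
        threshold: Optional[float] = None,
) -> Optional[Tuple[str, float]]:
    """Return (snippet, distance) for the candidate closest to the query.

    `candidates` maps output-language snippets to their points in the
    shared space. If `threshold` is given and the best distance is not
    below it, return None (an assumed fallback, not from the claims).
    """
    best_text, best_dist = None, math.inf
    for text, vector in candidates.items():
        dist = math.dist(query, vector)  # Euclidean distance (assumed)
        if dist < best_dist:
            best_text, best_dist = text, dist
    if best_text is None:
        return None
    if threshold is not None and not best_dist < threshold:
        return None
    return best_text, best_dist


# Placeholder points standing in for mapped output-language snippets.
space = {"snippet A": [3.0, 4.0], "snippet B": [6.0, 8.0]}
match = nearest_snippet([0.0, 0.0], space)
rejected = nearest_snippet([0.0, 0.0], space, threshold=4.0)
print(match, rejected)  # ('snippet A', 5.0) None
```

A linear scan is enough for a sketch; a large candidate set would more plausibly use an approximate nearest-neighbour index, which the claims neither require nor exclude.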

18. A system for computing a language independent representation of a snippet comprising:
one or more processors;
a memory;
an interface configured to receive the snippet;
a dependency structure builder configured to build a dependency structure based on the received snippet, the dependency structure comprising multiple nodes; and
a semantic and syntactic model applier configured to:
obtain semantic and syntactic representations of leaf nodes of the dependency structure; and
generate a semantic representation corresponding to a selected non-leaf node of the dependency structure by applying a semantic model to semantic and syntactic representations of parent nodes of the selected non-leaf node.

19. The system of claim 18, wherein the multiple nodes of the dependency structure comprise:
one or more leaf nodes each corresponding to one or more words of the snippet;
one or more intermediate nodes based on one or more of: the one or more leaf nodes of the dependency structure for the selected snippet or one or more other intermediate nodes of the dependency structure for the selected snippet; and
a root node based on at least one of the one or more intermediate nodes of the dependency structure for the selected snippet.

20. The system of claim 19, wherein the dependency structure builder is configured to build the dependency structure by:
dividing the snippet into word groups;
creating a leaf node corresponding to each word group; and
until the root node is added corresponding to a word group comprising all words of the selected snippet:
selecting two or more nodes from the dependency structure as combine nodes wherein the combine nodes are nodes that have not been combined with any higher level node and that have a determined relationship; and
creating a new node at a level one level higher than the selected combine node with a highest level, wherein the new node corresponds to a combination of the word groups corresponding to the selected combine nodes, and wherein the new node is connected by edges to the selected combine nodes.
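
The bottom-up construction in claim 20 might look like the sketch below. Several choices are assumptions made for illustration: word groups are single whitespace-separated words, exactly two adjacent nodes are combined at a time (matching the binary tree of claim 10), the "determined relationship" is supplied by a hypothetical `related` callable, and the first pair is used as a fallback when no pair is related so the loop always terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    words: List[str]
    level: int = 0
    children: List["Node"] = field(default_factory=list)


def build_structure(snippet: str,
                    related: Callable[[Node, Node], bool]) -> Node:
    # Assumption: one leaf node per whitespace-separated word.
    open_nodes = [Node([word]) for word in snippet.split()]
    if not open_nodes:
        raise ValueError("snippet has no words")
    # open_nodes holds the nodes not yet combined into a higher-level node.
    while len(open_nodes) > 1:
        # First adjacent related pair, else index 0 (assumed fallback).
        idx = next((i for i in range(len(open_nodes) - 1)
                    if related(open_nodes[i], open_nodes[i + 1])), 0)
        left, right = open_nodes[idx], open_nodes[idx + 1]
        parent = Node(
            words=left.words + right.words,
            # One level above the highest-level combine node.
            level=max(left.level, right.level) + 1,
            children=[left, right],
        )
        open_nodes[idx:idx + 2] = [parent]
    return open_nodes[0]


# Toy relationship test (hypothetical): a node is related to a following
# node whose first word is "cats".
root = build_structure("I love cats", lambda a, b: b.words[0] == "cats")
print(root.words, root.level)  # ['I', 'love', 'cats'] 2
```

Here "love" and "cats" are combined first at level 1, and the root at level 2 joins "I" with that pair, so the root covers all words of the snippet as the claim requires.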
Specification