Generating distributed word embeddings using structured information
Abstract
A computer program that generates a vector representation of a set of natural language text in a natural language processing system by: (i) receiving a first set of natural language text and a set of information pertaining to the first set of natural language text, where the information includes a dependency parse tree including a root node and a plurality of nodes that depend from the root node, where the root node represents the first set of natural language text, and where the plurality of nodes that depend from the root node represent context features of the first set of natural language text; and (ii) generating, by the natural language processing system, a first vector representation of the first set of natural language text, wherein the generating includes adding vector representations for the context features represented by the plurality of nodes that depend from the root node.
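The generating step described in the abstract can be illustrated with a short sketch. It is not the patented implementation: the Node class, the embed function, the toy feature vectors, and the choice to treat every descendant of the root (not only its direct children) as a context feature are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Node:
    """Hypothetical parse-tree node; label names the text or context feature."""
    label: str
    children: List["Node"] = field(default_factory=list)


def dependents(root: Node) -> Iterator[Node]:
    """Yield every node below the root, excluding the root itself."""
    stack = list(root.children)
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children)


def embed(root: Node, feature_vectors: Dict[str, List[float]], dim: int) -> List[float]:
    """Represent the root's text as the sum of its context-feature vectors."""
    total = [0.0] * dim
    for node in dependents(root):
        vec = feature_vectors.get(node.label)
        if vec is None:
            continue  # assumption: features without a known vector are skipped
        total = [a + b for a, b in zip(total, vec)]
    return total


# Toy data: the root stands for the text "bank"; its dependents are context features.
tree = Node("bank", [Node("river"), Node("water")])
vectors = {"river": [1.0, 0.0, 2.0], "water": [0.5, 1.0, 0.0]}
bank_vector = embed(tree, vectors, dim=3)
```

With these toy values the result is the element-wise sum of the "river" and "water" vectors.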
20 Claims
1. A method for generating a vector representation of a set of natural language text in a natural language processing system, the method comprising:
- receiving, by the natural language processing system, a first set of natural language text and a set of information pertaining to the first set of natural language text, where the information includes a dependency parse tree including a root node and a plurality of nodes that depend from the root node, where the root node represents the first set of natural language text, and where the plurality of nodes that depend from the root node represent context features of the first set of natural language text;
- generating, by the natural language processing system, a first vector representation of the first set of natural language text, wherein the generating includes adding vector representations for the context features represented by the plurality of nodes that depend from the root node; and
- comparing, by the natural language processing system, the generated first vector representation to a second vector representation to determine, in the natural language processing system, an amount of similarity between the first set of natural language text and a second set of natural language text represented by the second vector representation.
Dependent claims: 2, 3, 4, 5, 6, 7
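The comparing step of claim 1 determines "an amount of similarity" between two vector representations but does not name a measure. The sketch below uses cosine similarity purely as an assumed example; the function name cosine_similarity and the convention of returning 0.0 for a zero vector are illustrative choices, not taken from the claims.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v (an assumed similarity measure)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


# Toy vectors standing in for the first and second vector representations.
first = [3.0, 4.0]
second = [3.0, 4.0]
orthogonal = [-4.0, 3.0]
same_score = cosine_similarity(first, second)
orthogonal_score = cosine_similarity(first, orthogonal)
```

Scores close to 1 indicate vectors pointing in nearly the same direction, and a score of 0 indicates orthogonal vectors. Other measures, such as Euclidean distance, would fit the claim language equally well.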
8. A computer program product for generating a vector representation of a set of natural language text in a natural language processing system, the computer program product comprising a computer readable storage medium having stored thereon:
- program instructions to receive, by the natural language processing system, a first set of natural language text and a set of information pertaining to the first set of natural language text, where the information includes a dependency parse tree including a root node and a plurality of nodes that depend from the root node, where the root node represents the first set of natural language text, and where the plurality of nodes that depend from the root node represent context features of the first set of natural language text;
- program instructions to generate, by the natural language processing system, a first vector representation of the first set of natural language text, wherein the generating includes adding vector representations for the context features represented by the plurality of nodes that depend from the root node; and
- program instructions to compare, by the natural language processing system, the generated first vector representation to a second vector representation to determine, in the natural language processing system, an amount of similarity between the first set of natural language text and a second set of natural language text represented by the second vector representation.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer system for generating a vector representation of a set of natural language text in a natural language processing system, the computer system comprising:
- a processor(s) set; and
- a computer readable storage medium;
wherein:
- the processor set is structured, located, connected and/or programmed to run program instructions stored on the computer readable storage medium; and
- the program instructions include:
  - program instructions to receive, by the natural language processing system, a first set of natural language text and a set of information pertaining to the first set of natural language text, where the information includes a dependency parse tree including a root node and a plurality of nodes that depend from the root node, where the root node represents the first set of natural language text, and where the plurality of nodes that depend from the root node represent context features of the first set of natural language text;
  - program instructions to generate, by the natural language processing system, a first vector representation of the first set of natural language text, wherein the generating includes adding vector representations for the context features represented by the plurality of nodes that depend from the root node; and
  - program instructions to compare, by the natural language processing system, the generated first vector representation to a second vector representation to determine, in the natural language processing system, an amount of similarity between the first set of natural language text and a second set of natural language text represented by the second vector representation.
Dependent claims: 16, 17, 18, 19, 20
Specification