SEMI-SUPERVISED LEARNING OF WORD EMBEDDINGS
Abstract
Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
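Steps (i) through (iv) can be sketched as a toy pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (supervised_tags, unsupervised_clusters, train_embeddings), the most-frequent-tag tagger, the frequency-based k-means, and the skip-gram-style network are all hypothetical choices standing in for the unspecified "supervised learning method(s)", "unsupervised learning method(s)", and "artificial neural network".

```python
import math
import random
from collections import Counter, defaultdict


def supervised_tags(tokens, labeled_pairs):
    """First metadata (assumed form): most-frequent-tag tagger fit on labeled (word, tag) pairs."""
    counts = defaultdict(Counter)
    for word, tag in labeled_pairs:
        counts[word][tag] += 1
    return [counts[t].most_common(1)[0][0] if t in counts else "UNK" for t in tokens]


def unsupervised_clusters(tokens, k=2, iters=10):
    """Second metadata (assumed form): 1-D k-means over word frequency; no labels are used."""
    freq = Counter(tokens)
    values = sorted(set(freq.values()))
    centers = [float(values[i * (len(values) - 1) // max(k - 1, 1)]) for i in range(k)]
    cluster_of = {}
    for _ in range(iters):
        for word, n in freq.items():
            cluster_of[word] = min(range(k), key=lambda c: abs(n - centers[c]))
        for c in range(k):
            members = [freq[w] for w, a in cluster_of.items() if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return [cluster_of[t] for t in tokens]


def train_embeddings(tokens, tags, clusters, dim=8, window=1, epochs=30, lr=0.1, seed=0):
    """Skip-gram-style network whose input is the word plus both metadata features."""
    rng = random.Random(seed)
    feats = [(f"w={w}", f"t={t}", f"c={c}") for w, t, c in zip(tokens, tags, clusters)]
    feat_ids = {f: i for i, f in enumerate(sorted({f for fs in feats for f in fs}))}
    vocab = sorted(set(tokens))
    word_ids = {w: i for i, w in enumerate(vocab)}
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in feat_ids]
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in vocab]

    def hidden(fs):
        # Hidden layer = sum of the input rows for the word, tag, and cluster features.
        return [sum(w_in[feat_ids[f]][d] for f in fs) for d in range(dim)]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for i, fs in enumerate(feats):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                h = hidden(fs)
                scores = [sum(a * b for a, b in zip(row, h)) for row in w_out]
                top = max(scores)
                exps = [math.exp(s - top) for s in scores]
                z = sum(exps)
                target = word_ids[tokens[j]]
                total -= math.log(exps[target] / z)  # cross-entropy on the context word
                grad_h = [0.0] * dim
                for v, row in enumerate(w_out):
                    g = exps[v] / z - (1.0 if v == target else 0.0)
                    for d in range(dim):
                        grad_h[d] += g * row[d]  # uses the pre-update weight
                        row[d] -= lr * g * h[d]
                for f in fs:
                    row = w_in[feat_ids[f]]
                    for d in range(dim):
                        row[d] -= lr * grad_h[d]
        losses.append(total)
    # One vector per token position, built from the word and its two metadata features.
    return [hidden(fs) for fs in feats], losses


if __name__ == "__main__":
    text = "the cat sat on the mat the dog sat on the rug".split()  # (i) receive text
    first = supervised_tags(text, [("the", "DET"), ("cat", "NOUN"), ("sat", "VERB")])  # (ii)
    second = unsupervised_clusters(text)  # (iii)
    vectors, losses = train_embeddings(text, first, second)  # (iv)
    print(len(vectors), "vectors; loss went from", losses[0], "to", losses[-1])
```

In this sketch the metadata enters the network as extra input features, which is only one way of making the training "based, at least in part, on" the text and both sets of metadata; the abstract does not fix how the metadata is used.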
20 Claims
1. A method comprising:
receiving, by one or more processors, a set of natural language text;
generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s);
generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s);
training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata;
generating, by one or more processors, a set of at least two vector representations for the set of natural language text using the trained artificial neural network, where each vector representation of the set of at least two vector representations pertains to a respective subset of natural language text from the set of natural language text;
generating, by one or more processors, a set of third metadata for the generated set of at least two vector representations, where the third metadata is generated using supervised learning method(s);
generating, by one or more processors, a set of fourth metadata for the set of at least two vector representations, where the fourth metadata is generated using unsupervised learning method(s);
training, by one or more processors, the artificial neural network based, at least in part, on the generated set of at least two vector representations, the generated set of third metadata for the set of at least two vector representations, and the generated set of fourth metadata for the set of at least two vector representations; and
storing, by one or more processors, one or more vector representations generated using the trained artificial neural network for use by a natural language processing system.
Dependent claims: 2, 3, 4, 5, 6, 7
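The second half of claim 1 (vector generation, third and fourth metadata, retraining, and storage) can be sketched in the same spirit. Everything named here is an assumption made for illustration: TinyEncoder is a hypothetical stand-in for the already-trained network (an embedding table with mean pooling), the third metadata comes from a nearest-centroid classifier, the fourth from k-means, and the retraining step simply pulls each vector toward the average of its class centroid and cluster center. The claim itself does not prescribe any of these choices.

```python
import json
import random


def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyEncoder:
    """Hypothetical stand-in for the trained network: embedding table + mean pooling."""

    def __init__(self, vocab, dim=4, seed=0):
        rng = random.Random(seed)
        self.table = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}

    def encode(self, words):
        return mean([self.table[w] for w in words])

    def pull_toward(self, words, target, lr):
        """One gradient step on 0.5 * ||encode(words) - target||^2."""
        v = self.encode(words)
        for w in words:
            row = self.table[w]
            for d in range(len(row)):
                row[d] -= lr * (v[d] - target[d]) / len(words)


def supervised_labels(vectors, labeled):
    """Third metadata (assumed form): nearest-centroid classifier fit on (vector, label) pairs."""
    by_label = {}
    for vec, label in labeled:
        by_label.setdefault(label, []).append(vec)
    centroids = {label: mean(vs) for label, vs in by_label.items()}
    labels = [min(centroids, key=lambda name: sq_dist(v, centroids[name])) for v in vectors]
    return labels, centroids


def kmeans(vectors, k=2, iters=10):
    """Fourth metadata (assumed form): k-means cluster ids; needs at least k vectors."""
    centers = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centers[c] = mean(members)
    return assign, centers


def second_stage(encoder, subsets, labeled_subsets, lr=0.5, epochs=20):
    """Generate vectors, derive third and fourth metadata, then retrain the encoder."""
    vectors = [encoder.encode(s) for s in subsets]  # one vector per text subset
    labeled = [(encoder.encode(s), y) for s, y in labeled_subsets]
    third, class_centroids = supervised_labels(vectors, labeled)
    fourth, cluster_centers = kmeans(vectors)
    for _ in range(epochs):
        for s, y, c in zip(subsets, third, fourth):
            target = mean([class_centroids[y], cluster_centers[c]])
            encoder.pull_toward(s, target, lr)
    return third, fourth


def store(encoder, subsets, path):
    """Persist one vector per subset for a downstream NLP system (JSON is an assumption)."""
    with open(path, "w") as f:
        json.dump({" ".join(s): encoder.encode(s) for s in subsets}, f)
```

A caller would build the encoder over the vocabulary, pass at least two text subsets (for example, sentences) to second_stage together with a few labeled subsets, and then call store with an output path of its choosing.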
8. A computer program product comprising a computer readable storage medium having stored thereon:
program instructions programmed to receive a set of natural language text;
program instructions programmed to generate a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s);
program instructions programmed to generate a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s);
program instructions programmed to train an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata;
program instructions programmed to generate a set of at least two vector representations for the set of natural language text using the trained artificial neural network, where each vector representation of the set of at least two vector representations pertains to a respective subset of natural language text from the set of natural language text;
program instructions programmed to generate a set of third metadata for the generated set of at least two vector representations, where the third metadata is generated using supervised learning method(s);
program instructions programmed to generate a set of fourth metadata for the set of at least two vector representations, where the fourth metadata is generated using unsupervised learning method(s);
program instructions programmed to train the artificial neural network based, at least in part, on the generated set of at least two vector representations, the generated set of third metadata for the set of at least two vector representations, and the generated set of fourth metadata for the set of at least two vector representations; and
program instructions programmed to store one or more vector representations generated using the trained artificial neural network for use by a natural language processing system.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer system comprising:
a processor(s) set; and
a computer readable storage medium;
wherein:
the processor set is structured, located, connected and/or programmed to run program instructions stored on the computer readable storage medium; and
the program instructions include:
program instructions programmed to receive a set of natural language text;
program instructions programmed to generate a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s);
program instructions programmed to generate a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s);
program instructions programmed to train an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata;
program instructions programmed to generate a set of at least two vector representations for the set of natural language text using the trained artificial neural network, where each vector representation of the set of at least two vector representations pertains to a respective subset of natural language text from the set of natural language text;
program instructions programmed to generate a set of third metadata for the generated set of at least two vector representations, where the third metadata is generated using supervised learning method(s);
program instructions programmed to generate a set of fourth metadata for the set of at least two vector representations, where the fourth metadata is generated using unsupervised learning method(s);
program instructions programmed to train the artificial neural network based, at least in part, on the generated set of at least two vector representations, the generated set of third metadata for the set of at least two vector representations, and the generated set of fourth metadata for the set of at least two vector representations; and
program instructions programmed to store one or more vector representations generated using the trained artificial neural network for use by a natural language processing system.
Dependent claims: 16, 17, 18, 19, 20
Specification