SYSTEM AND METHOD FOR AUTOMATIC, UNSUPERVISED PARAPHRASE GENERATION USING A NOVEL FRAMEWORK THAT LEARNS SYNTACTIC CONSTRUCT WHILE RETAINING SEMANTIC MEANING
Abstract
A system includes a question answering system executed by a computer, a processor, and a memory coupled to the processor. The memory is encoded with instructions that, when executed, cause the processor to provide a training system for training the question answering system. The training system is configured to receive a plurality of bidirectional disjunctive logical forms, which include two directional disjunctions of differences between a first logical form of a first sentence and a second logical form of a second sentence; realize the plurality of bidirectional disjunctive logical forms to generate a first plurality of paraphrases of the first and second sentences; score each of the first plurality of paraphrases based on textual similarity between the first plurality of paraphrases and the first and second sentences; and prune the first plurality of paraphrases to generate a second plurality of paraphrases based on the scores of each of the first plurality of paraphrases.
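As an illustration only, the idea of "two directional disjunctions of differences" can be sketched with a toy data structure. The sketch assumes, purely for the example, that a logical form is a flat mapping from role names to fillers; the role names, the sample sentences, and the helper names `directional_disjunction`, `bidirectional_disjunctions`, and `expand` are hypothetical and are not taken from the patent, whose logical forms and surface realizer are likely far richer.

```python
from itertools import product

def directional_disjunction(base, other):
    """Keep the roles of `base`; where `other` fills the same role
    differently, offer both fillers as alternatives (a disjunction)."""
    dlf = {}
    for role, value in base.items():
        alternatives = [value]
        if role in other and other[role] != value:
            alternatives.append(other[role])
        dlf[role] = alternatives
    return dlf

def bidirectional_disjunctions(lf_a, lf_b):
    """One disjunctive form per direction: A toward B, and B toward A."""
    return [directional_disjunction(lf_a, lf_b),
            directional_disjunction(lf_b, lf_a)]

def expand(dlf):
    """Enumerate every concrete logical form a disjunctive form allows.
    A real surface realizer would turn each of these into a sentence."""
    roles = list(dlf)
    return [dict(zip(roles, choice))
            for choice in product(*(dlf[r] for r in roles))]

# Hypothetical toy logical forms for two sentences with the same meaning.
lf_first = {"pred": "buy", "agent": "Mary", "theme": "car"}
lf_second = {"pred": "purchase", "agent": "Mary",
             "theme": "automobile", "time": "yesterday"}

forms = bidirectional_disjunctions(lf_first, lf_second)
for form in forms:
    print(form, "->", len(expand(form)), "candidate forms")
```

In this toy setting the two directions differ because each keeps the role inventory of its own base sentence, so only the second form carries the `time` role.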
17 Citations
20 Claims
1. A method, comprising:
receiving, by a surface realizer of a paraphrase generation system, a plurality of bidirectional disjunctive logical forms, wherein the plurality of bidirectional disjunctive logical forms includes two directional disjunctions of differences between a first logical form of a first sentence and a second logical form of a second sentence;
realizing, by the surface realizer, the plurality of bidirectional disjunctive logical forms to generate first paraphrases of the first and second sentences;
determining a first score for a third paraphrase of the first paraphrases;
determining a second score for a fourth paraphrase of the first paraphrases, wherein the first score is higher than the second score based in part on a first syntactic variation between the third paraphrase and the first sentence and the second sentence being greater than a second syntactic variation between the fourth paraphrase and the first sentence and the second sentence; and
pruning, by the paraphrase generation system, the first paraphrases to generate second paraphrases, wherein the first paraphrases are pruned based on the first score and the second score such that the third paraphrase is included in the second paraphrases and the fourth paraphrase is not included in the second paraphrases based on the first score being higher than the second score.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system, comprising:
a processor; and
a memory coupled to the processor, the memory encoded with instructions that when executed by the processor cause the processor to:
receive a plurality of bidirectional disjunctive logical forms, wherein the plurality of bidirectional disjunctive logical forms includes two directional disjunctions of differences between a first logical form of a first sentence and a second logical form of a second sentence;
realize the plurality of bidirectional disjunctive logical forms to generate first paraphrases of the first and second sentences;
determine a first score for a third paraphrase of the first paraphrases;
determine a second score for a fourth paraphrase of the first paraphrases, wherein the first score is higher than the second score based in part on a first syntactic variation between the third paraphrase and the first sentence and the second sentence being greater than a second syntactic variation between the fourth paraphrase and the first sentence and the second sentence; and
prune the first paraphrases to generate second paraphrases, wherein the first paraphrases are pruned based on the first score and the second score such that the third paraphrase is included in the second paraphrases and the fourth paraphrase is not included in the second paraphrases based on the first score being higher than the second score.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for generating paraphrases, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to:
receive a plurality of bidirectional disjunctive logical forms, wherein the plurality of bidirectional disjunctive logical forms includes two directional disjunctions of differences between a first logical form of a first sentence and a second logical form of a second sentence;
realize the plurality of bidirectional disjunctive logical forms to generate a first plurality of paraphrases of the first and second sentences;
determine a first score for a third paraphrase of the first paraphrases;
determine a second score for a fourth paraphrase of the first paraphrases, wherein the first score is higher than the second score based in part on a first syntactic variation between the third paraphrase and the first sentence and the second sentence being greater than a second syntactic variation between the fourth paraphrase and the first sentence and the second sentence; and
prune the first paraphrases to generate second paraphrases, wherein the first paraphrases are pruned based on the first score and the second score such that the third paraphrase is included in the second paraphrases and the fourth paraphrase is not included in the second paraphrases based on the first score being higher than the second score.
Dependent claims: 16, 17, 18, 19, 20
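The scoring and pruning steps recited in the independent claims can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the claims do not say how syntactic variation or textual similarity is measured, so the sketch uses bag-of-words Jaccard overlap as a stand-in for retained meaning and one minus a `difflib.SequenceMatcher` ratio over token sequences as a stand-in for syntactic variation. The function names, the equal weighting, and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher

def tokens(sentence):
    """Lowercase, drop trailing punctuation, split on whitespace."""
    return sentence.lower().rstrip(".?!").split()

def lexical_overlap(candidate, source):
    """Jaccard overlap of word sets: a crude proxy for retained meaning."""
    a, b = set(tokens(candidate)), set(tokens(source))
    return len(a & b) / len(a | b) if a | b else 0.0

def syntactic_variation(candidate, source):
    """1 - sequence similarity: a crude proxy for word-order variation."""
    matcher = SequenceMatcher(None, tokens(candidate), tokens(source))
    return 1.0 - matcher.ratio()

def score(candidate, sources, weight=0.5):
    """Average both proxies over the source sentences and blend them.
    The 0.5 weight is an arbitrary choice for this sketch."""
    overlap = sum(lexical_overlap(candidate, s) for s in sources) / len(sources)
    variation = sum(syntactic_variation(candidate, s) for s in sources) / len(sources)
    return weight * overlap + (1.0 - weight) * variation

def prune(candidates, sources, keep=1):
    """Keep the `keep` highest-scoring paraphrases, best first."""
    ranked = sorted(candidates, key=lambda c: score(c, sources), reverse=True)
    return ranked[:keep]

# Hypothetical first and second sentences, and two generated paraphrases.
sources = ["Mary bought the car yesterday",
           "Yesterday Mary purchased the automobile"]
third = "The car was bought by Mary yesterday"   # passive restructuring
fourth = "Mary bought the car yesterday"         # copies the first sentence

second_paraphrases = prune([third, fourth], sources, keep=1)
print(second_paraphrases)
```

With these toy inputs the passive paraphrase varies more from both source sentences than the verbatim copy does, so it receives the higher score and survives pruning, while the copy is dropped.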
Specification