Principled Approach to Paraphrasing
Abstract
A principled approach to paraphrasing analyzes the input text and its paraphrases at the atomic linguistic level, rather than analyzing them as a whole set at one time. The approach extracts atomic linguistic elements from the input text and identifies matching atomic paraphrasing elements to form candidate atomic paraphrasing pairs. A variety of atomic transformation types are identified to form the atomic paraphrasing pairs. The candidate atomic paraphrasing pairs are evaluated using feature functions and a probability model. A combination of multiple candidate atomic paraphrasing pairs is scored using a score function that derives its value from the feature functions of the candidate atomic paraphrasing pairs in that combination. A combination with a high score may be used to construct a paraphrasing text.
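As a rough illustration of the flow described in the abstract, the following Python sketch restricts atomic linguistic elements to single words and uses a small lookup table in place of a real paraphrase resource. The table `PARAPHRASE_TABLE`, the scores attached to its entries, the function names, and the rule of keeping the best-scoring pair per word are all assumptions made for illustration; the source does not specify them.

```python
from typing import Dict, List, Tuple

# Hypothetical paraphrase table (an assumption, not from the patent):
# atomic element -> [(atomic paraphrasing element, assumed score)]
PARAPHRASE_TABLE: Dict[str, List[Tuple[str, float]]] = {
    "buy": [("purchase", 0.8), ("acquire", 0.5)],
    "big": [("large", 0.9)],
}

# (position in text, atomic element, paraphrasing element, score)
CandidatePair = Tuple[int, str, str, float]


def select_elements(text: str) -> List[str]:
    """Treat each whitespace-separated word as one atomic linguistic element."""
    return text.split()


def candidate_pairs(elements: List[str]) -> List[CandidatePair]:
    """Pair each element that has a table entry with every matching paraphrasing element."""
    return [
        (i, elem, para, score)
        for i, elem in enumerate(elements)
        for para, score in PARAPHRASE_TABLE.get(elem, [])
    ]


def select_combination(pairs: List[CandidatePair]) -> Dict[int, CandidatePair]:
    """Assumed selection rule: keep the highest-scoring pair for each position."""
    best: Dict[int, CandidatePair] = {}
    for pair in pairs:
        position, score = pair[0], pair[3]
        if position not in best or score > best[position][3]:
            best[position] = pair
    return best


def construct(elements: List[str], combination: Dict[int, CandidatePair]) -> str:
    """Substitute the selected paraphrasing elements back into the word sequence."""
    return " ".join(
        combination[i][2] if i in combination else elem
        for i, elem in enumerate(elements)
    )


def paraphrase(text: str) -> str:
    elements = select_elements(text)
    combination = select_combination(candidate_pairs(elements))
    return construct(elements, combination)
```

With the toy table above, `paraphrase("they buy a big house")` replaces "buy" and "big" and leaves the other words untouched. Phrases, patterns, and lexical dependency trees would need a richer element representation than a list of words.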
20 Claims
1. A method for automatic paraphrasing, the method comprising:
selecting a plurality of atomic linguistic elements from an input text, the plurality of atomic linguistic elements including at least one atomic linguistic element kind selected from a word, a phrase, a pattern and a lexical dependency tree;
identifying a plurality of candidate atomic paraphrasing pairs each having one of the plurality of atomic linguistic elements and an atomic paraphrasing element;
selecting a combination of candidate atomic paraphrasing pairs; and
constructing a paraphrasing text of the input text using the atomic paraphrasing elements in the selected combination of candidate atomic paraphrasing pairs.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for automatic paraphrasing, the method comprising:
selecting a plurality of atomic linguistic elements from an input text, the plurality of atomic linguistic elements including at least one linguistic element kind selected from a word, a phrase, a pattern and a lexical dependency tree;
for each atomic linguistic element, selecting at least one atomic paraphrasing element, wherein the atomic linguistic element relates to the selected at least one atomic paraphrasing element through an atomic transformation to form a candidate atomic paraphrasing pair;
obtaining a probability value of each candidate atomic paraphrasing pair;
computing a composite paraphrasing score of a combination of candidate atomic paraphrasing pairs based on the probability values of the candidate atomic paraphrasing pairs;
selecting the combination of candidate atomic paraphrasing pairs if the respective composite paraphrasing score satisfies a preset condition; and
constructing a paraphrasing text using the atomic paraphrasing elements in the selected combination of candidate atomic paraphrasing pairs.
Dependent claims: 12, 13, 14, 15, 16, 17, 18
19. One or more computer readable media having stored thereupon a plurality of instructions that, when executed by a processor, cause the processor to:
select a plurality of atomic linguistic elements from an input text, the plurality of atomic linguistic elements including at least one atomic linguistic element kind selected from a word, a phrase, a pattern and a lexical dependency tree;
identify a plurality of candidate atomic paraphrasing pairs each having one of the plurality of atomic linguistic elements and an atomic paraphrasing element;
select a combination of candidate atomic paraphrasing pairs; and
construct a paraphrasing text of the input text using the atomic paraphrasing elements in the selected combination of candidate atomic paraphrasing pairs.
Dependent claims: 20
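The scoring and selection steps recited in claim 11 can be sketched as follows. The claim does not state how the probability values are combined or what the preset condition is, so this sketch assumes that the composite score is the sum of log-probabilities (that is, the pairs are treated as independent) and that the preset condition is a minimum-score threshold. The names `CANDIDATES`, `composite_score`, and `select_combinations`, and all probability values, are hypothetical.

```python
import math
from itertools import product
from typing import Dict, List, Tuple

# (atomic element, atomic paraphrasing element, assumed probability value)
Pair = Tuple[str, str, float]

# Hypothetical candidate pairs, grouped by the atomic element they rewrite.
CANDIDATES: Dict[str, List[Pair]] = {
    "buy": [("buy", "purchase", 0.8), ("buy", "acquire", 0.5)],
    "big": [("big", "large", 0.9)],
}


def composite_score(combination: Tuple[Pair, ...]) -> float:
    """Assumed composite score: sum of log-probabilities of the pairs."""
    return sum(math.log(prob) for _, _, prob in combination)


def select_combinations(
    candidates: Dict[str, List[Pair]], threshold: float
) -> List[Tuple[float, List[Pair]]]:
    """Enumerate one pair per element; keep combinations meeting the threshold."""
    kept = []
    for combo in product(*candidates.values()):
        score = composite_score(combo)
        if score >= threshold:  # assumed form of the "preset condition"
            kept.append((score, list(combo)))
    # Highest-scoring combinations first.
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept


# Keep combinations whose joint probability is at least 0.6 (arbitrary choice).
SELECTED = select_combinations(CANDIDATES, threshold=math.log(0.6))
```

Exhaustive enumeration is only practical for a handful of elements, since the number of combinations grows multiplicatively with the candidates per element. A longer input would need a search strategy that the claims leave open.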
Specification