Method for text processing
First Claim
1. A computer-implemented method for text processing, the method being executable at a computing device, the method comprising:
at a training phase:
acquiring one or more source phrases, each of the source phrases comprising a first set of sequential words, each word of the first set of sequential words being a source word;
acquiring one or more target phrases, each of the target phrases being in a same language as the source phrases, each of the target phrases comprising a second set of sequential words being at least partially different from the first set of sequential words of a respective source phrase, each word of the second set of sequential words being a target word;
associating, for a given source phrase, a respective source word feature set with each one of the source words, the respective source word feature set for a given source word comprising:
one or more grammatical features of the given source word; and
a meaning of the given source word;
associating, for a respective target phrase, a respective target word feature set with each one of the target words, the respective target word feature set for a given target word comprising:
one or more grammatical features of the given target word; and
a meaning of the given target word;
analyzing the respective source word feature set of each source word of the given source phrase and the respective target word feature set of each target word of the respective target phrase;
mapping the given source word of the given source phrase to a corresponding target word of the respective target phrase based on a similarity of the source word feature set of the given source word to the target word feature set of the corresponding target word;
based on the mapping, generating one or more phrase transformation rules applicable to the given source phrase to transform the first set of sequential words into the second set of sequential words of the respective target phrase;
storing the one or more source phrases and the associated one or more generated phrase transformation rules in a memory of the computing device;
at an in-use phase:
acquiring a text phrase, the text phrase comprising a third set of sequential words being at least partially different from the first set of sequential words;
retrieving the one or more source phrases from the memory;
performing at least one of a grammatical analysis and a semantic analysis of the text phrase and the one or more stored source phrases, to determine similarity of the text phrase to the one or more stored source phrases;
upon determining that the text phrase has the similarity to the given stored source phrase greater than a threshold, applying the associated one or more phrase transformation rules to the text phrase to generate a transformed text phrase, the transformed text phrase comprising a fourth set of sequential words being at least partially similar to the second set of sequential words of the respective target phrase.
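The training-phase steps of claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the toy lexicon `LEXICON`, the `WordFeatures` type, the equal weighting inside `similarity`, the `min_sim` cutoff, and the `("replace", meaning, word)` rule format are all assumptions chosen only to make the steps concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordFeatures:
    word: str
    grammar: frozenset  # grammatical features, e.g. {"verb"}
    meaning: str        # concept label standing in for the word's meaning


# Hypothetical toy lexicon; a real system would use a linguistic analyzer.
LEXICON = {
    f.word: f
    for f in [
        WordFeatures("buy", frozenset({"verb"}), "PURCHASE"),
        WordFeatures("purchase", frozenset({"verb"}), "PURCHASE"),
        WordFeatures("cheap", frozenset({"adjective"}), "LOW_COST"),
        WordFeatures("inexpensive", frozenset({"adjective"}), "LOW_COST"),
        WordFeatures("phone", frozenset({"noun", "singular"}), "PHONE"),
    ]
}


def similarity(a, b):
    """Score two feature sets: half grammar overlap (Jaccard), half meaning match."""
    union = a.grammar | b.grammar
    grammar_overlap = len(a.grammar & b.grammar) / len(union) if union else 0.0
    meaning_match = 1.0 if a.meaning == b.meaning else 0.0
    return 0.5 * grammar_overlap + 0.5 * meaning_match


def map_words(source, target, min_sim=0.5):
    """Map each source word index to the most similar target word index."""
    mapping = {}
    for i, s in enumerate(source):
        best_j = max(range(len(target)), key=lambda j: similarity(s, target[j]))
        if similarity(s, target[best_j]) >= min_sim:
            mapping[i] = best_j
    return mapping


def generate_rules(source_phrase, target_phrase):
    """Derive replacement rules from the word mapping (assumed rule format)."""
    source = [LEXICON[w] for w in source_phrase.split()]
    target = [LEXICON[w] for w in target_phrase.split()]
    rules = []
    for i, j in map_words(source, target).items():
        if source[i].word != target[j].word:
            # Rule: a word with this meaning is rewritten as the target word.
            rules.append(("replace", source[i].meaning, target[j].word))
    return rules


print(generate_rules("buy cheap phone", "purchase inexpensive phone"))
# [('replace', 'PURCHASE', 'purchase'), ('replace', 'LOW_COST', 'inexpensive')]
```

Keying each rule on the meaning rather than on the literal source word is one possible design choice; it lets a rule fire on a later phrase that uses a different word with the same meaning.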
Abstract
A method for text processing executable at a computing device, comprising: acquiring a source phrase comprised of source words; acquiring a target phrase comprised of target words; associating a respective source word feature set with each one of the source words; associating a respective target word feature set with each one of the target words; analyzing the source word feature sets and the target word feature sets; and, based on the analysis, generating one or more phrase transformation rules for transforming the source phrase into the target phrase. Also disclosed are a server and a non-transitory computer-readable medium storing program instructions for carrying out the method.
14 Claims
7. A method for text processing, the method being executable at a computing device comprising a memory, the memory storing one or more source phrases, and one or more phrase transformation rules associated with each of the one or more source phrases, the one or more phrase transformation rules having been generated based on an analysis of feature sets including (i) a source word feature set associated with each source word comprising a first set of sequential words forming a given source phrase, the source word feature set for a given source word comprising one or more grammatical features of the given source word, and a meaning of the given source word, and (ii) a target word feature set associated with each target word comprising a second set of sequential words forming a respective target phrase, the target word feature set for a given target word comprising one or more grammatical features of the given target word, and a meaning of the given target word, the target phrase being in a same language as the source phrase, the analysis comprising mapping the given source word to a corresponding target word based on a similarity of the source word feature set of the given source word to the target word feature set of the corresponding target word, the method comprising:
acquiring a text phrase, the text phrase comprising a third set of sequential words being at least partially different from the first set of sequential words;
retrieving the one or more source phrases;
performing at least one of a grammatical analysis and a semantic analysis of the text phrase and the one or more stored source phrases to determine similarity of the text phrase to the one or more stored source phrases;
upon determining that the similarity of the text phrase to the given source phrase is greater than a threshold, applying the associated one or more phrase transformation rules to the text phrase to generate a transformed text phrase, the transformed text phrase being at least partially similar to the second set of sequential words of the respective target phrase.
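The in-use steps of claim 7, which start from rules already held in memory, can be sketched in the same hedged way. Everything named here is an assumption for illustration: the `MEANING` table, the `STORED` dictionary of one source phrase with its rules, the use of Jaccard overlap of meanings as the semantic analysis, and the threshold value of 0.4. The claim itself does not prescribe a similarity measure or a threshold.

```python
# Hypothetical word-to-meaning table standing in for semantic analysis.
MEANING = {
    "buy": "PURCHASE",
    "purchase": "PURCHASE",
    "cheap": "LOW_COST",
    "inexpensive": "LOW_COST",
    "phone": "PHONE",
    "laptop": "LAPTOP",
    "red": "RED",
}

# Assumed memory contents: each stored source phrase with its rules,
# expressed as {meaning of word to rewrite: replacement word}.
STORED = {
    "buy cheap phone": {"PURCHASE": "purchase", "LOW_COST": "inexpensive"},
}


def phrase_similarity(text, source):
    """Jaccard overlap of the meanings found in the two phrases."""
    a = {MEANING.get(w, w) for w in text.split()}
    b = {MEANING.get(w, w) for w in source.split()}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def transform(text, threshold=0.4):
    """Apply the rules of the most similar stored phrase if above threshold."""
    best_source = max(STORED, key=lambda s: phrase_similarity(text, s))
    if phrase_similarity(text, best_source) <= threshold:
        return text  # no stored phrase is similar enough; leave text as is
    rules = STORED[best_source]
    return " ".join(rules.get(MEANING.get(w, w), w) for w in text.split())


print(transform("buy cheap laptop"))  # purchase inexpensive laptop
print(transform("red laptop"))        # red laptop
```

In the first call the text phrase differs from the stored source phrase in one word, scores 0.5, and is rewritten; in the second the score is 0.0, so no rule is applied.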
8. A non-transitory computer-readable medium storing program instructions for text processing, the program instructions being executable by a computing device to effect:
at a training phase:
acquisition of one or more source phrases, each of the source phrases comprising a first set of sequential words, each word of the first set of sequential words being a source word;
acquisition of one or more target phrases, each of the target phrases being in a same language as the source phrases, each of the target phrases comprising a second set of sequential words being at least partially different from the first set of sequential words of a respective source phrase, each word of the second set of sequential words being a target word;
association, for a given source phrase, of a respective source word feature set with each one of the source words, the respective source word feature set for a given source word comprising:
one or more grammatical features of the given source word; and
a meaning of the given source word;
association, for a respective target phrase, of a respective target word feature set with each one of the target words, the respective target word feature set for a given target word comprising:
one or more grammatical features of the given target word; and
a meaning of the given target word;
analysis of the respective source word feature set of each source word of the given source phrase and the respective target word feature set of each target word of the respective target phrase;
mapping the given source word of the given source phrase to a corresponding target word of the respective target phrase based on a similarity of the source word feature set of the given source word to the target word feature set of the corresponding target word;
based on the mapping, generation of one or more phrase transformation rules applicable to the given source phrase to transform the first set of sequential words into the second set of sequential words of the respective target phrase;
storing the one or more source phrases and the associated one or more generated phrase transformation rules in a memory of the computing device;
at an in-use phase:
acquisition of a text phrase, the text phrase comprising a third set of sequential words being at least partially different from the first set of sequential words;
retrieving the one or more source phrases from the memory;
performing at least one of a grammatical analysis and a semantic analysis of the text phrase and the one or more stored source phrases to determine similarity of the text phrase to the one or more stored source phrases;
upon determining that the text phrase has the similarity to the given source phrase greater than a threshold, applying the associated one or more phrase transformation rules to the text phrase to generate a transformed text phrase, the transformed text phrase comprising a fourth set of sequential words being at least partially similar to the second set of sequential words of the respective target phrase.
14. A non-transitory computer-readable medium storing one or more source phrases, and one or more phrase transformation rules associated with each of the one or more source phrases, and program instructions, the one or more phrase transformation rules having been generated based on an analysis of feature sets including (i) a source word feature set associated with each source word comprising a first set of sequential words forming a given source phrase, the source word feature set for a given source word comprising one or more grammatical features of the given source word and a meaning of the given source word, and (ii) a target word feature set associated with each target word comprising a second set of sequential words forming a respective target phrase, the target word feature set for a given target word comprising one or more grammatical features of the given target word and a meaning of the given target word, the target phrase being in a same language as the source phrase, the analysis comprising mapping the given source word to a corresponding target word based on a similarity of the source word feature set of the given source word to the target word feature set of the corresponding target word, the program instructions being executable by a computing device to effect:
acquisition of a text phrase, the text phrase comprising a third set of sequential words being at least partially different from the first set of sequential words;
retrieving the one or more source phrases;
performing at least one of a grammatical analysis and a semantic analysis of the text phrase and the one or more stored source phrases to determine similarity of the text phrase to the one or more stored source phrases;
upon determination that the similarity of the text phrase to the given source phrase is greater than a threshold, applying the associated one or more phrase transformation rules to the text phrase to generate a transformed text phrase, the transformed text phrase being at least partially similar to the second set of sequential words of the respective target phrase.
Specification