Method, non-transitory computer-readable recording medium storing a program, apparatus, and system for creating similar sentence from original sentences to be translated
First Claim
1. A method of creating similar sentences from an original sentence to be translated, the method comprising:
accepting, by a processor, a first sentence including a first phrase;
extracting, from a first database by the processor, one or more second phrases having the same meaning as the first phrase, the first phrase being part of a plurality of phrases constituting the first sentence, the first database associating phrases and synonyms of the phrases with each other;
calculating, by the processor, an N-gram value according to a context dependence value corresponding to the one or more second phrases, the context dependence value being obtained from a second database, the second database associating phrases and context dependence values, corresponding to the phrases included in the second database, with each other, the context dependence value indicating a degree to which a meaning of a phrase included in the second database depends on context;
extracting, by the processor, one or more contiguous third phrases that include a number of second phrases equivalent to the N-gram value from one or more second sentences obtained by replacing, in the first sentence, the first phrase with the one or more second phrases;
calculating, by the processor, an appearance frequency of the one or more third phrases in a third database, the third database associating phrases and appearance frequencies of the phrases, in the third database, with each other;
determining, by the processor, whether the calculated appearance frequency is larger than or equal to a threshold;
determining, by the processor, if the calculated appearance frequency is determined to be larger than or equal to the threshold, the one or more second sentences as a substitute of the first sentence;
outputting, by the processor, the one or more second sentences as the substitute, to an external device;
generating, by the processor, an updated translation model based on the determination of the substitute; and
perform, by the processor, a machine translation using the updated translation model.
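The steps of claim 1 up to the determination of a substitute can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the three databases are plain in-memory dictionaries with invented entries, each phrase is a single whitespace-delimited token, the rule mapping a context dependence value to an N-gram value (`ngram_value`) is hypothetical, and a candidate is accepted only when every extracted N-gram meets the threshold. The claim itself leaves all of these details open.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the three databases named in the claim.
SYNONYMS: Dict[str, List[str]] = {"talk": ["speak", "chat"]}        # first database
CONTEXT_DEPENDENCE: Dict[str, float] = {"speak": 0.2, "chat": 0.8}  # second database
NGRAM_COUNTS: Dict[Tuple[str, ...], int] = {                        # third database
    ("can", "speak"): 120,
    ("speak", "english"): 95,
    ("i", "can", "chat"): 3,
    ("can", "chat", "english"): 0,
}


def ngram_value(dependence: float) -> int:
    """Assumed rule: a more context-dependent phrase gets a longer N-gram."""
    return 2 if dependence < 0.5 else 3


def windows_containing(tokens: List[str], index: int, n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous n-token window that covers position `index`."""
    first = max(0, index - n + 1)
    last = min(index, len(tokens) - n)
    return [tuple(tokens[s:s + n]) for s in range(first, last + 1)]


def similar_sentences(sentence: str, phrase: str, threshold: int) -> List[str]:
    """Replace `phrase` with each synonym and keep the frequent-enough results."""
    tokens = sentence.lower().split()
    index = tokens.index(phrase)  # raises ValueError if the phrase is absent
    accepted = []
    for synonym in SYNONYMS.get(phrase, []):
        candidate = tokens[:index] + [synonym] + tokens[index + 1:]
        # Unknown synonyms are treated as fully context dependent (assumption).
        n = ngram_value(CONTEXT_DEPENDENCE.get(synonym, 1.0))
        grams = windows_containing(candidate, index, n)
        if grams and all(NGRAM_COUNTS.get(g, 0) >= threshold for g in grams):
            accepted.append(" ".join(candidate))
    return accepted


print(similar_sentences("I can talk English", "talk", threshold=10))
```

With the toy data above, "speak" is checked against bigrams, both of which are frequent, so "i can speak english" is kept. "chat" is checked against trigrams, and the count of 3 for ("i", "can", "chat") falls below the threshold of 10, so that candidate is rejected.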
Abstract
In a method of creating similar sentences from an entered original, one or more second phrases having the same meaning as a first phrase, which is part of the original, are extracted from a first database; an N-gram value is calculated according to a context dependence value, in a second database, corresponding to the one or more second phrases; one or more contiguous third phrases that include a number of second phrases equivalent to the N-gram value are extracted from one or more sentences obtained by replacing, in the original, the first phrase with the one or more second phrases; the appearance frequency of the one or more third phrases in a third database is calculated; and if the calculated appearance frequency is determined to be larger than or equal to a threshold, the one or more sentences are used as similar sentences of the original and are externally output.
16 Citations
11 Claims
9. A non-transitory computer-readable recording medium storing a program that causes a computer to function as an apparatus that creates similar sentences from an original sentence to be translated, wherein the program causes the computer to execute processing to:
accept a first sentence including a first phrase;
extract, from a first database, one or more second phrases having the same meaning as the first phrase, the first phrase being part of a plurality of phrases constituting the first sentence, the first database associating phrases and synonyms of the phrases with each other;
calculate an N-gram value according to a context dependence value corresponding to the one or more second phrases, the context dependence value being obtained from a second database, the second database associating phrases and context dependence values, corresponding to the phrases included in the second database, with each other, the context dependence value indicating a degree to which a meaning of a phrase included in the second database depends on context;
extract one or more contiguous third phrases that include a number of second phrases equivalent to the N-gram value from one or more second sentences obtained by replacing, in the first sentence, the first phrase with the one or more second phrases;
calculate an appearance frequency of the one or more third phrases in a third database, the third database associating phrases and appearance frequencies of the phrases, in the third database, with each other;
determine whether the calculated appearance frequency is larger than or equal to a threshold; and
determine, if the calculated appearance frequency is determined to be larger than or equal to the threshold, the one or more second sentences as a substitute of the first sentence;
output the one or more second sentences as the substitute, to an external device;
generate an updated translation model based on the determination of the substitute; and
perform a machine translation using the updated translation model.
10. An apparatus that creates similar sentences from an original sentence to be translated, the apparatus comprising:
an acceptor that accepts a first sentence including a first phrase;
a second phrase extractor that extracts, from a first database, one or more second phrases having the same meaning as the first phrase, the first phrase being part of a plurality of phrases constituting the first sentence, the first database associating phrases and synonyms of the phrases with each other;
a calculator that calculates an N-gram value according to a context dependence value corresponding to the one or more second phrases, the context dependence value being obtained from a second database, the second database associating phrases and context dependence values, corresponding to the phrases included in the second database, with each other, the context dependence value indicating a degree to which a meaning of a phrase included in the second database depends on context;
a third phrase extractor that extracts one or more contiguous third phrases that include a number of second phrases equivalent to the N-gram value from one or more second sentences obtained by replacing, in the first sentence, the first phrase with the one or more second phrases;
a calculator that calculates an appearance frequency of the one or more third phrases in a third database, the third database associating phrases and appearance frequencies of the phrases, in the third database, with each other;
a determiner that determines whether the calculated appearance frequency is larger than or equal to a threshold;
an outputer that, if the calculated appearance frequency is determined to be larger than or equal to the threshold, uses the one or more second sentences as a substitute of the first sentence, and outputs the one or more second sentences as the substitute, to an external device;
a translation model creator that generates an updated translation model based on the determination of the substitute; and
a machine translator that performs a machine translation using the updated translation model.
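The claims do not state how the translation model creator derives the updated model from the accepted substitutes. One plausible reading, assumed here purely for illustration, is that each accepted similar sentence is paired with the existing translation of its original and added to the training corpus before the model is retrained. The function name `augment_corpus` and the data layout are hypothetical, and the retraining step itself is out of scope for this sketch.

```python
from typing import Dict, List, Tuple


def augment_corpus(
    corpus: List[Tuple[str, str]],
    substitutes: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return the corpus plus one (substitute, target) pair per accepted substitute.

    `corpus` holds (source sentence, target sentence) pairs; `substitutes` maps a
    source sentence to the similar sentences accepted for it (assumed layout).
    """
    augmented = list(corpus)
    seen = set(corpus)
    for source, target in corpus:
        for alternative in substitutes.get(source, []):
            pair = (alternative, target)
            if pair not in seen:  # avoid adding the same pair twice
                seen.add(pair)
                augmented.append(pair)
    return augmented


corpus = [("i can talk english", "TARGET-1")]
accepted = {"i can talk english": ["i can speak english"]}
print(augment_corpus(corpus, accepted))
```

Keeping the original pairs first and appending new ones leaves the input corpus untouched, so the same accepted substitutes can be applied again without creating duplicate pairs.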
Specification