Method for specifying equivalence of language grammars and automatically translating sentences in one language to sentences in another language in a computer environment
Abstract
A method for specifying equivalence of language grammars and automatically translating sentences in one language to sentences in another language in a computer environment. The method uses a unified grammar specification: a single representation of the grammars of the different languages, in which equivalent production rules of the individual grammars are merged into a single unified production rule. The method can be used to represent the equivalence of computer languages such as a high-level language, assembly language, and machine language, and to translate sentences in any of these languages into another.
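To make the idea of a unified production rule concrete, the following minimal sketch merges per-language grammars into one structure. It is an illustration only, not the representation used in the patent: the function name `unify`, the toy languages `hl` and `asm`, and the assumption that the k-th alternative of a non-terminal is equivalent across all languages are hypothetical choices made here.

```python
# Hypothetical per-language grammars G1..Gn: language -> non-terminal -> alternatives.
# Each alternative is a list of symbols; symbols that are keys are non-terminals.
GRAMMARS = {
    "hl": {
        "Stmt": [["Var", "=", "Num"]],
        "Var": [["x"], ["y"]],
        "Num": [["one"], ["two"]],
    },
    "asm": {
        "Stmt": [["LOAD", "Num", "INTO", "Var"]],
        "Var": [["R1"], ["R2"]],
        "Num": [["#1"], ["#2"]],
    },
}


def unify(grammars):
    """Merge per-language grammars into unified production rules.

    Assumption (not from the source): rules are equivalent when they share
    the same non-terminal and the same position among its alternatives.
    """
    langs = list(grammars)
    reference = grammars[langs[0]]
    unified = {}
    for head, alternatives in reference.items():
        for lang in langs:
            # Every language must define the same number of alternatives.
            if len(grammars[lang].get(head, [])) != len(alternatives):
                raise ValueError(f"grammars disagree on non-terminal {head!r}")
        # One unified rule per alternative: language -> symbol sequence.
        unified[head] = [
            {lang: grammars[lang][head][k] for lang in langs}
            for k in range(len(alternatives))
        ]
    return unified


UG = unify(GRAMMARS)
print(UG["Stmt"][0])
```

Each entry of `UG["Stmt"]` then carries the source-side and target-side symbol lists of one rule together, so a rule matched in one language directly yields its counterpart in any other.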
4 Claims
1. A method of automatic translation of sentences from a source language Ls selected from languages L1 to Ln to a target language Lt selected from languages L1 to Ln, in which steps thereof are implemented by a computer, comprising the steps of:
(i) providing grammars G1 to Gn of all the languages L1 to Ln respectively, in which each grammar is unique to that particular language, and a text ‘S’ in the source language Ls as inputs;
(ii) creating a unified grammar specification UG for the grammars G1 to Gn, in which equivalent grammar production rules of each grammar G1 to Gn are combined into a single unified production rule;
(iii) separating the input text ‘S’ in the source language Ls into a list of tokens using a lexical analyser for the source language Ls;
(iv) setting a current non-terminal symbol to the start symbol of the unified grammar specification UG;
(v) obtaining a set of the grammar production rules from the unified grammar specification UG, which contain the current non-terminal symbol as their target non-terminal;
(vi) for each unified grammar production rule P in the set of the grammar production rules obtained from the previous step (v), taking each symbol one by one from a list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs, determining whether it is a terminal symbol or a non-terminal symbol;
(vii) for each terminal symbol obtained from the previous step, which is equivalent to a corresponding symbol in the list of tokens T of the input text in the source language Ls, considering the next symbol in said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs and for each non-terminal symbol Es obtained from the previous step, repeating step (v) onwards with Es as the current non-terminal symbol;
(viii) if all the symbols in the said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs match with all the symbols in the list of tokens T of the input text in the source language Ls, obtaining a list of symbols t corresponding to the target language grammar Gt from the unified grammar production rule P and for those symbols which do not match, repeating step (vi) onwards for a next unified grammar production rule P defined for the non-terminal symbol ‘E’;
(ix) taking each symbol one by one, from the list of symbols t corresponding to the target grammar Gt and determining whether it is a terminal symbol or a non-terminal symbol;
(x) for each terminal symbol obtained from the previous step outputting the symbol, and considering the next symbol and for each non-terminal obtained from the previous step, obtaining another unified grammar production rule P corresponding to that non-terminal symbol and repeating the previous step with the new unified grammar production rule, till all the symbols in the list of symbols t corresponding to the target language grammar Gt are exhausted.
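Steps (iii) to (x) of claim 1 can be sketched as a recursive-descent matcher on the source side followed by a walk over the target side of the matched rules. This is a minimal illustration under stated assumptions, not the patented implementation: the toy grammar `UG`, the language names `hl` and `asm`, whitespace splitting in place of a real lexical analyser, and the pairing of target non-terminals with source non-terminals of the same name in order of appearance are all hypothetical choices made here.

```python
from dataclasses import dataclass

# Hypothetical unified grammar: non-terminal -> list of unified rules,
# each rule mapping a language name to that language's symbol sequence.
UG = {
    "Stmt": [{"hl": ["Var", "=", "Num"], "asm": ["LOAD", "Num", "INTO", "Var"]}],
    "Var": [{"hl": ["x"], "asm": ["R1"]}, {"hl": ["y"], "asm": ["R2"]}],
    "Num": [{"hl": ["one"], "asm": ["#1"]}, {"hl": ["two"], "asm": ["#2"]}],
}


@dataclass
class Node:
    rule: dict      # the unified rule P that matched
    children: list  # (non-terminal name, Node) pairs in source order


def parse(ug, symbol, tokens, pos, src):
    """Return (Node, next_pos) for the first rule of `symbol` that matches, else None.

    Simplification: a sub-match is committed once found (no re-trying of
    earlier choices), and left-recursive grammars would not terminate.
    """
    for rule in ug[symbol]:                    # steps (v) and (vi)
        cur, children, ok = pos, [], True
        for sym in rule[src]:
            if sym in ug:                      # non-terminal: recurse, step (vii)
                result = parse(ug, sym, tokens, cur, src)
                if result is None:
                    ok = False
                    break
                child, cur = result
                children.append((sym, child))
            elif cur < len(tokens) and tokens[cur] == sym:
                cur += 1                       # terminal equals the current token
            else:
                ok = False                     # mismatch: try the next rule, step (viii)
                break
        if ok:
            return Node(rule, children), cur
    return None


def emit(ug, node, tgt, out):
    """Walk the target-side symbols of the matched rules, steps (ix) and (x)."""
    pending = list(node.children)
    for sym in node.rule[tgt]:
        if sym in ug:
            # Assumed pairing: first unused source child with the same name.
            idx = next(i for i, (name, _) in enumerate(pending) if name == sym)
            emit(ug, pending.pop(idx)[1], tgt, out)
        else:
            out.append(sym)                    # terminal: output it


def translate(ug, start, text, src, tgt):
    tokens = text.split()                      # stand-in lexical analyser, step (iii)
    result = parse(ug, start, tokens, 0, src)  # start symbol, step (iv)
    if result is None or result[1] != len(tokens):
        raise ValueError("input is not derivable from the start symbol")
    out = []
    emit(ug, result[0], tgt, out)
    return " ".join(out)


print(translate(UG, "Stmt", "y = two", "hl", "asm"))  # prints "LOAD #2 INTO R2"
```

Because every unified rule holds the symbol lists of all languages, the same grammar also translates in the opposite direction: `translate(UG, "Stmt", "LOAD #1 INTO R1", "asm", "hl")` yields `x = one`.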
3. An apparatus for automatic translation of sentences from a source language Ls selected from languages L1 to Ln to a target language Lt selected from languages L1 to Ln comprising:
(i) means for providing grammars G1 to Gn of all the languages L1 to Ln respectively, in which each grammar is unique to that particular language, and a text ‘S’ in the source language Ls as inputs;
(ii) means for creating a unified grammar specification UG for the grammars G1 to Gn, in which equivalent grammar production rules of each grammar G1 to Gn are combined into a single unified production rule;
(iii) means for separating the input text ‘S’ in the source language Ls into a list of tokens using a lexical analyser for the source language Ls;
(iv) means for setting a current non-terminal symbol to the start symbol of the unified grammar specification UG;
(v) grammar production rule obtaining means for obtaining a set of the grammar production rules from the unified grammar specification UG, which contain the current non-terminal symbol as their target non-terminal;
(vi) for each unified grammar production rule P in the set of the grammar production rules obtained from the grammar production rule obtaining means, symbol taking means for taking each symbol one by one from a list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs, determining whether it is a terminal symbol or a non-terminal symbol;
(vii) for each terminal symbol obtained from the symbol taking means, which is equivalent to a corresponding symbol in the list of tokens T of the input text in the source language Ls, means for considering the next symbol in said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs and for each non-terminal symbol Es obtained from the symbol taking means, repeating obtaining a set of the grammar production rules from the unified grammar specification UG by the grammar production rule obtaining means, onwards with Es as the current non-terminal symbol;
(viii) if all the symbols in the said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs match with all the symbols in the list of tokens T of the input text in the source language Ls, means for obtaining a list of symbols t corresponding to the target language grammar Gt from the unified grammar production rule P and for those symbols which do not match, repeating taking each symbol one by one from a list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs by the symbol taking means, onwards for a next unified grammar production rule P defined for the non-terminal symbol ‘E’;
(ix) determining means for taking each symbol one by one, from the list of symbols t corresponding to the target grammar Gt and determining whether it is a terminal symbol or a non-terminal symbol;
(x) for each terminal symbol obtained from the determining means, means for outputting the symbol, and considering the next symbol and for each non-terminal obtained from the determining means, means for obtaining another unified grammar production rule P corresponding to that non-terminal symbol and repeating the determining means with the new unified grammar production rule, till all the symbols in the list of symbols t corresponding to the target language grammar Gt are exhausted.
4. A computer readable medium for automatic translation of sentences from a source language Ls selected from languages L1 to Ln to a target language Lt selected from languages L1 to Ln, including program instructions executable by a computer system for:
(i) providing grammars G1 to Gn of all the languages L1 to Ln respectively, in which each grammar is unique to that particular language, and a text ‘S’ in the source language Ls as inputs;
(ii) creating a unified grammar specification UG for the grammars G1 to Gn, in which equivalent grammar production rules of each grammar G1 to Gn are combined into a single unified production rule;
(iii) separating the input text ‘S’ in the source language Ls into a list of tokens using a lexical analyser for the source language Ls;
(iv) setting a current non-terminal symbol to the start symbol of the unified grammar specification UG;
(v) obtaining a set of the grammar production rules from the unified grammar specification UG, which contain the current non-terminal symbol as their target non-terminal;
(vi) for each unified grammar production rule P in the set of the grammar production rules obtained from the previous step (v), taking each symbol one by one from a list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs, determining whether it is a terminal symbol or a non-terminal symbol;
(vii) for each terminal symbol obtained from the previous step, which is equivalent to a corresponding symbol in the list of tokens T of the input text in the source language Ls, considering the next symbol in said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs and for each non-terminal symbol Es obtained from the previous step, repeating step (v) onwards with Es as the current non-terminal symbol;
(viii) if all the symbols in the said list of terminal symbols and/or non-terminal symbols corresponding to the source language grammar Gs match with all the symbols in the list of tokens T of the input text in the source language Ls, obtaining a list of symbols t corresponding to the target language grammar Gt from the unified grammar production rule P and for those symbols which do not match, repeating step (vi) onwards for a next unified grammar production rule P defined for the non-terminal symbol ‘E’;
(ix) taking each symbol one by one, from the list of symbols t corresponding to the target grammar Gt and determining whether it is a terminal symbol or a non-terminal symbol;
(x) for each terminal symbol obtained from the previous step outputting the symbol, and considering the next symbol and for each non-terminal obtained from the previous step, obtaining another unified grammar production rule P corresponding to that non-terminal symbol and repeating the previous step with the new unified grammar production rule, till all the symbols in the list of symbols t corresponding to the target language grammar Gt are exhausted.
Specification