Method and system to analyze, transfer and generate language expressions using compiled instructions to manipulate linguistic structures
Abstract
A natural language translation system contains language-neutral modules for syntactic analysis, transfer, and morphological and syntactical generation of feature structures for an input expression in a source and a target language. The language-neutral modules are driven by language-specific grammars to translate between the specified languages so that no knowledge about the languages need be incorporated into the modules themselves. The modules interface with the grammar rules in the form of compiled grammar programming language statements that perform the required manipulation of the feature structures. Because the modules are language-neutral, the system is readily adaptable to new languages simply by providing a grammar for the new language. Multiple copies of each module, each interfacing with a different natural language grammar, enable simultaneous translation of multiple languages in the same system.
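As an illustration only, the separation between language-neutral modules and language-specific grammars described in the abstract might be sketched as follows. Everything here is hypothetical: the `CompiledGrammars` bundle, the `translate` driver, the toy English-to-Spanish grammar and the use of plain dictionaries as feature structures are assumptions made for the sketch and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a feature structure is modeled as a plain dictionary.
FeatureStructure = Dict[str, object]


@dataclass
class CompiledGrammars:
    """Hypothetical bundle of language-specific callables."""
    parse: Callable[[List[str]], FeatureStructure]
    transfer: Callable[[FeatureStructure], FeatureStructure]
    generate: Callable[[FeatureStructure], List[str]]


def translate(words: List[str], grammars: CompiledGrammars) -> List[str]:
    """Language-neutral driver: all language knowledge lives in `grammars`."""
    source_fs = grammars.parse(words)         # syntactic analysis
    target_fs = grammars.transfer(source_fs)  # transfer
    return grammars.generate(target_fs)       # syntactic generation


# Toy grammar for adjective + noun phrases (illustrative data only).
TOY_LEXICON = {"red": "rojo", "car": "coche"}


def toy_parse(words: List[str]) -> FeatureStructure:
    return {"cat": "NP", "mod": words[0], "head": words[1]}


def toy_transfer(fs: FeatureStructure) -> FeatureStructure:
    return {"cat": fs["cat"],
            "mod": TOY_LEXICON[fs["mod"]],
            "head": TOY_LEXICON[fs["head"]]}


def toy_generate(fs: FeatureStructure) -> List[str]:
    # The toy target grammar places the noun before the adjective.
    return [fs["head"], fs["mod"]]


TOY = CompiledGrammars(toy_parse, toy_transfer, toy_generate)
```

In this sketch, supporting another language pair would mean supplying a different `CompiledGrammars` value while `translate` stays unchanged, which mirrors the adaptability the abstract describes.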
52 Claims
-
1. A method for translating an expression from a source natural language into a target natural language using compiled grammars generated from grammar rules compiled by a grammar programming language compiler, the method comprising:
-
providing the compiled grammars comprising a compiled parsing grammar, a compiled transfer grammar and a compiled generation grammar; and
manipulating a representation of the expression as directed by the compiled grammars to translate the expression from the source natural language into the target natural language, wherein manipulating the representation comprises:
building a syntax parsing data structure for the expression as directed by the compiled parsing grammar, the syntax parsing data structure containing a source natural language feature structure for the expression;
creating a target natural language feature structure for the expression from the source natural language feature structure for the expression as directed by the compiled transfer grammar; and
building a syntactic generation data structure from the target natural language feature structure as directed by the compiled generation grammar, the syntactic generation data structure representing at least one valid translation of the expression.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
creating nodes in the syntax data structure by applying a set of rules in the compiled parsing grammar as directed by a series of parsing actions, each node associated with a feature structure representing at least a portion of the expression and any feature structure created at a root node of the syntax data structure representing the expression.
-
-
7. The method of claim 6, wherein the parsing actions in the series are selected from a parsing table associated with the compiled parsing grammar for the natural language.
-
8. The method of claim 6, wherein the rules in the set are determined based on the feature structures associated with the nodes.
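The parsing steps recited in claims 6 to 8 can be pictured with a much-simplified shift-reduce sketch. It is not the patented parser: the `Node` class, the toy `LEXICON`, and a `PARSING_TABLE` keyed by the two categories on top of the stack are all assumptions chosen for brevity. Each reduce action builds a feature structure for the new node and can refuse to apply based on the feature structures it is given (here, a number agreement check).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    fs: dict                                   # feature structure for this node
    children: list = field(default_factory=list)


# Hypothetical lexicon mapping words to lexical feature structures.
LEXICON = {
    "the": {"cat": "Det"},
    "dog": {"cat": "N", "num": "sg"},
    "barks": {"cat": "V", "num": "sg"},
    "bark": {"cat": "V", "num": "pl"},
}


def reduce_np(det, noun):
    return {"cat": "NP", "num": noun.fs["num"], "head": noun.fs["word"]}


def reduce_s(np, verb):
    # Rule applicability depends on the feature structures of the nodes.
    if np.fs["num"] != verb.fs["num"]:
        return None
    return {"cat": "S", "subj": np.fs["head"], "pred": verb.fs["word"]}


# Hypothetical parsing table: top-of-stack categories -> reduce action.
PARSING_TABLE = {("Det", "N"): reduce_np, ("NP", "V"): reduce_s}


def parse(words):
    stack = []
    for word in words:
        stack.append(Node({**LEXICON[word], "word": word}))  # shift action
        while len(stack) >= 2:
            rule = PARSING_TABLE.get((stack[-2].fs["cat"], stack[-1].fs["cat"]))
            fs = rule(stack[-2], stack[-1]) if rule else None
            if fs is None:
                break
            right = stack.pop()
            left = stack.pop()
            stack.append(Node(fs, [left, right]))            # reduce action
    # The feature structure at the root represents the whole expression.
    return stack[0] if len(stack) == 1 else None
```

A successful parse returns a single root node whose feature structure covers the whole input; an input that fails the agreement check leaves more than one node on the stack and yields `None`.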
-
9. The method of claim 1, wherein creating the target natural language feature structure comprises:
-
substituting a target natural language feature structure representing the expression for a source natural language feature structure representing the expression when the source natural language feature structure matches a bilingual example feature structure;
applying a set of compiled transfer grammar rules to the source language feature structure to build a transfer generation data structure having a plurality of source language feature sub-structures representing portions of the expression;
substituting a target natural language feature sub-structure representing a portion of the expression for a source natural language feature sub-structure representing the portion when the source natural language feature sub-structure matches a bilingual example feature structure; and
separately applying a specific set of compiled transfer grammar rules to each source natural language feature sub-structure that does not match a bilingual feature structure to build a plurality of transfer generation data structures to create new source natural language feature sub-structures until a target natural language feature sub-structure has been substituted for each source natural language feature sub-structure.
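One way to read the transfer steps of claim 9 is as a recursive procedure: try to match the whole source feature structure against bilingual examples, and if that fails, decompose it with transfer rules and repeat on each sub-structure. The sketch below follows that reading under heavy simplification. The `EXAMPLES` list, the `TRANSFER_RULES` table and exact-equality matching are hypothetical stand-ins, not the patent's matching scheme.

```python
# Hypothetical bilingual examples: (source feature structure, target feature structure).
EXAMPLES = [
    ({"cat": "N", "lex": "dog"}, {"cat": "N", "lex": "perro"}),
    ({"cat": "Adj", "lex": "big"}, {"cat": "Adj", "lex": "grande"}),
]

# Hypothetical transfer rules: category -> names of the sub-structures to transfer.
TRANSFER_RULES = {"NP": ["mod", "head"]}


def match_example(fs):
    """Return the target side of a matching bilingual example, or None."""
    for source, target in EXAMPLES:
        if source == fs:          # assumption: a match is exact equality
            return dict(target)
    return None


def transfer(fs):
    # Step 1: substitute directly when the structure matches an example.
    target = match_example(fs)
    if target is not None:
        return target
    # Step 2: otherwise select rules based on the unmatched structure
    # and transfer each sub-structure separately.
    parts = TRANSFER_RULES.get(fs["cat"])
    if parts is None:
        raise LookupError(f"no example or rule for category {fs['cat']!r}")
    result = {"cat": fs["cat"]}
    for name in parts:
        result[name] = transfer(fs[name])
    return result
```

The recursion stops once every sub-structure has been replaced by a target-language sub-structure, which corresponds to the "until" condition at the end of the claim.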
-
-
10. The method of claim 9, wherein creating the target natural language feature structure further comprises:
determining the specific set of compiled transfer grammar rules to apply based on each source natural language feature sub-structure that does not match a bilingual feature structure.
-
11. The method of claim 9, wherein the transfer generation data structure comprises:
-
a root node associated with a first source language feature sub-structure, the first source language feature sub-structure representing a portion of the expression;
at least one rule node below the root node, the at least one rule node being associated with one rule from the specific set of compiled transfer grammar rules; and
at least one symbol node below the at least one rule node, the at least one symbol node being associated with a source language feature sub-structure for a part of the portion.
-
-
12. The method of claim 1, wherein building the syntactic generation data structure comprises:
-
applying a set of compiled generation grammar rules to build a syntactic generation data structure from a feature structure representing at least one version of the expression; and
choosing a list of feature structures associated with leaf nodes in the syntactic generation data structure to represent the expression.
-
-
13. The method of claim 12, wherein applying the set of compiled generation grammar rules comprises:
-
repeatedly applying a specific set of compiled generation grammar rules to a feature structure associated with a symbol node to create a level of rule nodes below the symbol node, each rule node associated with a rule satisfied by the feature structure; and
repeatedly applying the rule associated with a rule node to the symbol node above the rule node to create a level of symbol nodes below the rule node, each symbol node associated with a feature structure created by applying the rule, until the feature structures associated with the symbol nodes represent words in the expression.
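The alternating levels of symbol nodes and rule nodes recited in claim 13 can be sketched as a small recursive builder. The `SymbolNode` and `RuleNode` classes, the two toy generation rules and the convention that a feature structure with a `lex` key represents a word are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolNode:
    fs: dict
    rules: list = field(default_factory=list)    # rule nodes below this symbol


@dataclass
class RuleNode:
    name: str
    symbols: list = field(default_factory=list)  # symbol nodes below this rule


def has_modifier(fs):
    return fs.get("cat") == "NP" and "mod" in fs


# Hypothetical generation rules: (name, applicability test, expansion).
GENERATION_RULES = [
    ("NP -> N Adj", has_modifier, lambda fs: [fs["head"], fs["mod"]]),
    ("NP -> Adj N", has_modifier, lambda fs: [fs["mod"], fs["head"]]),
]


def build(fs):
    node = SymbolNode(fs)
    if "lex" in fs:               # the feature structure represents a word: stop
        return node
    for name, applies, expand in GENERATION_RULES:
        if applies(fs):           # one rule node per rule the structure satisfies
            rule_node = RuleNode(name)
            rule_node.symbols = [build(child) for child in expand(fs)]
            node.rules.append(rule_node)
    return node
```

Because both toy rules are satisfied by a noun phrase with a modifier, the root symbol node in this sketch gets two rule nodes, each leading to a different word order.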
-
-
14. The method of claim 12, wherein choosing the list of feature structures comprises tracing a traverse path through the data structure until all leaf nodes on the traverse path have been reached.
-
15. The method of claim 12, wherein choosing the list of feature structures results in a single version of the expression.
-
16. The method of claim 12, wherein choosing the list of feature structures results in multiple versions of the expression.
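Claims 14 to 16 describe choosing leaf feature structures by tracing a path through the generation data structure, yielding one or several versions of the expression. A minimal sketch of that selection, assuming a hypothetical dictionary encoding in which each symbol node holds its feature structure and a list of alternative rule expansions, could look like this:

```python
from itertools import product


def versions(symbol):
    """Return every list of leaf feature structures reachable from `symbol`.

    Assumed encoding: {"fs": <dict>, "rules": [[child, ...], ...]},
    where each inner list holds the symbol nodes below one rule node.
    """
    if not symbol["rules"]:                    # leaf node: a single word
        return [[symbol["fs"]]]
    results = []
    for children in symbol["rules"]:           # each rule node is an alternative
        for combo in product(*(versions(child) for child in children)):
            results.append([fs for part in combo for fs in part])
    return results


# Toy tree with two alternative word orders (illustrative data only).
perro = {"fs": {"lex": "perro"}, "rules": []}
grande = {"fs": {"lex": "grande"}, "rules": []}
root = {"fs": {"cat": "NP"}, "rules": [[perro, grande], [grande, perro]]}

all_versions = [[fs["lex"] for fs in v] for v in versions(root)]
single_version = all_versions[0]
```

Taking only the first traversal corresponds to the single-version case of claim 15, while keeping the full list corresponds to the multiple-version case of claim 16.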
-
17. A computer-readable medium having stored thereon executable instructions to cause a computer to perform a method for translating an expression from a source natural language into a target natural language using compiled grammars generated from grammar rules compiled by a grammar programming language compiler, the method comprising:
-
providing the compiled grammars comprising a compiled parsing grammar, a compiled transfer grammar and a compiled generation grammar; and
manipulating a representation of the expression as directed by the compiled grammar to translate the expression from the source natural language into the target natural language, wherein manipulating the representation comprises:
building a syntax parsing data structure for the expression as directed by the compiled parsing grammar, the syntax parsing data structure containing a source natural language feature structure for the expression;
creating a target natural language feature structure for the expression from the source natural language feature structure for the expression as directed by the compiled transfer grammar; and
building a syntactic generation data structure from the target natural language feature structure as directed by the compiled generation grammar, the syntactic generation data structure representing at least one valid translation of the expression.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
creating nodes in a syntax data structure by applying a set of rules in the compiled parsing grammar as directed by a series of parsing actions, each node associated with a feature structure representing at least a portion of the expression and any feature structure created at a root node of the syntax data structure representing the expression.
-
-
23. The computer-readable medium of claim 22, wherein the parsing actions in the series are selected from a parsing table associated with the compiled parsing grammar for the natural language.
-
24. The computer-readable medium of claim 22, wherein the rules in the set are determined based on the feature structures associated with the nodes.
-
25. The computer-readable medium of claim 17, wherein creating the target natural language feature structure comprises:
-
substituting a target natural language feature structure representing the expression for a source natural language feature structure representing the expression when the source natural language feature structure matches a bilingual example feature structure;
applying a set of compiled transfer grammar rules to the source language feature structure to build a transfer generation data structure having a plurality of source language feature sub-structures representing portions of the expression;
substituting a target natural language feature sub-structure representing a portion of the expression for a source natural language feature sub-structure representing the portion when the source natural language feature sub-structure matches a bilingual example feature structure; and
separately applying a specific set of compiled transfer grammar rules to each source natural language feature sub-structure that does not match a bilingual feature structure to build a plurality of transfer generation data structures to create new source natural language feature sub-structures until a target natural language feature sub-structure has been substituted for each source natural language feature sub-structure.
-
-
26. The computer-readable medium of claim 25, wherein creating the target natural language feature structure further comprises:
determining the specific set of compiled transfer grammar rules to apply based on each source natural language feature sub-structure that does not match a bilingual feature structure.
-
27. The computer-readable medium of claim 25, wherein the transfer generation data structure comprises:
-
a root node associated with a first source language feature sub-structure, the first source language feature sub-structure representing a portion of the expression;
at least one rule node below the root node, the at least one rule node being associated with one rule from the specific set of compiled transfer grammar rules; and
at least one symbol node below the at least one rule node, the at least one symbol node being associated with a source language feature sub-structure for a part of the portion.
-
-
28. The computer-readable medium of claim 17, wherein building the syntactic generation data structure comprises:
-
applying a set of compiled generation grammar rules to build a syntactic generation data structure from a feature structure representing at least one version of the expression; and
choosing a list of feature structures associated with leaf nodes in the syntactic generation data structure to represent the expression.
-
-
29. The computer-readable medium of claim 28, wherein applying the set of compiled generation grammar rules comprises:
-
repeatedly applying a specific set of compiled generation grammar rules to a feature structure associated with a symbol node to create a level of rule nodes below the symbol node, each rule node associated with a rule satisfied by the feature structure; and
repeatedly applying the rule associated with a rule node to the symbol node above the rule node to create a level of symbol nodes below the rule node, each symbol node associated with a feature structure created by applying the rule, until the feature structures associated with the symbol nodes represent words in the expression.
-
-
30. The computer-readable medium of claim 28, wherein choosing the list of feature structures comprises tracing a traverse path through the data structure until all leaf nodes on the traverse path have been reached.
-
31. The computer-readable medium of claim 28, wherein choosing the list of feature structures results in a single version of the expression.
-
32. The computer-readable medium of claim 28, wherein choosing the list of feature structures results in multiple versions of the expression.
-
33. A system for translating an expression from a source natural language into a target natural language using compiled grammars generated from grammar rules compiled by a grammar programming language compiler, the system comprising:
-
a processing unit;
a memory coupled to the processing unit through a system bus;
a computer-readable medium coupled to the processing unit through the system bus and containing thereon the compiled grammars comprising a compiled parsing grammar, a compiled transfer grammar and a compiled generation grammar;
a plurality of language-neutral modules executed from the computer-readable medium by the processing unit, the plurality of language-neutral modules causing the processing unit to apply rules in the compiled grammars to a representation of the expression in the source natural language to create a representation of the expression in the target natural language, wherein the plurality of language-neutral modules comprises:
a syntactic analysis module that causes the processing unit to build a syntax data structure from the representation of the expression in the source natural language by executing the instructions in the interface functions for the rules in the compiled parsing grammar;
a transfer module that causes the processing unit to create a representation for the expression in the target natural language corresponding to the representation of the expression in the source natural language contained in the syntax data structure by executing the instructions in the interface functions for the rules in the compiled transfer grammar; and
a syntactic generation module that causes the processing unit to build a syntactic generation data structure from the representation of the expression in the target natural language by executing the instructions in the interface functions for the rules in the compiled generation grammar, the syntactic generation data structure representing at least one valid translation of the expression; and
a plurality of interface functions executed from the computer-readable medium by the processing unit, each interface function associated with a rule in one of the compiled grammars, wherein applying a rule in the compiled grammar causes the processing unit to execute instructions in the interface function corresponding to the rule.
Dependent claims: 34, 35, 36
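The relationship in claim 33 between rules and interface functions resembles a dispatch table: a language-neutral module applies a rule by executing the function registered for it. The following sketch assumes a hypothetical registry keyed by rule identifier; the decorator, the rule identifiers and the toy rule bodies are inventions for illustration and are not taken from the patent.

```python
from typing import Callable, Dict, Optional

FeatureStructure = dict
InterfaceFunction = Callable[[FeatureStructure], Optional[FeatureStructure]]

# Hypothetical registry: one interface function per compiled grammar rule.
INTERFACE_FUNCTIONS: Dict[str, InterfaceFunction] = {}


def interface_function(rule_id: str):
    """Register the decorated function as the interface function for `rule_id`."""
    def register(fn: InterfaceFunction) -> InterfaceFunction:
        INTERFACE_FUNCTIONS[rule_id] = fn
        return fn
    return register


@interface_function("transfer/N")
def transfer_noun(fs: FeatureStructure) -> Optional[FeatureStructure]:
    # Toy rule body standing in for compiled grammar statements.
    lexicon = {"dog": "perro"}
    if fs.get("cat") != "N" or fs.get("lex") not in lexicon:
        return None                      # the rule does not apply
    return {"cat": "N", "lex": lexicon[fs["lex"]]}


def apply_rule(rule_id: str, fs: FeatureStructure) -> Optional[FeatureStructure]:
    """Language-neutral entry point: applying a rule runs its interface function."""
    return INTERFACE_FUNCTIONS[rule_id](fs)
```

With this arrangement the module code in `apply_rule` contains no language knowledge; changing the registered functions changes the behavior without touching the module.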
-
-
37. An apparatus for translating an expression from a source natural language to a target natural language using compiled grammars generated from grammar rules compiled by a grammar programming language compiler, the apparatus comprising:
-
means for building a syntax parsing data structure for the expression as directed by a compiled parsing grammar, the syntax parsing data structure containing a source natural language feature structure for the expression;
means for creating a target natural language feature structure for the expression from the source natural language feature structure for the expression as directed by a compiled transfer grammar; and
means for building a syntactic generation data structure from the target natural language feature structure as directed by a compiled generation grammar, the syntactic generation data structure representing at least one valid translation of the expression.
Dependent claims: 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52
using a plurality of feature structures, each representing a word in the expression in the source natural language.
-
-
39. The apparatus of claim 37, wherein the means for creating a target natural language feature structure comprises:
-
means for matching the source natural language feature structure against a set of bilingual example feature structures; and
means for substituting any portion of the source natural language feature structure that is matched.
-
-
40. The apparatus of claim 37, wherein the means for creating the target natural language feature structure comprises:
means for building a transfer generation data structure from the source natural language feature structure.
-
41. The apparatus of claim 40, wherein the transfer generation data structure comprises a plurality of sub-transfer generation data structures and the means for building the transfer generation data structure comprises:
means for invoking a plurality of sub-transfer processes, each sub-transfer process responsible for building a sub-transfer generation data structure.
-
42. The apparatus of claim 37, wherein the means for building the syntax parsing data structure comprises:
means for creating nodes in the syntax data structure by applying a set of rules in the compiled parsing grammar as directed by a series of parsing actions, each node associated with a feature structure representing at least a portion of the expression and any feature structure created at a root node of the syntax data structure representing the expression.
-
43. The apparatus of claim 42, wherein the parsing actions in the series are selected from a parsing table associated with the compiled parsing grammar for the natural language.
-
44. The apparatus of claim 42, wherein the rules in the set are determined based on the feature structures associated with the nodes.
-
45. The apparatus of claim 37, wherein the means for creating the target natural language feature structure comprises:
-
means for substituting a target natural language feature structure representing the expression for a source natural language feature structure representing the expression when the source natural language feature structure matches a bilingual example feature structure;
means for applying a set of compiled transfer grammar rules to the source language feature structure to build a transfer generation data structure having a plurality of source language feature sub-structures representing portions of the expression;
means for substituting a target natural language feature sub-structure representing a portion of the expression for a source natural language feature sub-structure representing the portion when the source natural language feature sub-structure matches a bilingual example feature structure; and
means for separately applying a specific set of compiled transfer grammar rules to each source natural language feature sub-structure that does not match a bilingual feature structure to build a plurality of transfer generation data structures to create new source natural language feature sub-structures until a target natural language feature sub-structure has been substituted for each source natural language feature sub-structure.
-
-
46. The apparatus of claim 45, wherein the means for creating the target natural language feature structure further comprises:
means for determining the specific set of compiled transfer grammar rules to apply based on each source natural language feature sub-structure that does not match a bilingual feature structure.
-
47. The apparatus of claim 45, wherein the transfer generation data structure comprises:
-
a root node associated with a first source language feature sub-structure, the first source language feature sub-structure representing a portion of the expression;
at least one rule node below the root node, the at least one rule node being associated with one rule from the specific set of compiled transfer grammar rules; and
at least one symbol node below the at least one rule node, the at least one symbol node being associated with a source language feature sub-structure for a part of the portion.
-
-
48. The apparatus of claim 40, wherein the means for building the syntactic generation data structure comprises:
-
means for applying a set of compiled generation grammar rules to build a syntactic generation data structure from a feature structure representing at least one version of the expression; and
means for choosing a list of feature structures associated with leaf nodes in the syntactic generation data structure to represent the expression.
-
-
49. The apparatus of claim 47, wherein the means for applying the set of compiled generation grammar rules comprises:
-
means for repeatedly applying a specific set of compiled generation grammar rules to a feature structure associated with a symbol node to create a level of rule nodes below the symbol node, each rule node associated with a rule satisfied by the feature structure; and
means for repeatedly applying the rule associated with a rule node to the symbol node above the rule node to create a level of symbol nodes below the rule node, each symbol node associated with a feature structure created by applying the rule, until the feature structures associated with the symbol nodes represent words in the expression.
-
-
50. The apparatus of claim 47, wherein the means for choosing the list of feature structures comprises tracing a traverse path through the data structure until all leaf nodes on the traverse path have been reached.
-
51. The apparatus of claim 47, wherein the means for choosing the list of feature structures produces a single version of the expression.
-
52. The apparatus of claim 47, wherein the means for choosing the list of feature structures produces multiple versions of the expression.
Specification