Language conversion rule preparing device, language conversion device and program recording medium
First Claim
1. A language transference rule producing apparatus characterized in that said apparatus comprises:
a parallel-translation corpus;
a phrase extracting section which calculates a frequency of adjacency of words or parts of speech in a source language sentence and a target language sentence in said parallel-translation corpus, and couples words and parts of speech of a high frequency of adjacency to extract partial sentences in each of which semantic consistency is formed;
a phrase determining section which, among the partial sentences extracted by said phrase extracting section, checks relationships between the partial sentences of the source language and the target language with respect to a whole of a sentence to determine corresponding partial sentences;
a phrase dictionary which stores the determined corresponding partial phrases, said phrase dictionary is used when language transference is performed, and the language transference, when a source language sentence is input, matches the input sentence with the corresponding partial phrases stored in said phrase dictionary, thereby performing language or style transference;
a morphological analyzing section which transfers the source language sentence of the parallel-translation corpus into a word string; and
a semantic coding section which, by using a result of said morphological analyzing section, on a basis of a table in which words are classified while deeming words that are semantically similar, to be in a same class, and a same code is given to words in a same class, produces a parallel-translation corpus in which words of a part or all of the source language sentence and the target language sentence are replaced with codes of the classified vocabulary table, and said phrase extracting section extracts phrases from the parallel-translation corpus in which words are replaced with codes by said semantic coding section.
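The semantic coding and phrase extracting sections recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class table, the "_" joiner, the raw-count threshold `min_count`, and all function names are assumptions made for illustration, and only one side of the parallel corpus is processed (the claim applies the same steps to both the source and the target sentences).

```python
from collections import Counter

# Hypothetical classified vocabulary table: semantically similar words share a code.
CLASS_TABLE = {"monday": "<DAY>", "tuesday": "<DAY>",
               "single": "<ROOM>", "twin": "<ROOM>"}

def semantic_code(tokens, table=CLASS_TABLE):
    """Replace each word that has a class entry with its class code."""
    return [table.get(tok, tok) for tok in tokens]

def adjacency_counts(sentences):
    """Count how often each ordered pair of neighbouring tokens occurs."""
    counts = Counter()
    for toks in sentences:
        counts.update(zip(toks, toks[1:]))
    return counts

def couple_once(sentences, min_count):
    """Merge the most frequent adjacent pair, if it occurs at least min_count times."""
    counts = adjacency_counts(sentences)
    if not counts:
        return sentences, None
    pair, freq = counts.most_common(1)[0]
    if freq < min_count:
        return sentences, None
    merged = []
    for toks in sentences:
        out, i = [], 0
        while i < len(toks):
            if i + 1 < len(toks) and (toks[i], toks[i + 1]) == pair:
                out.append(toks[i] + "_" + toks[i + 1])  # couple the two tokens
                i += 2
            else:
                out.append(toks[i])
                i += 1
        merged.append(out)
    return merged, pair

def extract_phrases(sentences, min_count=2, max_rounds=10):
    """Repeat the coupling step until no pair is frequent enough."""
    phrases = []
    for _ in range(max_rounds):
        sentences, pair = couple_once(sentences, min_count)
        if pair is None:
            break
        phrases.append("_".join(pair))
    return sentences, phrases

corpus = [["a", "single", "room", "please"], ["a", "twin", "room", "please"],
          ["on", "monday"], ["on", "tuesday"]]
coded = [semantic_code(s) for s in corpus]
segmented, phrases = extract_phrases(coded, min_count=2)
```

Because "single"/"twin" and "monday"/"tuesday" collapse to shared codes before counting, pairs such as ("on", "<DAY>") reach the threshold even though neither "on monday" nor "on tuesday" repeats in the toy corpus.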
Abstract
When an input speech sentence contains an untrained portion, or when speech recognition is partly erroneous, transference to the target language fails. Moreover, the phrase dictionary and interphrase rules needed for transference must be produced manually, so development is inefficient and labor-intensive.
An apparatus includes: a language rule producing section which automatically and statistically learns grammatical or semantic restriction rules for a partial word or a word string from a parallel-translation corpus, and in which each rule is described in a form wherein a source language partial sentence corresponds to a target language partial sentence; a speech recognizing section which performs speech recognition on speech of the source language by using the produced language rules, and which outputs a result of the recognition; and a language transferring section which transfers a source language sentence into a target language sentence by using the same language rules. Even when an input speech sentence contains an untrained portion, or when speech recognition is partly erroneous, transference to the target language is still possible. Moreover, the phrase dictionary and interphrase rules needed for transference can be produced automatically, with little manual assistance.
3 Claims
1. A language transference rule producing apparatus characterized in that said apparatus comprises:
a parallel-translation corpus;
a phrase extracting section which calculates a frequency of adjacency of words or parts of speech in a source language sentence and a target language sentence in said parallel-translation corpus, and couples words and parts of speech of a high frequency of adjacency to extract partial sentences in each of which semantic consistency is formed;
a phrase determining section which, among the partial sentences extracted by said phrase extracting section, checks relationships between the partial sentences of the source language and the target language with respect to a whole of a sentence to determine corresponding partial sentences;
a phrase dictionary which stores the determined corresponding partial phrases, said phrase dictionary is used when language transference is performed, and the language transference, when a source language sentence is input, matches the input sentence with the corresponding partial phrases stored in said phrase dictionary, thereby performing language or style transference;
a morphological analyzing section which transfers the source language sentence of the parallel-translation corpus into a word string; and
a semantic coding section which, by using a result of said morphological analyzing section, on a basis of a table in which words are classified while deeming words that are semantically similar, to be in a same class, and a same code is given to words in a same class, produces a parallel-translation corpus in which words of a part or all of the source language sentence and the target language sentence are replaced with codes of the classified vocabulary table, and said phrase extracting section extracts phrases from the parallel-translation corpus in which words are replaced with codes by said semantic coding section.
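The claim does not state how the phrase determining section scores the relationship between a source partial sentence and a target partial sentence. As one possible reading, the sketch below counts in how many sentence pairs two phrases occur together and keeps the pairs whose Dice coefficient reaches a threshold. The Dice measure, the `threshold` value, the function name, and the toy data are all assumptions, not details taken from the patent.

```python
from collections import Counter
from itertools import product

def determine_pairs(sentence_pairs, threshold=0.8):
    """Keep (source phrase, target phrase) pairs that co-occur consistently.

    sentence_pairs: iterable of (source_phrases, target_phrases), one entry per
    parallel sentence, each already segmented into partial sentences.
    """
    src_n, tgt_n, joint = Counter(), Counter(), Counter()
    for src_phrases, tgt_phrases in sentence_pairs:
        s, t = set(src_phrases), set(tgt_phrases)
        src_n.update(s)               # sentences containing each source phrase
        tgt_n.update(t)               # sentences containing each target phrase
        joint.update(product(s, t))   # sentences containing both phrases
    result = {}
    for (s, t), c in joint.items():
        dice = 2 * c / (src_n[s] + tgt_n[t])
        if dice >= threshold:
            result[(s, t)] = dice
    return result

# Hypothetical segmented corpus (English source, German target).
pairs = [
    (["good_morning", "sir"], ["guten_Morgen", "Herr"]),
    (["good_morning"], ["guten_Morgen"]),
    (["thank_you", "sir"], ["danke", "Herr"]),
]
matched = determine_pairs(pairs)
```

Scoring over whole sentence pairs, rather than over positions inside a sentence, means that word-order differences between the two languages do not affect the result.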
3. A language transference rule producing apparatus characterized in that said apparatus comprises:
a parallel-translation corpus prepared for learning;
a phrase extracting section which calculates a frequency of adjacency of words or parts of speech in a source language sentence and a target language sentence in said parallel-translation corpus, and couples words and parts of speech of a high frequency of adjacency to extract automatically partial sentences in each of which semantic consistency is formed without using any grammatical rule;
a phrase determining section which, among the partial sentences extracted by said phrase extracting section, checks relationships between partial sentences of the source language and the target language to determine corresponding partial sentences; and
a phrase dictionary which stores the determined corresponding partial phrases, said phrase dictionary is used when language transference is performed, and the language transference, when a source language sentence is input, matches the input sentence with the corresponding partial phrases stored in said phrase dictionary, thereby performing language or style transference.
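The use of the phrase dictionary at transference time can be sketched as follows. The claim says only that the input sentence is matched against the stored partial phrases; greedy longest-match lookup, the `max_len` limit, the pass-through of unmatched words, and the dictionary contents are assumptions made for this illustration.

```python
def transfer(tokens, phrase_dict, max_len=4):
    """Replace the longest known source phrase at each position with its target phrase."""
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in phrase_dict:
                out.extend(phrase_dict[key])
                i += n
                break
        else:
            # No stored phrase starts here: pass the word through unchanged.
            out.append(tokens[i])
            i += 1
    return out

# Hypothetical phrase dictionary: source phrase -> target phrase.
PHRASES = {
    ("good", "morning"): ["guten", "Morgen"],
    ("thank", "you"): ["danke"],
}
result = transfer(["good", "morning", "and", "thank", "you"], PHRASES)
```

Passing unmatched words through is one way to keep producing output when part of the input is untrained or misrecognized; this sketch does not reorder phrases between the source and target languages.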
Specification