Natural language processing apparatus and method for converting word notation grammar description data
Abstract
A natural language processing apparatus converts word notation grammar description data, which explain the rules for the construction of words that belong to categories having a variety of notations, such as the names of product models, onomatopoeic words and numerical expressions, into word notation context free grammar data that are expressed in the form of an expanded context free grammar. In the analysis processing according to the context free grammar, a character string that satisfies the construction rule for a product model name, an onomatopoeic word or a numerical expression is extracted as a word from the natural language sentence that is an input in accordance with the word notation context free grammar data. Further, in accordance with a category that corresponds to the construction rule, the part of speech and the pronunciation of the word are determined by referring to a dictionary in which word information for each category is explained. Finally, the natural language sentence that is input is rendered vocally.
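The flow described in the abstract (match a construction rule, then look up word information by the rule's category rather than by the surface string) can be sketched as follows. This is a minimal illustration, not the patented implementation: regular expressions stand in for the expanded context free grammar, and the names `RULES`, `CATEGORY_DICTIONARY`, `Word`, and `extract_words`, as well as the sample rules and dictionary entries, are assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical construction rules, one per category.
# Regexes are a stand-in for the word notation context free grammar data.
RULES = {
    "product_model": re.compile(r"[A-Z]{2,3}-\d{3,4}"),
    "numeral": re.compile(r"\d{1,3}(?:,\d{3})*"),
}

# Hypothetical dictionary keyed by category, not by individual notation.
CATEGORY_DICTIONARY = {
    "product_model": {"part_of_speech": "proper noun",
                      "reading": "spell letters, read digits"},
    "numeral": {"part_of_speech": "numeral",
                "reading": "read as a cardinal number"},
}


@dataclass
class Word:
    text: str
    start: int
    category: str
    info: dict


def extract_words(sentence):
    """Scan left to right and extract strings that satisfy a rule."""
    words, pos = [], 0
    while pos < len(sentence):
        for category, pattern in RULES.items():
            m = pattern.match(sentence, pos)
            if m:
                # The matched rule's category is the dictionary key.
                words.append(Word(m.group(), pos, category,
                                  CATEGORY_DICTIONARY[category]))
                pos = m.end()
                break
        else:
            # No rule applies here; ordinary-word analysis is out of scope.
            pos += 1
    return words
```

Because the dictionary is keyed by category, a model name that never appears in any word list still receives a part of speech and a reading rule once its notation satisfies the construction rule.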
17 Claims
1. A natural language processing apparatus comprising:
grammar description data storage means for storing word notation grammar description data that describe construction rules for character strings which constitute words belonging to a specific category;
a notation non-specific dictionary in which is explained word information for a word group that belongs to the specific category that is used for a notation;
analysis means for, based on the word notation grammar description data, extracting as a word, from an input natural language sentence, a character string that satisfies one of the construction rules, and for analyzing the extracted word with referring to word information that is explained in said notation non-specific dictionary by using as a notation a category that corresponds to the one of the construction rules.
-
2. A natural language processing apparatus according to claim 1, further comprising:
word notation grammar conversion means for converting said word notation grammar description data into word notation context free grammar data that are represented in the form of an expanded context free grammar, wherein said analysis means refers to said word notation context free grammar data for performing an analysis based on context free grammar.
-
3. A natural language processing apparatus according to claim 1, wherein, when a plurality of character strings beginning at the same character position are extracted as different words, said analysis means selects, as a result of an analysis, a word having the longest character string.
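The longest-match selection in claim 3 can be sketched as below. The rule set, the category names, and the function `longest_word_at` are hypothetical; the sketch only shows choosing the longest of several candidates that begin at the same character position.

```python
import re

# Hypothetical rules that can both match at the same start position.
RULES = {
    "numeral": re.compile(r"\d+"),
    "numeral_with_unit": re.compile(r"\d+(?:cm|kg|mm)"),
}


def longest_word_at(sentence, pos):
    """Return (text, category) of the longest rule match starting at pos."""
    candidates = []
    for category, pattern in RULES.items():
        m = pattern.match(sentence, pos)
        if m:
            candidates.append((m.group(), category))
    if not candidates:
        return None
    # Ties keep the earliest rule in RULES; the claim does not say how
    # ties are resolved, so that behavior is an assumption.
    return max(candidates, key=lambda c: len(c[0]))
```

For the input `"width 150mm"` at position 6, both rules match, and the five-character candidate `150mm` is preferred over `150`.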
-
4. A natural language processing apparatus according to claim 1, wherein a construction rule for a notation for a numeral is explained in said word notation grammar description data.
-
5. A natural language processing apparatus according to claim 4, wherein a range or a property of a numerical portion that is included in said notation for said numeral is explained in said construction rule.
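One way to read claim 5 is that a construction rule constrains both the shape of a numerical notation and the permitted range of its numerical portions. A minimal sketch under that reading, using a clock-time notation as a hypothetical example (the pattern `TIME_SHAPE`, the function `match_time`, and the ranges are assumptions, not taken from the patent):

```python
import re

# Shape of the notation: one or two digits, a colon, exactly two digits.
# The fixed digit count plays the role of a "property" of the numeric part.
TIME_SHAPE = re.compile(r"(\d{1,2}):(\d{2})")


def match_time(sentence, pos=0):
    """Return the matched time notation at pos, or None if out of range."""
    m = TIME_SHAPE.match(sentence, pos)
    if not m:
        return None
    hour, minute = int(m.group(1)), int(m.group(2))
    # Range constraints on the numerical portions.
    if 0 <= hour <= 23 and 0 <= minute <= 59:
        return m.group()
    return None
```

With these constraints `10:30` is extracted as a time word, while `25:30` and `9:75` have the right shape but fail the range check and are left for other rules.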
-
6. A natural language processing apparatus according to claim 1, wherein a construction rule for a notation for an onomatopoeic or a mimetic word is explained in said word notation grammar description data.
-
7. A natural language processing apparatus according to claim 6, wherein a character in a standard notation that can be repeated is identified in said construction rule.
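Claim 7 can be illustrated by marking which characters of a standard notation may repeat and mapping any stretched spelling back to that standard form for the dictionary lookup. The sketch below uses an English example for readability; the functions `build_pattern` and `to_standard` and the index-based marking are assumptions.

```python
import re


def build_pattern(standard, repeatable_indexes):
    """Compile a pattern in which the marked characters may repeat."""
    parts = []
    for i, ch in enumerate(standard):
        suffix = "+" if i in repeatable_indexes else ""
        parts.append(re.escape(ch) + suffix)
    return re.compile("".join(parts))


def to_standard(text, standard, repeatable_indexes):
    """Return the standard notation if text is a stretched form of it."""
    pattern = build_pattern(standard, repeatable_indexes)
    return standard if pattern.fullmatch(text) else None
```

Marking index 2 of `"boom"` yields the pattern `boo+m`, so `booooom` is analyzed with the word information stored for `boom`, while `bom` and `boomm` are rejected.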
-
8. A natural language processing apparatus according to claim 1, wherein a construction rule for a notation for a product name is explained in a notation non-specific dictionary.
-
9. A natural language processing apparatus according to claim 8, wherein a fixed character string and a range for a variable value are explained in said construction rule.
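The combination in claim 9 of a fixed character string and a range for a variable value can be sketched as a small rule factory. The prefix `XQ-`, the range 100 to 999, and the function `make_model_rule` are hypothetical and chosen only for illustration.

```python
import re


def make_model_rule(prefix, low, high):
    """Build a matcher for prefix + number, with low <= number <= high."""
    shape = re.compile(re.escape(prefix) + r"(\d+)")

    def match(sentence, pos=0):
        m = shape.match(sentence, pos)
        # Accept only when the variable value falls inside the range.
        if m and low <= int(m.group(1)) <= high:
            return m.group()
        return None

    return match


# Hypothetical product line: "XQ-" followed by a model number 100..999.
xq_rule = make_model_rule("XQ-", 100, 999)
```

Here `xq_rule("XQ-730 printer")` extracts `XQ-730`, whereas `XQ-7` and `XQ-7300` carry the fixed string but fall outside the range and are not treated as model names.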
-
10. A natural language processing apparatus according to claim 1, further comprising:
voice output means for vocally rendering said natural language sentence that is input based on a result obtained by an analysis performed by said analysis means.
-
11. A natural language processing apparatus according to claim 1, further comprising:
voice output means for determining the pronunciation of said word that is extracted based on a rule for the pronunciation of notations that are written in said notation non-specific dictionary, and for vocally rendering said natural language sentence that is input.
-
12. A natural language processing method comprising the steps of:
inputting a natural language sentence;
extracting as a word from the input natural language sentence, based on word notation grammar description data that describe construction rules for character strings which constitute words that belong to a specific category, a character string that satisfies one of the construction rules;
referring to a notation non-specific dictionary in which is explained word information for a word group that belongs to the specific category that is used for a notation and analyzing the extracted word based on word information that is explained in said notation non-specific dictionary by using as a notation a category that corresponds to the one of the construction rules; and
outputting the result of the analysis.
-
a voice output step of vocally rendering said natural language sentence that is input based on a result obtained by an analysis performed at said analysis step.
-
16. A natural language processing method according to claim 12, wherein, at said outputting step, the pronunciation of said extracted word is determined based on a rule for the pronunciation of notations that are written in a notation-specific dictionary.
-
17. A computer-readable storage medium on which is stored a program for controlling a computer that performs natural language processing, said program comprising codes for causing said computer to perform the steps of:
inputting a natural language sentence;
extracting as a word from the input natural language sentence, based on word notation grammar description data that describe construction rules for character strings which constitute words that belong to a specific category, a character string that satisfies one of the construction rules;
referring to a notation non-specific dictionary in which is explained word information for a word group that belongs to the specific category that is used for a notation and analyzing the extracted word based on word information that is explained in said notation non-specific dictionary by using as a notation a category that corresponds to the one of the construction rules; and
outputting the result of the analysis.
-
Specification