Anaphora analyzing apparatus provided with antecedent candidate rejecting means using candidate rejecting decision tree
Abstract
An anaphora analyzing apparatus is disclosed for automatically estimating an anaphora referential relation or an antecedent of a noun for use in a natural language sentence. A storage unit stores analyzed results outputted from an analyzer, and an antecedent candidate generator detects a target component required for anaphora analysis in accordance with the current analyzed results and the past analyzed results stored in the storage unit, and generates antecedent candidates corresponding to the target component. Then, a candidate rejecting section rejects unnecessary candidates having no potential for anaphora referential relation among the antecedent candidates by using a predetermined rejecting criterion, and outputs the remaining antecedent candidates. Further, a preference giving section calculates a predetermined estimated value for each of the remaining antecedent candidates by referring to an information table including predetermined estimation information obtained from a predetermined training tagged corpus, and gives the antecedent candidates preference in accordance with the calculated estimated value. Finally, a candidate deciding section decides a predetermined number of antecedent candidates based on a given preference in accordance with the preferenced antecedent candidates.
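The last three stages described in the abstract (rejecting, giving preference, deciding) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the `Candidate` fields, the hand-written `toy_reject` rule standing in for the trained decision tree, and the contents of the score table are all hypothetical. The analyzer, the storage unit, and the candidate generator are not modeled; the candidates are simply given as input.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    word: str      # surface form of the antecedent candidate
    pos: str       # part-of-speech tag (hypothetical feature)
    distance: int  # how far back the candidate sits from the target (hypothetical unit)


def toy_reject(target: str, cand: Candidate) -> bool:
    """Hand-written stand-in for the trained decision tree (assumption)."""
    if cand.pos != "noun":
        return True           # non-nouns are treated as having no referential potential
    return cand.distance > 5  # very distant nouns are also rejected


def make_scorer(table: Dict[Tuple[str, str], float]) -> Callable[[str, Candidate], float]:
    """Build a preference function that looks up an estimated value in a table."""
    def score(target: str, cand: Candidate) -> float:
        return table.get((target, cand.word), 0.0)
    return score


def resolve(
    target: str,
    candidates: List[Candidate],
    reject: Callable[[str, Candidate], bool],
    score: Callable[[str, Candidate], float],
    n: int = 1,
) -> List[Candidate]:
    # Candidate rejecting section: drop candidates the criterion rules out.
    remaining = [c for c in candidates if not reject(target, c)]
    # Preference giving section: order the survivors by estimated value.
    ranked = sorted(remaining, key=lambda c: score(target, c), reverse=True)
    # Candidate deciding section: output a predetermined number of candidates.
    return ranked[:n]


candidates = [
    Candidate("meeting", "noun", 1),
    Candidate("quickly", "adverb", 2),
    Candidate("room", "noun", 3),
]
table = {("it", "meeting"): 0.2, ("it", "room"): 0.7}  # made-up estimated values
best = resolve("it", candidates, toy_reject, make_scorer(table), n=1)
print([c.word for c in best])  # ['room']
```

Keeping rejection and ranking as separate functions mirrors the abstract's separation of the candidate rejecting section from the preference giving section, so either one could be swapped for a trained model without touching the other.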
5 Claims
1. An anaphora analyzing apparatus comprising:
analyzing means for analyzing an input natural language sentence and outputting analyzed results;
storing means for storing the analyzed results outputted from said analyzing means;
antecedent candidate generating means for detecting a target component in the input natural language sentence required for anaphora analysis in accordance with the current analyzed results outputted from said analyzing means and the past analyzed results stored in said storing means, and for generating antecedent candidates corresponding to said target component;
candidate rejecting means for rejecting unnecessary candidates having no potential for anaphora referential relation among the antecedent candidates generated by said antecedent candidate generating means by using a predetermined rejecting criterion, and for outputting the remaining antecedent candidates, said rejecting criterion being of a decision tree obtained by using a machine training method in accordance with a training tagged corpus to which predetermined word information is given for each word of the training tagged corpus;
preference giving means for calculating a predetermined estimated value for each of the remaining antecedent candidates outputted from said candidate rejecting means, by referring to an information table including predetermined estimation information obtained from a predetermined further training tagged corpus, for giving the antecedent candidates preference in accordance with the calculated estimated value, and for outputting preferenced antecedent candidates; and
candidate deciding means for deciding and outputting a predetermined number of antecedent candidates based on the given preference in accordance with the preferenced antecedent candidates outputted from said preference giving means.
2. The anaphora analyzing apparatus as claimed in claim 1,
wherein said candidate rejecting means selects and outputs one or more antecedent candidates when all the antecedent candidates are rejected by said candidate rejecting means.
3. The anaphora analyzing apparatus as claimed in claim 1,
wherein said estimation information for said information table includes frequency information obtained from the predetermined further training tagged corpus.
4. The anaphora analyzing apparatus as claimed in claim 1,
wherein said estimation information for said information table includes a distance between the target component for anaphora analysis and antecedent candidates obtained from the predetermined further training tagged corpus.
5. The anaphora analyzing apparatus as claimed in claim 1,
wherein said estimation information for said information table includes predetermined information calculated in accordance with frequency information obtained from the predetermined further training tagged corpus and a distance between a target component for anaphora analysis and antecedent candidates obtained from the predetermined further training tagged corpus.
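The dependent claims can be illustrated with a short sketch. It makes two assumptions that the claims themselves leave open. First, when every candidate is rejected (claim 2), the claim does not say which candidates are selected; the sketch picks the nearest ones. Second, the claims do not give the formula combining frequency and distance (claim 5); the sketch uses a relative frequency divided by one plus the distance. All names, the `(word, distance)` tuple layout, and the counts are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (word, distance from the target component); hypothetical layout
Candidate = Tuple[str, int]


def reject_with_fallback(
    candidates: List[Candidate],
    should_reject: Callable[[Candidate], bool],
    keep: int = 1,
) -> List[Candidate]:
    """Apply the rejecting criterion, but never return an empty list
    when at least one candidate was supplied (the claim 2 situation)."""
    remaining = [c for c in candidates if not should_reject(c)]
    if remaining or not candidates:
        return remaining
    # Everything was rejected: fall back to the `keep` nearest candidates.
    # Choosing by distance is an assumption, not something the claim states.
    return sorted(candidates, key=lambda c: c[1])[:keep]


def preference(
    target: str,
    candidate: Candidate,
    pair_counts: Dict[Tuple[str, str], int],
    target_counts: Dict[str, int],
) -> float:
    """Estimated value from corpus frequency and distance (assumed formula)."""
    word, distance = candidate
    total = target_counts.get(target, 0)
    # Relative frequency of this (target, antecedent) pairing in the tagged corpus.
    freq = pair_counts.get((target, word), 0) / total if total else 0.0
    # Nearer candidates are preferred: the value shrinks as distance grows.
    return freq / (1 + distance)


# Usage with made-up counts: "it" was tagged 12 times, 6 of them referring to "room".
value = preference("it", ("room", 1), {("it", "room"): 6}, {"it": 12})
print(value)  # 0.25
```

With only the frequency term the function corresponds to the situation in claim 3, and with only the distance term to claim 4; the combined form is one possible reading of claim 5.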
Specification