Computer-implemented method, program, and system for identifying non-self-descriptive terms in electronic documents
First Claim
1. A computer-implemented method for identifying a non-self-descriptive term in an electronic document, including a memory and a processor communicatively coupled to the memory, wherein the processor is configured to execute the steps of a method comprising:
acquiring a noun included in corpus data;
calculating a qualifying level and qualified level in the corpus data related to each noun included in the corpus data;
identifying one or more nouns included in the corpus data having a qualifying level and/or qualified level satisfying a predetermined condition; and
presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term in the electronic document,
wherein the qualified level of a first noun in the corpus data is calculated by:
counting a number of occurrences (M) of the first noun in the corpus data;
counting a number of times (Mb1) the first noun is qualified by a preposition in the corpus data;
counting a number of times (Mb2) the first noun is qualified by a present or past participle in the corpus data;
counting a number of times (Mb3) the first noun is qualified by a noun adjunct in the corpus data; and
summing Mb1, Mb2 and Mb3 and dividing the sum by M to obtain the qualified level of the first noun in the corpus data.
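The calculation recited at the end of the claim can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the patented implementation: the `NounCounts` container, its field names, and the handling of a noun with zero occurrences are assumptions introduced here, and the counts are taken as given rather than derived from a parsed corpus.

```python
from dataclasses import dataclass


@dataclass
class NounCounts:
    """Hypothetical container for the per-noun counts named in the claim."""
    occurrences: int      # M: occurrences of the noun in the corpus data
    by_preposition: int   # Mb1: times the noun is qualified by a preposition
    by_participle: int    # Mb2: times qualified by a present or past participle
    by_noun_adjunct: int  # Mb3: times qualified by a noun adjunct


def qualified_level(counts: NounCounts) -> float:
    """Return (Mb1 + Mb2 + Mb3) / M for one noun."""
    if counts.occurrences == 0:
        # Assumption: the claim does not say how M = 0 is handled.
        return 0.0
    total = counts.by_preposition + counts.by_participle + counts.by_noun_adjunct
    return total / counts.occurrences


# Illustrative counts only; they are not taken from the source.
example = NounCounts(occurrences=40, by_preposition=6, by_participle=3, by_noun_adjunct=1)
level = qualified_level(example)  # (6 + 3 + 1) / 40
```

With these made-up counts the noun is qualified in 10 of its 40 occurrences, so the qualified level is 0.25.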
Abstract
A computer-implemented method, program, and system for identifying non-self-descriptive terms in electronic documents. The computer-implemented method for identifying a non-self-descriptive term in an electronic document is carried out by a system that includes a memory and a processor communicatively coupled to the memory and configured to execute the steps of the method. The method includes acquiring a noun included in corpus data. The method further includes calculating a qualifying level and a qualified level in the corpus data related to each noun in the corpus data. The method further includes identifying one or more nouns included in the corpus data as having a qualifying level and/or qualified level satisfying a predetermined condition. The method further includes presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term in the electronic document.
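The last two steps of the abstract (identifying nouns whose levels satisfy a condition, then presenting related terms from the document) might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the text above does not state the predetermined condition, so it is passed in as a caller-supplied predicate, and "a term related to the noun" is approximated as a document term containing the noun as a whole word. The function names `select_nouns` and `candidate_terms` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Levels = Tuple[float, float]  # (qualifying level, qualified level)


def select_nouns(levels: Dict[str, Levels],
                 condition: Callable[[float, float], bool]) -> List[str]:
    """Return the nouns whose levels satisfy the supplied condition, sorted by name."""
    return sorted(noun for noun, (qualifying, qualified) in levels.items()
                  if condition(qualifying, qualified))


def candidate_terms(document_terms: List[str], nouns: List[str]) -> List[str]:
    """Keep the document terms that contain a selected noun as a whole word."""
    selected = set(nouns)
    return [term for term in document_terms
            if selected & set(term.lower().split())]


# Illustrative values only; they are not taken from the source.
levels = {"server": (0.10, 0.70), "policy": (0.05, 0.20), "node": (0.60, 0.65)}

# Hypothetical condition: qualified level at or above 0.5.
nouns = select_nouns(levels, lambda qualifying, qualified: qualified >= 0.5)
terms = candidate_terms(["Edge Server", "retention policy", "node agent"], nouns)
```

Passing the condition as a function keeps the sketch neutral about which thresholds, and which direction of comparison, the predetermined condition actually uses.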
17 Claims
1. A computer-implemented method for identifying a non-self-descriptive term in an electronic document, including a memory and a processor communicatively coupled to the memory, wherein the processor is configured to execute the steps of a method comprising:
acquiring a noun included in corpus data;
calculating a qualifying level and qualified level in the corpus data related to each noun included in the corpus data;
identifying one or more nouns included in the corpus data having a qualifying level and/or qualified level satisfying a predetermined condition; and
presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term in the electronic document,
wherein the qualified level of a first noun in the corpus data is calculated by:
counting a number of occurrences (M) of the first noun in the corpus data;
counting a number of times (Mb1) the first noun is qualified by a preposition in the corpus data;
counting a number of times (Mb2) the first noun is qualified by a present or past participle in the corpus data;
counting a number of times (Mb3) the first noun is qualified by a noun adjunct in the corpus data; and
summing Mb1, Mb2 and Mb3 and dividing the sum by M to obtain the qualified level of the first noun in the corpus data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A computer readable non-transitory article of manufacture tangibly embodying computer readable instructions which, when executed, cause a computer to carry out the steps of a method, comprising:
acquiring a noun included in corpus data;
calculating a qualifying level and qualified level in the corpus data related to each noun included in the corpus data;
identifying one or more nouns included in the corpus data having a qualifying level and/or qualified level satisfying a predetermined condition; and
presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term in the electronic document,
wherein the qualified level of a first noun in the corpus data is calculated by:
counting a number of occurrences (M) of the first noun in the corpus data;
counting a number of times (Mb1) the first noun is qualified by a preposition in the corpus data;
counting a number of times (Mb2) the first noun is qualified by a present or past participle in the corpus data;
counting a number of times (Mb3) the first noun is qualified by a noun adjunct in the corpus data; and
summing Mb1, Mb2 and Mb3 and dividing the sum by M to obtain the qualified level of the first noun in the corpus data.
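Written as a formula, the qualified level recited in the claim is the fraction of a noun's occurrences in which it is qualified in one of the three named ways. The symbol Q is introduced here only for convenience, and the numbers in the worked line are illustrative rather than taken from the source.

```latex
\[
Q = \frac{M_{b1} + M_{b2} + M_{b3}}{M}
\]
% Illustrative values: M = 50, M_{b1} = 10, M_{b2} = 5, M_{b3} = 5
\[
Q = \frac{10 + 5 + 5}{50} = \frac{20}{50} = 0.4
\]
```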
17. A document processing system for identifying non-self-descriptive terms in an electronic document comprising:
a memory;
a processor communicatively coupled to the memory; and
a noun extraction unit for acquiring a noun included in corpus data;
a qualification relationship analysis unit for calculating a qualifying level and a qualified level in the corpus data related to each noun that is included in the corpus data;
a condition determining unit for identifying one or more nouns included in the corpus data having a qualifying level and/or qualified level satisfying a predetermined condition; and
an information processing unit for presenting a term related to one or more of the nouns in the electronic document as a candidate for the non-self-descriptive term included in the electronic document,
wherein the qualified level of a first noun in the corpus data is calculated by:
counting a number of occurrences (M) of the first noun in the corpus data;
counting a number of times (Mb1) the first noun is qualified by a preposition in the corpus data;
counting a number of times (Mb2) the first noun is qualified by a present or past participle in the corpus data;
counting a number of times (Mb3) the first noun is qualified by a noun adjunct in the corpus data;
summing Mb1, Mb2 and Mb3 and dividing the sum by M to obtain the qualified level of the first noun in the corpus data.
Specification