Document analysis apparatus, document analysis method, and computer-readable recording medium
Abstract
A document analysis apparatus comprises: a feature expression acquisition unit that acquires a feature expression appearing during an attention period in an analysis object document collection; a document collection acquisition unit that acquires a feature expression containing document (FECD) collection, in which the feature expression appears, from an analysis population including the analysis object document collection; a context determination unit that, for every feature expression, specifies an analysis/FECD corresponding to an analysis object document among the FECD collection, and specifies a context in which the feature expression has appeared in multiple analysis/FECDs; a context comparison determination unit that specifies a non-analysis/FECD not corresponding to an analysis object document among the FECD collection, and compares the context in which the feature expression has appeared in that non-analysis/FECD with the previously specified context; and a feature degree setting unit that gives, or otherwise sets, a feature degree for the feature expression based on the comparison.
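The acquisition and partitioning described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Document` record, its `updated` and `is_analysis_object` fields, the function names, and the use of plain substring matching to decide whether a feature expression "appears" in a document.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    # Hypothetical record type; field names are illustrative only.
    doc_id: str
    text: str
    updated: date             # date the document was issued, generated or updated
    is_analysis_object: bool  # True if it belongs to the analysis object collection


def acquire_fecd_collection(population, expression, start, end):
    """Return documents dated within the attention period that contain the expression."""
    return [
        d for d in population
        if start <= d.updated <= end and expression in d.text
    ]


def split_fecds(fecds):
    """Partition FECDs into analysis/FECDs and non-analysis/FECDs."""
    analysis = [d for d in fecds if d.is_analysis_object]
    non_analysis = [d for d in fecds if not d.is_analysis_object]
    return analysis, non_analysis


population = [
    Document("a1", "battery overheats while charging", date(2020, 3, 2), True),
    Document("a2", "screen cracked on arrival", date(2020, 3, 5), True),
    Document("n1", "battery recycling programme announced", date(2020, 3, 9), False),
    Document("n2", "battery prices fall", date(2019, 1, 1), False),  # outside the period
]

fecds = acquire_fecd_collection(population, "battery", date(2020, 3, 1), date(2020, 3, 31))
analysis_fecds, non_analysis_fecds = split_fecds(fecds)
```

In this sketch the analysis population is a single list that already includes the analysis object documents, which mirrors the abstract's wording that the population includes the analysis object document collection.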
19 Citations
6 Claims
1. A document analysis apparatus comprising a computer with a central processing unit (CPU):
the CPU being configured to function as a document collection acquisition unit which accepts an analysis object document to be an analysis object as a first document collection, and furthermore, accepts as an input a feature expression appearing during an attention period specified in advance in said first document collection, and for every said feature expression, acquires a collection of documents which have been issued, generated or updated during said attention period and in which said acquired feature expression has appeared, as a second document collection from among document collections including said first document collection;
the CPU being configured to function as a context determination unit which, for every said feature expression, specifies a document corresponding to said analysis object document as a first feature expression containing document, among documents of said second document collection in which the feature expression has appeared, and furthermore, specifies a context which is common in two or more said first feature expression containing documents as the context of the feature expression, among contexts in which the feature expression has appeared in said first feature expression containing document;
the CPU being configured to function as a context comparison determination unit which, for every said feature expression, specifies a document which does not correspond to said analysis object document as a second feature expression containing document, among documents of said second document collection in which the feature expression has appeared, and furthermore, performs comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context which said CPU functioning as the context determination unit has specified; and
the CPU being configured to function as a feature degree setting unit which, based on a result of comparison by said CPU functioning as the context comparison determination unit, gives a feature degree to said feature expression, or corrects a feature degree in the case where a feature degree has been given to said feature expression in advance,
wherein said CPU functioning as the context determination unit, after specifying said first feature expression containing document, determines, for every said feature expression, whether a relation between the number of said first feature expression containing documents and the number of documents in which the feature expression has appeared within said second document collection fulfills a setting condition, and specifies said context in the case where said setting condition is not fulfilled, and
wherein said CPU functioning as the context comparison determination unit performs a comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context which said CPU functioning as the context determination unit has specified, with respect to each said feature expression for which said context has been specified.
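The context determination in claim 1 (check a setting condition on the document counts, then specify a context common to two or more first feature expression containing documents) might be sketched as follows. The claim does not define the setting condition or how a context is represented, so this sketch assumes both: a context is taken to be the set of words within a small window around a single-word expression, and the condition is a hypothetical ratio threshold. The function names and the `window` and `threshold` parameters are illustrative only.

```python
import re
from collections import Counter


def context_words(text, expression, window=3):
    """Words within `window` tokens of each occurrence of a single-word expression."""
    tokens = re.findall(r"\w+", text.lower())
    target = expression.lower()
    found = set()
    for i, tok in enumerate(tokens):
        if tok == target:
            found.update(tokens[max(0, i - window):i])
            found.update(tokens[i + 1:i + 1 + window])
    return found


def setting_condition_fulfilled(n_first, n_total, threshold=0.8):
    """Assumed condition: the share of first FECDs among all FECDs reaches a threshold."""
    return n_total > 0 and n_first / n_total >= threshold


def common_context(first_fecd_texts, expression, window=3):
    """Context words that occur around the expression in two or more first FECDs."""
    counts = Counter()
    for text in first_fecd_texts:
        # Each document contributes a word at most once, since context_words returns a set.
        counts.update(context_words(text, expression, window))
    return {word for word, n in counts.items() if n >= 2}


def determine_context(first_fecd_texts, n_total, expression):
    """Specify the context only when the setting condition is NOT fulfilled."""
    if setting_condition_fulfilled(len(first_fecd_texts), n_total):
        return None
    return common_context(first_fecd_texts, expression)


texts = [
    "the battery overheats during charging",
    "battery overheats when charging fast",
    "new battery ships today",
]
specified = determine_context(texts, n_total=10, expression="battery")
skipped = determine_context(texts, n_total=3, expression="battery")
```

Under the assumed threshold, an expression whose occurrences fall almost entirely inside the analysis object documents skips context specification, and per the claim's final clause no comparison would then be performed for it.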
3. A document analysis method, comprising the steps of:
(a) accepting an analysis object document to be an analysis object as a first document collection, and furthermore, accepting as an input a feature expression which has appeared during an attention period specified in advance in said first document collection;
(b) acquiring, as a second document collection, a collection of documents which have been issued, generated or updated during said attention period and in which said acquired feature expression has appeared, from among document collections including said first document collection for every said feature expression;
(c) specifying, for every said feature expression, a document corresponding to said analysis object document as a first feature expression containing document among documents of said second document collection in which the feature expression has appeared, and furthermore, specifying a context which is common in two or more said first feature expression containing documents as the context of the feature expression, among contexts in which the feature expression has appeared in said first feature expression containing document;
(d) specifying, for every said feature expression, a document which does not correspond to said analysis object document as a second feature expression containing document, among documents of said second document collection in which the feature expression has appeared, and furthermore, performing comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context specified in said Step (c); and
(e) based on a result of a comparison by said Step (d), giving a feature degree to said feature expression acquired by said Step (a) or correcting a feature degree in the case where the feature degree has been given to said feature expression in advance in said Step (a),
wherein in said Step (c), after said first feature expression containing document is specified, for every said feature expression, it is determined whether a relation between the number of said first feature expression containing documents and the number of documents in which the feature expression has appeared within said second document collection fulfils a setting condition, and in the case where said setting condition is not fulfilled, specifying of said context is performed, and
in said Step (d), with respect to each said feature expression for which said context has been specified, comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context specified in said Step (c) is performed.
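Steps (d) and (e) of the method can be illustrated with a short sketch. The claim states only that a comparison is performed and that a feature degree is given or corrected based on its result; it does not fix the comparison measure or the direction of the adjustment. This sketch therefore assumes that contexts are sets of words, that two contexts "match" when they share at least one word, and that the feature degree is lowered in proportion to the share of second feature expression containing documents whose context matches. All function names and the `min_overlap` and `prior_degree` parameters are hypothetical.

```python
def shares_context(doc_context, specified_context, min_overlap=1):
    """Assumed comparison: contexts match if they share at least `min_overlap` words."""
    return len(doc_context & specified_context) >= min_overlap


def compare_contexts(second_fecd_contexts, specified_context):
    """Step (d): fraction of second FECDs whose context matches the specified context."""
    if not second_fecd_contexts:
        return 0.0
    hits = sum(shares_context(c, specified_context) for c in second_fecd_contexts)
    return hits / len(second_fecd_contexts)


def set_feature_degree(shared_fraction, prior_degree=None):
    """Step (e): give a feature degree, or correct one that was given in advance.

    Assumed rule: the more often the same context also appears outside the
    analysis object documents, the lower the resulting feature degree.
    """
    if prior_degree is None:
        return 1.0 - shared_fraction            # give a new feature degree
    return prior_degree * (1.0 - shared_fraction)  # correct an existing one


specified_context = {"overheats", "charging"}
second_contexts = [
    {"overheats", "recall"},
    {"price", "sale"},
    {"charging", "cable"},
    {"review"},
]

fraction = compare_contexts(second_contexts, specified_context)
given = set_feature_degree(fraction)
corrected = set_feature_degree(fraction, prior_degree=0.8)
```

A different comparison, such as a set-similarity score with a threshold, could be substituted in `shares_context` without changing the overall flow of the two steps.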
5. A non-transient computer-readable recording medium in which a program including instructions is recorded, the instructions making a computer execute the steps of:
(a) accepting an analysis object document to be an analysis object as a first document collection, and furthermore, accepting as an input a feature expression which has appeared during an attention period specified in advance in said first document collection;
(b) acquiring, as a second document collection, a collection of documents which have been issued, generated or updated during said attention period and in which said acquired feature expression has appeared, from among document collections including said first document collection for every said feature expression;
(c) specifying, for every said feature expression, a document corresponding to said analysis object document as a first feature expression containing document among documents of said second document collection in which the feature expression has appeared, and furthermore, specifying a context which is common in two or more said first feature expression containing documents as the context of the feature expression, among contexts in which the feature expression has appeared in said first feature expression containing document;
(d) specifying, for every said feature expression, a document which does not correspond to said analysis object document as a second feature expression containing document, among documents of said second document collection in which the feature expression has appeared, and furthermore, performing comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context specified in said Step (c); and
(e) based on a result of a comparison by said Step (d), giving a feature degree to said feature expression acquired by said Step (a) or correcting a feature degree in the case where the feature degree has been given to said feature expression in advance in said Step (a),
wherein in said Step (c), after said first feature expression containing document is specified, for every said feature expression, it is determined whether a relation between the number of said first feature expression containing documents and the number of documents in which the feature expression has appeared within said second document collection fulfils a setting condition, and in the case where said setting condition is not fulfilled, specifying of said context is performed, and
in said Step (d), with respect to each said feature expression for which said context has been specified, comparison between a context in which the feature expression has appeared in said second feature expression containing document and a context specified in said Step (c) is performed.
Specification