Apparatus and method for presenting document data
Abstract
A word is extracted from dated document data, and the number of occurrences of the word is obtained by adding up the occurrences of the extracted word for each field and period. Furthermore, a word with a large number of occurrences in each field and period is extracted as a characteristic word. When a user specifies a field and a period, characteristic words in the document data in the specified period are displayed. When the user selects a specific characteristic word, the header, etc. of the document data containing the characteristic word is displayed.
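The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the sample records, the monthly period buckets, whitespace tokenization, the `threshold` parameter, and all function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample records: (header, field, date, text).
DOCS = [
    ("Chip prices fall", "semiconductors", date(1999, 1, 12),
     "memory chip prices fall as chip supply grows"),
    ("New fab announced", "semiconductors", date(1999, 1, 20),
     "new chip fab planned"),
    ("Bank merger", "finance", date(1999, 1, 15),
     "bank merger talks continue"),
]


def period_of(d):
    """Bucket a date into a (year, month) period (monthly buckets are assumed)."""
    return (d.year, d.month)


def count_words(docs):
    """Add up word occurrences for each (field, period) pair."""
    counts = defaultdict(Counter)
    for _header, field, d, text in docs:
        counts[(field, period_of(d))].update(text.lower().split())
    return counts


def characteristic_words(counts, field, period, threshold):
    """Return words whose count in the field and period exceeds the threshold."""
    bucket = counts.get((field, period), Counter())
    return sorted(w for w, n in bucket.items() if n > threshold)


def headers_containing(docs, field, period, word):
    """Return headers of documents in the field and period that contain the word."""
    return [
        header
        for header, f, d, text in docs
        if f == field and period_of(d) == period and word in text.lower().split()
    ]


counts = count_words(DOCS)
words = characteristic_words(counts, "semiconductors", (1999, 1), threshold=2)
headers = headers_containing(DOCS, "semiconductors", (1999, 1), "chip")
```

With this sample data, "chip" occurs three times among the January 1999 semiconductor documents, so it is the only word returned for a threshold of 2, and selecting it yields the headers of both semiconductor documents.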
23 Claims
-
1. A document data presentation apparatus, comprising:
-
a document feature information extraction unit calculating a number of occurrences of each of some words in document data and extracting and synthesizing document feature information composed of words and the number of occurrences of each word;
a classification unit classifying document data into respective fields by comparing a preliminarily calculated field property vector of each of a plurality of fields and a text property vector corresponding to the document feature information; and
a document information presentation unit extracting document data whose generation date is in a predetermined period in a same field or which contains a date in the predetermined period in a same field and displaying a number of extracted document data as a time series graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when a user designates an arbitrary period of a graph. - View Dependent Claims (2, 3)
-
-
4. A document data presentation apparatus, comprising:
-
a document feature information extraction unit calculating a number of occurrences of some words in document data and extracting and synthesizing document feature information composed of the words and the number of occurrences of each word;
a field classification unit calculating similarity between a text property vector corresponding to document feature information extracted from the document feature information extraction unit and a preliminarily calculated field property vector corresponding to document data whose field is specified and classifying document data whose similarity exceeds a predetermined value into the field; and
a document information presentation unit extracting document data whose generation is in a predetermined period or which contains a date in the predetermined period in a same field and displaying a number of extracted document data as a time series trend graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when a user designates an arbitrary period of the graph. - View Dependent Claims (5, 6, 7, 8, 9, 10, 11, 12, 13)
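The similarity-based classification recited in claim 4 can be sketched as follows. The claim does not name a similarity measure; cosine similarity over word-count vectors is an assumption here, as are the function names, the sample field vectors, and the threshold value.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text_vector, field_vectors, threshold):
    """Return every field whose similarity to the text vector exceeds the threshold."""
    return sorted(
        field
        for field, field_vector in field_vectors.items()
        if cosine(text_vector, field_vector) > threshold
    )


# Hypothetical field property vectors, calculated in advance from labeled documents.
FIELD_VECTORS = {
    "semiconductors": Counter(chip=3, fab=1),
    "finance": Counter(bank=2, merger=1),
}

# Text property vector built from the word counts of one document.
text_vector = Counter("new chip fab planned".split())
fields = classify(text_vector, FIELD_VECTORS, threshold=0.5)
```

Because the function returns every field above the threshold, a document may be assigned to more than one field or to none; whether the apparatus behaves this way is not stated in the claim.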
-
-
14. A document data presentation apparatus, comprising:
-
a user document designation history information storage unit storing user feature information of a user as associated with document identification information indicating document data referenced by the user;
a classification unit classifying the document data into respective fields by comparing a preliminarily calculated field property vector for each of a plurality of fields and a text property vector corresponding to the user feature information of the user that has referred to the document data;
a document feature information extraction unit extracting and synthesizing document feature information composed of some words and a number of occurrences of each word from among a set of document data whose generation date is in a predetermined period in a same field or which contains a date in the predetermined period in a same field; and
a document information presentation unit displaying a number of document data in a predetermined period as a time series trend graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when a user designates an arbitrary period of the graph.
-
-
15. A document data presentation system in which a document data presentation apparatus is connected to a display device for displaying document data through communications circuit, wherein:
said document data presentation apparatus comprises:
a document feature information extraction unit calculating a number of occurrences of each of some words in document data and extracting and synthesizing document feature information composed of the words and the number of occurrences of each word;
a classification unit classifying document data into respective fields by comparing a preliminarily calculated field property vector of each of a plurality of fields and a text property vector corresponding to the document feature data; and
a transmission unit extracting document data whose generation date is in a predetermined period in a same field or which contains a date in the predetermined period in a same field and displaying extracted document data as a time series trend graph, and simultaneously transmitting document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when a user designates an arbitrary period of the graph; and
said display device comprises a reception unit receiving information from the said document data presentation apparatus; and
a display control unit displaying a number of extracted document data in a predetermined period in a same field as a time series trend graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when an arbitrary period of the graph is designated. - View Dependent Claims (18)
-
16. A document data presenting method comprising:
-
calculating a number of occurrences of each of some words in document data and extracting and synthesizing document feature information composed of the words and the number of occurrences of each word;
classifying document data into respective fields by comparing a preliminarily calculated field property vector of each of a plurality of fields and a text property vector corresponding to the document feature information; and
extracting document data whose generation date is in a predetermined period in a same field or which contains a date in the predetermined period in a same field and displaying a number of extracted document data as a time series trend graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period when a user designates an arbitrary period of the graph. - View Dependent Claims (17, 19, 20)
-
-
21. Computer-readable storage medium storing a program for performing a process, said process comprising:
-
calculating a number of occurrences of each of some words in document data and extracting and synthesizing document feature information composed of the words and the number of occurrences of each word;
classifying document data into respective fields by comparing a preliminarily calculated field property vector of each of a plurality of fields and a text property vector corresponding to the document feature information; and
extracting document data whose generation date is in a predetermined period in a same field or which contains a date in the predetermined period in a same field and displaying a number of extracted document data as a time series trend graph, and simultaneously displaying document feature information whose number of occurrences of each word in a designated period exceeds a predetermined value when a user designates an arbitrary period of the graph. - View Dependent Claims (22, 23)
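The extraction and display steps shared by the independent claims can be sketched as below. This is an illustrative reading, not the claimed implementation: the record layout (`created`, `mentioned`), the inclusive date ranges, and the function names are all assumptions, and the graph is represented only by the list of per-period counts that would be plotted.

```python
from collections import Counter
from datetime import date

# Hypothetical records: generation date plus any dates mentioned in the text.
DOCS = [
    {"field": "semiconductors", "created": date(1999, 1, 12), "mentioned": [],
     "text": "chip prices fall as chip supply grows"},
    {"field": "semiconductors", "created": date(1999, 2, 3),
     "mentioned": [date(1999, 1, 25)], "text": "chip fab opened"},
    {"field": "finance", "created": date(1999, 1, 15), "mentioned": [],
     "text": "bank merger"},
]


def select_documents(docs, field, start, end):
    """Documents in the field that were generated in, or mention a date in, the period."""
    return [
        doc for doc in docs
        if doc["field"] == field
        and (start <= doc["created"] <= end
             or any(start <= m <= end for m in doc["mentioned"]))
    ]


def time_series(docs, field, periods):
    """Number of extracted documents per period, i.e. the values to plot."""
    return [len(select_documents(docs, field, s, e)) for s, e in periods]


def frequent_words(docs, field, start, end, threshold):
    """Word counts exceeding the threshold among documents of the designated period."""
    counts = Counter()
    for doc in select_documents(docs, field, start, end):
        counts.update(doc["text"].lower().split())
    return {w: n for w, n in counts.items() if n > threshold}


PERIODS = [
    (date(1999, 1, 1), date(1999, 1, 31)),
    (date(1999, 2, 1), date(1999, 2, 28)),
]
series = time_series(DOCS, "semiconductors", PERIODS)
january_words = frequent_words(DOCS, "semiconductors", *PERIODS[0], threshold=2)
```

In this sample, the second semiconductor document is counted in both periods: in February by its generation date and in January because it mentions a January date.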
-
Specification