METHODS AND SYSTEMS FOR ANALYZING READING LOGS AND DOCUMENTS THEREOF
First Claim
1. A method for analyzing reading log and documents corresponding thereto, comprising:
acquiring a reading log and documents corresponding thereto, wherein the reading log at least includes reading-related information about the documents within a predetermined period of time;
selecting a plurality of interesting document sets from the documents in each time segment of the predetermined period of time according to the reading log, each of the interesting document sets corresponding to one of the time segments of the predetermined period of time;
performing a document content pre-processing on the interesting document sets to determine keyword sets corresponding to the interesting document sets;
performing a cluster calculation on the keyword sets to obtain topics and calculating cohesion of each topic;
deleting topics with insufficient cohesion among the topics obtained to obtain a plurality of high-relevance topics and classifying each high-relevance topic into one of a plurality of predetermined topic classes by comparing the respective keyword sets of the high-relevance topics with a plurality of keyword sets of the predetermined topic classes;
obtaining reading statistics for each predetermined topic class and calculating a plurality of degrees of interest for each predetermined topic class during each time segment; and
analyzing a reading trend on each predetermined topic class according to changes in the degrees of interest.
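The cohesion filtering and topic classification steps recited above can be illustrated with a minimal sketch. The claim does not fix a cohesion measure or a comparison rule, so everything here is an assumption made for illustration: cohesion is taken to be the mean pairwise Jaccard similarity of the keyword sets grouped into a topic, and a topic is assigned to the predetermined class whose keyword set it overlaps most. The names `jaccard`, `cohesion`, `keep_high_relevance`, `classify`, and the threshold value are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (an assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cohesion(member_keyword_sets):
    """Assumed cohesion: mean pairwise Jaccard similarity among a topic's members."""
    pairs = list(combinations(member_keyword_sets, 2))
    if not pairs:
        return 0.0  # a single-member topic gives no pairwise evidence of cohesion
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def keep_high_relevance(topics, min_cohesion):
    """Drop topics whose cohesion falls below the (hypothetical) threshold."""
    return {name: members for name, members in topics.items()
            if cohesion(members) >= min_cohesion}


def classify(topic_keywords, class_keywords):
    """Pick the predetermined class whose keyword set overlaps the topic's most."""
    return max(class_keywords,
               key=lambda c: jaccard(topic_keywords, class_keywords[c]))


# Illustrative data only: each topic is a list of per-document keyword sets.
topics = {
    "t1": [{"stock", "market", "fund"}, {"stock", "fund", "bond"}],
    "t2": [{"soccer", "goal"}, {"recipe", "oven"}],
}
classes = {
    "finance": {"stock", "bond", "bank"},
    "sports": {"soccer", "goal", "league"},
}

kept = keep_high_relevance(topics, min_cohesion=0.3)
for name, members in kept.items():
    merged = set().union(*members)  # the topic's combined keyword set
    print(name, "->", classify(merged, classes))
```

Other cohesion measures (for example, average distance to a cluster centroid) would fit the claim language equally well; Jaccard is used here only because it operates directly on keyword sets.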
Abstract
Methods for analyzing a reading log and documents corresponding thereto are provided. A method includes: acquiring a reading log and documents corresponding thereto, wherein the reading log at least includes reading-related information about the documents within a predetermined period of time; selecting interesting document sets from the documents in each time segment according to the reading log; performing document content pre-processing on the interesting document sets to determine, for each time segment, the keyword sets corresponding thereto; performing a cluster calculation on the keyword sets to obtain topics and calculating the cohesion of each topic; deleting topics with insufficient cohesion to obtain multiple high-relevance topics and classifying each high-relevance topic into one of a set of predetermined topic classes according to the respective keyword sets of the high-relevance topics; and obtaining reading statistics for each topic class and calculating multiple degrees of interest for each topic class during each time segment.
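As a rough illustration of the degree-of-interest calculation, and of the trend analysis that the claims derive from changes in those degrees, the following sketch assumes that the reading statistic is a per-class read count and that the degree of interest is that count normalized by the total reads in the same time segment. Neither choice is stated in the text above; the function names `degrees_of_interest` and `trend` and the three-way trend label are hypothetical.

```python
def degrees_of_interest(read_counts):
    """read_counts: one dict per time segment mapping topic class -> number of reads.

    Returns, per segment, each class's share of that segment's reads
    (an assumed normalization).
    """
    result = []
    for segment in read_counts:
        total = sum(segment.values())
        result.append({c: (n / total if total else 0.0)
                       for c, n in segment.items()})
    return result


def trend(degrees, topic_class):
    """Label the change in a class's degree of interest from first to last segment."""
    series = [segment.get(topic_class, 0.0) for segment in degrees]
    change = series[-1] - series[0]
    if change > 0:
        return "rising"
    if change < 0:
        return "falling"
    return "flat"


# Illustrative read counts for three consecutive time segments.
counts = [
    {"finance": 2, "sports": 8},
    {"finance": 5, "sports": 5},
    {"finance": 8, "sports": 2},
]
doi = degrees_of_interest(counts)
for topic_class in ("finance", "sports"):
    print(topic_class, trend(doi, topic_class))
```

Normalizing within each segment keeps a class's degree of interest comparable across segments with different overall reading volume, which is one plausible reason to normalize before comparing segments.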
18 Claims
1. A method for analyzing reading log and documents corresponding thereto, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system for analyzing reading log and documents corresponding thereto, comprising:
a reading log extractor, acquiring a reading log and documents corresponding thereto, wherein the reading log at least includes reading-related information about the documents within a predetermined period of time;
an interesting document filter coupled to the reading log extractor, selecting a plurality of interesting document sets from the documents in each time segment of the predetermined period of time according to the reading log, each of the interesting document sets corresponding to one of the time segments of the predetermined period of time;
a document pre-processor coupled to the interesting document filter, performing a document content pre-processing on the interesting document sets to determine keyword sets corresponding to the interesting document sets;
a topic cluster generator coupled to the document pre-processor, performing a cluster calculation on the keyword sets to obtain topics, calculating cohesion of each topic and deleting topics with insufficient cohesion among the topics obtained to obtain a plurality of high-relevance topics;
a topic classifier and combiner coupled to the topic cluster generator, classifying each high-relevance topic into one of a plurality of predetermined topic classes by comparing the respective keyword sets of the high-relevance topics with a plurality of keyword sets of the predetermined topic classes;
a degree of interest normalizer coupled to the topic classifier and combiner, obtaining reading statistics for each predetermined topic class and calculating a plurality of degrees of interest for each predetermined topic class during each time segment; and
a reading trend analyzer coupled to the degree of interest normalizer, analyzing a reading trend on each predetermined topic class according to changes in the degrees of interest.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
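The first stages of the component chain recited above, the interesting document filter and the document pre-processor, might be sketched as follows. The claim does not say how a document qualifies as interesting or how keywords are determined, so this sketch assumes a minimum read count per time segment and a simple frequency-based keyword pick after stop-word removal. The log format of `(segment, doc_id)` pairs, the stop-word list, and the names `interesting_documents` and `keyword_set` are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Illustrative stop-word list only; a real pre-processor would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def interesting_documents(reading_log, min_reads):
    """Select, per time segment, the documents read at least min_reads times.

    reading_log: iterable of (segment, doc_id) read events (assumed format).
    """
    counts = defaultdict(Counter)
    for segment, doc_id in reading_log:
        counts[segment][doc_id] += 1
    return {segment: {doc for doc, n in per_doc.items() if n >= min_reads}
            for segment, per_doc in counts.items()}


def keyword_set(text, top_k):
    """Return up to top_k most frequent non-stop-words of a document as its keyword set."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    return {word for word, _ in Counter(words).most_common(top_k)}


# Illustrative read events: document d1 dominates segment 0, d2 dominates segment 1.
log = [(0, "d1"), (0, "d1"), (0, "d2"), (1, "d2"), (1, "d2")]
print(interesting_documents(log, min_reads=2))
print(keyword_set("The stock market and the bond market", top_k=2))
```

The output of `keyword_set` for each selected document would then feed the topic cluster generator, mirroring the "coupled to" ordering of the components in the claim.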
Specification