Method and system for log file analysis based on distributed computing network
Abstract
The present invention discloses a method and a system for log file analysis based on a distributed computing network. The method includes: storing user identifiers and related log information into a log file; dividing the log file into target files, each including the log information having the same user identifier; separately analyzing the target files using at least two nodes to obtain analysis results; and combining the analysis results of the nodes. The method thereby establishes relationships among various log files through user identifiers, and further analyzes the relationships among the user's accesses to various contents of a website.
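The flow summarized in the abstract (divide by user identifier, analyze on several nodes, combine) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `(user_id, url)` record layout, the visit-count analysis, the function names, and the use of worker threads as stand-ins for network nodes. The patent does not prescribe any of them.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical record layout: (user_id, url). The source does not fix a format.
LOG = [
    ("u1", "/home"),
    ("u2", "/home"),
    ("u1", "/cart"),
    ("u2", "/search"),
    ("u1", "/home"),
]

def divide_by_user(log):
    """Group records so each target 'file' holds one user's log information."""
    targets = defaultdict(list)
    for user_id, url in log:
        targets[user_id].append(url)
    return dict(targets)

def analyze_target(item):
    """Assumed per-node analysis: count page visits for a single user."""
    user_id, urls = item
    return user_id, Counter(urls)

def run_pipeline(log, node_count=2):
    targets = divide_by_user(log)
    # Worker threads stand in for the nodes of the distributed network.
    with ThreadPoolExecutor(max_workers=node_count) as nodes:
        results = list(nodes.map(analyze_target, targets.items()))
    # Combine the per-node results into one report keyed by user identifier.
    return dict(results)

report = run_pipeline(LOG)
```

Because each target file holds only one user's records, the per-user analyses are independent of one another, which is what allows them to run on separate nodes before the results are combined.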
15 Claims
1. A method for log file analysis based on a distributed computing network system, characterized in that the method comprises:
storing, in a log file, a plurality of user identifiers and log information associated with the plurality of user identifiers;
dividing the log file into a plurality of target files such that each of the plurality of target files includes log information associated with a user identifier of the plurality of user identifiers;
prior to dividing the log file into identifier files, filtering out information that is unrelated to log analysis from the log file and ordering it according to the time of the creation of the log information;
separately analyzing the plurality of target files to obtain analysis results using a plurality of nodes, the distributed computer network system including the plurality of nodes and a log analysis server; and
combining analysis results of the plurality of nodes,
wherein dividing the log file into a plurality of target files comprises:
downloading the log file into the log analysis server of the distributed computing network;
sending the log file to the plurality of nodes by the log analysis server;
dividing by each node of the plurality of nodes the log file into identifier files according to the plurality of user identifiers;
placing the log information having a same user identifier into a same identifier file;
sending the identifier files to the log analysis server;
collecting by the log analysis server the identifier files sent from the plurality of nodes; and
combining identifier files having the same user identifier into a single file to form a corresponding target file.
Dependent claims: 2, 3, 4, 5, 6
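The dividing steps recited in claim 1 (filter and order, send to nodes, split into identifier files, collect and merge) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `(timestamp, user_id, message)` layout, the rule that a record without a user identifier is unrelated to analysis, the contiguous-slice distribution policy, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical record layout: (timestamp, user_id, message).
RAW_LOG = [
    (3, "u1", "GET /cart"),
    (1, "u2", "GET /home"),
    (2, "u1", "GET /home"),
    (4, None, "heartbeat"),  # assumed to be unrelated to log analysis
    (5, "u2", "GET /search"),
]

def preprocess(log):
    """Filter out records unrelated to analysis, then order by creation time."""
    relevant = [rec for rec in log if rec[1] is not None]
    return sorted(relevant, key=lambda rec: rec[0])

def split_for_nodes(log, node_count):
    """Server side: give each node a contiguous slice (assumed policy)."""
    size = max(1, -(-len(log) // node_count))  # ceiling division, at least 1
    return [log[i:i + size] for i in range(0, len(log), size)]

def node_divide(chunk):
    """Node side: put records with the same user identifier in one identifier file."""
    identifier_files = defaultdict(list)
    for rec in chunk:
        identifier_files[rec[1]].append(rec)
    return dict(identifier_files)

def server_merge(per_node_files):
    """Server side: merge identifier files that share a user identifier."""
    targets = defaultdict(list)
    for files in per_node_files:
        for user_id, records in files.items():
            targets[user_id].extend(records)
    return dict(targets)

clean = preprocess(RAW_LOG)
chunks = split_for_nodes(clean, node_count=2)
target_files = server_merge(node_divide(chunk) for chunk in chunks)
```

In this sketch the log is sorted before it is sliced, and the slices are merged in node order, so each resulting target file keeps its records in creation-time order without a second sort.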
7. A system comprising:
one or more processors; and
memory storing a log analysis unit executable by the one or more processors to:
divide a log file into a plurality of target files such that each of the plurality of target files includes log information associated with a user identifier of a plurality of user identifiers;
prior to dividing the log file into identifier files, filter out information that is unrelated to log analysis from the log file and order it according to the time of the creation of the log information;
transmit multiple log files to a plurality of nodes of a computing network based on content of each log file of the multiple log files, the multiple log files including log information associated with the plurality of user identifiers,
combine divided log files received from the plurality of nodes to obtain multiple target files each associated with a user identifier of the plurality of user identifiers,
send the multiple target files to the plurality of nodes, and
combine analysis results received from the plurality of nodes to obtain an analysis result for the multiple log files; and
the plurality of nodes configured to:
divide each of the multiple log files received from the log analysis unit to obtain the divided log files based on the plurality of user identifiers, placing the log information having a same user identifier into a same identifier file;
send the divided log files to the log analysis unit,
analyze the multiple target files to obtain the analysis results, and
send the analysis results to the log analysis unit.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15
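Claim 7 describes two round trips between the log analysis unit and the nodes: one to divide the log files and one to analyze the merged target files. The sketch below models that exchange with two small Python classes. The class and method names, the `(user_id, url)` record layout, the visit-count analysis, and the routing rule in `pick_node` are all assumptions for illustration; the claim states only that routing is based on the content of each log file.

```python
from collections import Counter, defaultdict

class Node:
    """A worker node: divides log files by user and analyzes target files."""

    def divide(self, log_file):
        divided = defaultdict(list)
        for user_id, url in log_file:
            divided[user_id].append(url)
        return dict(divided)

    def analyze(self, target_file):
        # Assumed analysis: per-URL visit counts for one user.
        return Counter(target_file)

class LogAnalysisUnit:
    """Coordinator: routes files, merges divided files, combines results."""

    def __init__(self, nodes):
        self.nodes = nodes

    def pick_node(self, records):
        # Assumed content-based routing rule: choose a node by record count.
        return self.nodes[len(records) % len(self.nodes)]

    def run(self, log_files):
        # Round 1: nodes divide each log file; the unit merges by user identifier.
        targets = defaultdict(list)
        for log_file in log_files:
            divided = self.pick_node(log_file).divide(log_file)
            for user_id, urls in divided.items():
                targets[user_id].extend(urls)
        # Round 2: nodes analyze target files; the unit combines the results.
        return {user_id: self.pick_node(urls).analyze(urls)
                for user_id, urls in targets.items()}

unit = LogAnalysisUnit([Node(), Node()])
result = unit.run([
    [("u1", "/home"), ("u2", "/home")],
    [("u1", "/cart"), ("u1", "/home")],
])
```

Here the nodes are in-process objects called directly; in a real deployment each call to `divide` or `analyze` would be a network request, and the merge step in the unit would run only after all nodes had replied.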
Specification