System and methods for cleansing automated robotic traffic from sets of usage logs
Abstract
Exemplary embodiments of the present disclosure provide for cleansing data generated by one or more servers in response to database interactions resulting from an automated software robot interacting with the one or more servers via a telecommunications network. Log entries in usage logs corresponding to events during a session can be analyzed to determine relationships between events and the usage logs can be classified based on the relationships as either corresponding to human behavior or automated software robot behavior. Usage logs corresponding to automated software robot behavior can be removed from further analysis.
19 Claims
1. A method of cleansing data generated by one or more servers in response to database interactions resulting from an automated software robot interacting with the one or more servers via a network, the method comprising:
receiving a first set of requests for one or more sets of data;
retrieving metadata from a metadata database based on the first set of requests;
embedding a link in the metadata to at least one full record associated with the metadata, the at least one full record being stored in a source database;
receiving a second set of requests in response to actuation of the link embedded in the metadata;
in response to receiving the second set of requests, retrieving from the source database, the at least one full record associated with the metadata;
capturing, in a plurality of usage logs, each data structure executed in response to processing the first and second sets of requests;
retrieving the plurality of usage logs from a non-transitory computer-readable medium, each of the plurality of usage logs including log entries corresponding to events that occurred during a session between a user device and the one or more servers, wherein the events include the second set of requests;
processing the log entries in each of the plurality of usage logs in response to execution of a log analyzer to determine a relationship between the events that occurred during each session;
executing the log analyzer to classify the plurality of usage logs based on the relationship as either corresponding to human behavior or automated software robot behavior; and
in response to classification of one or more of the plurality of usage logs as corresponding to the automated software robot behavior, excluding the one or more of the plurality of usage logs from generation of a metric,
wherein processing the log entries to determine the relationship includes measuring an intentionality associated with the events corresponding to the log entries based on determining, from the log entries, a quantity of search requests that were submitted during the session which did not result in a payoff event,
wherein the metric is generated based on one or more of the plurality of usage logs that are classified as corresponding to human behavior, and
wherein the one or more servers utilize the metric to adjust subsequent discovery of the metadata in the metadata database in response to search queries from each user device.
Dependent claims: 2, 3, 4, 5, 9, 10, 11, 12, 13.
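The classification and exclusion steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented log analyzer: the event labels `search` and `payoff`, the `Session` type, the rule that a search counts as unpaid when no payoff follows it before the next search, the threshold `max_unpaid`, and the `payoff_rate` metric are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List

SEARCH = "search"   # hypothetical label for a search request event
PAYOFF = "payoff"   # hypothetical label for a payoff event (e.g. opening a full record)


@dataclass
class Session:
    session_id: str
    events: List[str]  # event labels in the order they were logged


def unpaid_searches(events: List[str]) -> int:
    """Count searches with no payoff before the next search or session end."""
    count = 0
    pending = False  # True while the latest search has not yet paid off
    for event in events:
        if event == SEARCH:
            if pending:
                count += 1
            pending = True
        elif event == PAYOFF:
            pending = False
    if pending:
        count += 1
    return count


def classify(session: Session, max_unpaid: int = 5) -> str:
    """Label a session 'robot' when it exceeds the assumed unpaid-search threshold."""
    return "robot" if unpaid_searches(session.events) > max_unpaid else "human"


def human_sessions(sessions: List[Session], max_unpaid: int = 5) -> List[Session]:
    """Exclude sessions classified as robot before any metric is computed."""
    return [s for s in sessions if classify(s, max_unpaid) == "human"]


def payoff_rate(sessions: List[Session]) -> float:
    """Illustrative metric: payoff events per search request."""
    searches = sum(s.events.count(SEARCH) for s in sessions)
    payoffs = sum(s.events.count(PAYOFF) for s in sessions)
    return payoffs / searches if searches else 0.0
```

Computing `payoff_rate(human_sessions(all_sessions))` rather than `payoff_rate(all_sessions)` keeps a crawler that issues many searches without opening a record from diluting the metric.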
6. A method of cleansing data generated by one or more servers in response to database interactions resulting from an automated software robot interacting with the one or more servers via a communications network, the method comprising:
receiving a first set of requests for one or more sets of data;
retrieving metadata from a metadata database based on the first set of requests;
embedding a link in the metadata to at least one full record associated with the metadata, the at least one full record being stored in a source database;
receiving a second set of requests in response to actuation of the link embedded in the metadata;
in response to receiving the second set of requests, retrieving from the source database the at least one full record associated with the metadata;
capturing, in a plurality of usage logs, each data structure executed in response to processing the first and second sets of requests;
retrieving a plurality of usage logs from a non-transitory computer-readable medium, each of the plurality of usage logs including log entries corresponding to events that occurred during a session between a user device and the one or more servers, wherein the events include the second set of requests;
processing the log entries in each of the plurality of usage logs in response to execution of a log analyzer to determine a relationship between the events that occurred during each session;
executing the log analyzer to classify the plurality of usage logs based on the relationship as either corresponding to human behavior or automated software robot behavior; and
in response to classification of one or more of the plurality of usage logs as corresponding to the automated software robot behavior, excluding the one or more of the plurality of usage logs from generation of a metric,
wherein processing the log entries to determine the relationship includes measuring an intentionality associated with the events corresponding to the log entries based on determining, from the log entries, a quantity of search requests that were submitted during the session which did not result in a payoff event,
wherein the metric is generated based on one or more of the plurality of usage logs that are classified as corresponding to human behavior,
wherein the one or more servers utilize the metric to adjust subsequent discovery of the metadata in the metadata database in response to search queries from each user device, and
wherein determining the intentionality of at least one of the plurality of usage logs further comprises:
determining a natural log of a quantity of search requests that were submitted during the session which did not result in a payoff event; and
multiplying the natural log of the quantity of search requests by a multiplication factor.
Dependent claims: 7, 8.
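The intentionality computation recited at the end of claim 6 (a natural log of the unpaid search count, multiplied by a factor) can be written as a short Python sketch. The function name, the default factor of 1.0, and the handling of a zero count are assumptions; the claim does not give a value for the multiplication factor or say what happens when no unpaid searches occurred.

```python
import math


def intentionality(unpaid_search_count: int, factor: float = 1.0) -> float:
    """Return factor * ln(unpaid_search_count).

    `factor` stands in for the claim's multiplication factor; its value
    is not given in the claim, so 1.0 is only a placeholder default.
    """
    if unpaid_search_count < 0:
        raise ValueError("unpaid_search_count must be non-negative")
    if unpaid_search_count == 0:
        # Assumption: ln(0) is undefined, so a session with no unpaid
        # searches is given a score of 0.0 in this sketch.
        return 0.0
    return factor * math.log(unpaid_search_count)
```

Because the logarithm grows slowly, a session with 100 unpaid searches scores only twice as high as one with 10 under the same factor, so the factor controls how strongly large counts separate from small ones.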
14. A system for cleansing data generated by one or more servers in response to database interactions resulting from an automated software robot interacting with the one or more servers via a network, the system comprising:
one or more servers;
a metadata database in communication with the one or more servers, the metadata database configured to store metadata;
a source database in communication with the one or more servers, the source database configured to store full records associated with the metadata stored in the metadata database; and
a usage log database including usage logs associated with sessions between user devices and the one or more servers, the usage logs including log entries corresponding to events that occurred during sessions between user devices and the one or more servers,
wherein the one or more servers are programmed to:
receive a first set of requests;
retrieve the metadata from the metadata database based on the first set of requests;
embed a link in the metadata to at least one of the full records in the source database;
receive a second set of requests in response to actuation of the links embedded in each of the metadata;
in response to receiving the second set of requests, retrieve, from the source database, the at least one of the full records associated with the metadata;
capture, in a plurality of usage logs, each data structure executed in response to processing the first and second sets of requests, wherein the events include the second set of requests;
retrieve the usage logs from the usage log database;
process the log entries in response to execution of a log analyzer to determine a relationship between the events that occurred during the session;
execute the log analyzer to classify each of the usage logs based on the relationship as either corresponding to human behavior or automated software robot behavior; and
exclude one or more of the usage logs from generation of a metric in response to classification of the one or more usage logs as corresponding to the automated software robot behavior,
wherein process the log entries to measure the relationship includes determining an intentionality associated with the events corresponding to the log entries based on determining, from the log entries, a quantity of search requests that were submitted during the session which did not result in a payoff event,
wherein the metric is generated based on one or more of the usage logs that are classified as corresponding to human behavior, and
wherein the one or more servers utilize the metric to adjust subsequent discovery of the metadata in the metadata database in response to search queries from each user device.
Dependent claims: 15, 16, 18.
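The request flow in claim 14 (metadata lookup, link embedding, full-record retrieval, and logging of both request types) might look roughly like the following Python sketch. The in-memory dictionaries, the `/records/<id>` link format, the title-substring match, and the event labels are hypothetical stand-ins for the metadata database, source database, and usage log database. Treating full-record retrieval as the payoff event is likewise an assumption.

```python
from typing import Dict, List

# Hypothetical stand-ins for the three data stores named in the claim.
METADATA_DB: Dict[str, dict] = {"rec1": {"title": "Example title"}}
SOURCE_DB: Dict[str, str] = {"rec1": "Full text of the example record."}
USAGE_LOG: List[dict] = []


def handle_search(session_id: str, query: str) -> List[dict]:
    """First request type: return matching metadata with an embedded link."""
    results = []
    for rec_id, meta in METADATA_DB.items():
        if query.lower() in meta["title"].lower():
            # Embed a link pointing at the full record in the source database.
            results.append({**meta, "link": f"/records/{rec_id}"})
    USAGE_LOG.append({"session": session_id, "event": "search", "query": query})
    return results


def handle_link(session_id: str, link: str) -> str:
    """Second request type: actuating the link retrieves the full record."""
    rec_id = link.rsplit("/", 1)[-1]
    record = SOURCE_DB[rec_id]
    USAGE_LOG.append({"session": session_id, "event": "payoff", "record": rec_id})
    return record
```

Logging both handlers under the same session identifier is what lets a later analysis pair each search with the record retrieval, if any, that followed it.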
17. A system for cleansing data generated by one or more servers in response to database interactions resulting from an automated software robot interacting with the one or more servers via a network, the system comprising:
one or more servers;
a metadata database in communication with the one or more servers, the metadata database configured to store metadata;
a source database in communication with the one or more servers, the source database configured to store full records associated with the metadata in the metadata database; and
a usage log database including usage logs associated with sessions between user devices and the one or more servers, the usage logs including log entries corresponding to events that occurred during sessions between user devices and the one or more servers,
wherein the one or more servers are programmed to:
receive a first set of requests for one or more sets of data;
retrieve the metadata from the metadata database based on the first set of requests;
embed a link in the metadata to at least one of the full records associated with the metadata;
receive a second set of requests in response to actuation of the link embedded in the metadata;
in response to receiving the second set of requests, retrieve from the source database the at least one of the full records associated with the metadata;
capture, in a plurality of usage logs, each data structure executed in response to processing the first and second sets of requests, wherein the events include the second set of requests;
retrieve the usage logs from the usage log database;
process the log entries in response to execution of a log analyzer to determine a relationship between the events that occurred during the session;
execute the log analyzer to classify each of the usage logs based on the relationship as either corresponding to human behavior or automated software robot behavior; and
exclude one or more of the usage logs from generation of a metric in response to classification of the one or more usage logs as corresponding to the automated software robot behavior,
wherein process the log entries to measure the relationship includes determining an intentionality associated with the events corresponding to the log entries based on determining, from the log entries, a quantity of search requests that were submitted during the session which did not result in a payoff event,
wherein the metric is generated based on one or more of the usage logs that are classified as corresponding to human behavior, and
wherein the one or more servers utilize the metric to adjust subsequent discovery of the metadata in the metadata database in response to search queries from each user device,
wherein the one or more servers are programmed to execute code to evaluate the following mathematical expression:
19. A non-transitory computer-readable medium storing instructions, wherein execution of the instructions by a processing device causes the processing device to:
receive a first set of requests for one or more sets of data;
retrieve metadata from a metadata database based on the first set of requests;
embed a link in the metadata to at least one full record associated with the metadata;
receive a second set of requests in response to actuation of the links embedded in the metadata;
in response to receiving the second set of requests, retrieve, from a source database, the at least one full record;
capture each data structure executed in response to processing the first and second sets of requests in a plurality of usage logs;
retrieve the usage logs from the usage log database;
process the log entries in response to execution of a log analyzer to determine a relationship between events that occurred during the session, wherein the events include the second set of requests;
execute the log analyzer to classify each of the usage logs based on the relationship as either corresponding to human behavior or automated software robot behavior; and
exclude one or more of the usage logs from generation of a metric in response to classification of the usage log as corresponding to the automated software robot behavior,
wherein process the log entries to measure the relationship includes determining an intentionality associated with the events corresponding to the log entries based on determining, from the log entries, a quantity of search requests that were submitted during the session which did not result in a payoff event,
wherein the metric is generated based on one or more of the plurality of usage logs that are classified as corresponding to human behavior, and
wherein the one or more servers utilize the metric to adjust subsequent discovery of the metadata in the metadata database in response to search queries from each user device.
Specification