System and method for main page identification in web decoding
First Claim
1. A method for communication analysis via a decoding processor, the method comprising the steps of:
by the decoding processor, intercepting data communication packets exchanged between a user computer and one or more servers over a network during a web browsing session associated with a target user;
by the decoding processor, processing the packets so as to identify data elements viewed by the target user during the web browsing session;
by the decoding processor, for each identified data element, calculating a confidence level that the identified data element is main page;
by the decoding processor, determining identified data elements whose calculated confidence levels are below a threshold confidence level;
by the decoding processor, determining if a percentage of identified data elements whose calculated confidence levels are below the threshold confidence level exceeds a level; and
when it is determined that the percentage exceeds the level;
by the decoding processor, displaying the determined identified data elements to an operator on a display of an operator terminal in communication with the decoding processor, each of the determined identified data elements displayed as a separate web page with a request for feedback as to whether the displayed web page is a main web page;
receiving feedback from the operator via an input device of the operator terminal as to whether displayed data elements are main web pages that are displayed on their own in different pages in a web decoder or are embedded data elements that are displayed within a main web page and which are separately fetched from the main web page in which they are embedded and are not displayed on their own in different pages in the web decoder; and
by the decoding processor, determining which ones of determined identified data elements processed from packets accepted after receiving the feedback are to be displayed in the web decoder to the operator, responsive to the received feedback.
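The thresholding steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `DataElement` type, the function name, and the assumption that confidence levels lie in [0, 1] are all hypothetical, and the claim does not say how a confidence level is calculated. The sketch only shows the gating logic: operator feedback is requested only when the share of low-confidence elements exceeds the given level.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataElement:
    """Hypothetical record for one data element identified in the session."""
    url: str
    confidence: float  # assumed confidence, in [0, 1], that this is a main page


def elements_needing_feedback(
    elements: List[DataElement], threshold: float, level: float
) -> List[DataElement]:
    """Return the low-confidence elements to show the operator, or an empty list.

    threshold: confidence below which an element counts as uncertain.
    level: fraction of uncertain elements that must be exceeded before
           the operator is asked for feedback.
    """
    if not elements:
        return []
    low = [e for e in elements if e.confidence < threshold]
    if len(low) / len(elements) > level:
        return low
    return []
```

For example, with four elements of which two fall below a threshold of 0.5, the uncertain share is 0.5; feedback is requested when `level` is 0.4 but not when it is 0.5, because the claim requires the percentage to exceed the level.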
Abstract
Web pages may be rendered from a main page data element and a plurality of embedded data elements, which are separately fetched by a browser. Herein is provided a web decoder which includes a learning engine adapted to receive human indications of data elements which are unimportant and accordingly to adjust the web decoder's procedures for determining which data elements are displayed to the user. The learning engine may also receive human indications of important data elements and use both types of indications in its further determinations. Optionally, rule generalizations are performed in a manner which searches for parameters which differentiate between important and unimportant data elements. The rule generalizations optionally concentrate on groups of data elements having at least a predetermined number of parameters having the same values for both important and unimportant data elements, reducing the chances that a generalization rule will find important data elements as unimportant.
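One possible reading of the rule generalization described in the abstract is sketched below in Python. Everything in it is an assumption made for illustration: elements are modeled as plain dictionaries of parameter values, the parameter names are invented, and the function compares a single important element against a single unimportant one rather than whole groups. It returns the parameters that differ only when the two elements agree on at least `min_shared` parameters, which mirrors the idea of concentrating on otherwise similar elements.

```python
from typing import Dict, List


def differentiating_parameters(
    important: Dict[str, object],
    unimportant: Dict[str, object],
    min_shared: int,
) -> List[str]:
    """Return names of parameters whose values differ between the two elements.

    The comparison is only considered meaningful when the elements share at
    least `min_shared` identical parameter values; otherwise an empty list
    is returned.
    """
    common_keys = sorted(set(important) & set(unimportant))
    shared = [k for k in common_keys if important[k] == unimportant[k]]
    if len(shared) < min_shared:
        return []
    return [k for k in common_keys if important[k] != unimportant[k]]


# Hypothetical labeled elements; the parameter names are illustrative only.
main_page = {"content_type": "text/html", "host": "example.com", "has_referer": False}
embedded = {"content_type": "text/html", "host": "example.com", "has_referer": True}
candidates = differentiating_parameters(main_page, embedded, min_shared=2)
```

Here the two elements agree on two parameters, so the one that differs (`has_referer`) becomes a candidate for a generalized rule. Requiring more shared parameters makes the comparison stricter and yields no candidate.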
18 Claims
1. A method for communication analysis via a decoding processor, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A communication analyzer, comprising:
a network interface configured to intercept data packets exchanged between a user computer and one or more servers over a network during a web browsing session associated with a target user;
a display screen;
an input device;
a processor configured
to identify data elements viewed by the target user during the web browsing session,
to, for each identified data element, calculate a confidence level that the identified data element is main page,
to determine identified data elements whose calculated confidence levels are below a threshold confidence level,
to determine if a percentage of identified data elements whose calculated confidence levels are below the threshold confidence level exceeds a level, and when it is determined that the percentage exceeds the level;
to display each of the determined identified data elements on the screen to an operator as a separate web page with a request for feedback as to whether the displayed web page is a main web page,
to receive feedback via the input device from the operator as to whether displayed data elements are main web pages that are displayed on their own in different pages in a web decoder or are embedded data elements that are displayed within a main web page and which are separately fetched from the main web page within which they are embedded and are not displayed on their own in different pages in the web decoder, and
to adjust its configuration for identifying data elements responsive to the received feedback.
Dependent claims: 17, 18
Specification