System and method for preventing automated crawler access to web-based data sources using a dynamic data transcoding scheme
Abstract
A protection system and associated method prevent automated crawler access to a company's web-based data without impairing the ability of an interactive user, such as a consumer, to access the data and to conduct regular business transactions. In one embodiment, the protection system does not prevent the crawler from downloading data; rather, it renders the data non-extractable by the crawler. In another embodiment, the protection system prevents crawler access to the data. To this end, the protection system uses any one or a combination of the following six transcoding techniques: a transcoding technique that changes the web page structure; a transcoding technique that changes the web page content; a transcoding technique that selectively changes web page variable names; a transcoding technique that selectively converts text to images in the web page; a transcoding technique that alters form values when executed; and/or a transcoding technique that generates a substantial portion of, or the entire, web page when executed.
18 Claims
1. A system for preventing automated crawler access to data from a network-based data source, comprising:
- a transcoding proxy for automatically permutating data retrieved from the data source, to render the data uninterpretable by the crawler, while allowing a browser to render data retrieved from the data source;
wherein the transcoding proxy utilizes a transcoding technique that dynamically changes a form structure;
wherein the transcoding technique changes the form structure by shifting the position of the data in the form;
wherein the transcoding technique shifts the position of form data by inserting any one or more of:
a character, a string of characters, a space, non-textual data, a blank row, a blank column, and/or an empty table; and
wherein the transcoding technique shifts the position of form data by nesting the data in a table within a table.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
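The position-shifting idea in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `transcode_table` and `visible_text`, the 50% blank-row probability, and the single blank column are all assumptions made for illustration. The sketch inserts blank rows and a blank column at positions that vary per request and nests the data table inside an outer table, so a crawler that relies on fixed row and column offsets breaks while the visible text stays the same.

```python
import html
import random
import re


def transcode_table(rows, rng):
    """Render rows as HTML whose cell positions shift from request to request.

    Hypothetical helper: `rows` is a list of lists of cell strings and `rng`
    is a random.Random instance (assumed to be seeded per request).
    """
    width = max(len(row) for row in rows)
    # Choose where the blank column goes; index `width` means "after the last cell".
    blank_col = rng.randint(0, width)
    out_rows = []
    for row in rows:
        # Sometimes insert a blank row ahead of the data row (assumed 50% chance).
        if rng.random() < 0.5:
            out_rows.append("<tr>" + "<td></td>" * (width + 1) + "</tr>")
        cells = ["<td>%s</td>" % html.escape(cell) for cell in row]
        cells += ["<td></td>"] * (width - len(row))  # pad short rows
        cells.insert(blank_col, "<td></td>")         # shift the data sideways
        out_rows.append("<tr>" + "".join(cells) + "</tr>")
    inner = "<table>" + "".join(out_rows) + "</table>"
    # Nest the data table inside a single cell of an outer table.
    return "<table><tr><td>" + inner + "</td></tr></table>"


def visible_text(markup):
    """Return the non-empty text fragments of the markup in document order."""
    return [html.unescape(t) for t in re.split(r"<[^>]+>", markup) if t.strip()]


if __name__ == "__main__":
    rows = [["Item", "Price"], ["Widget", "9.99"]]
    print(transcode_table(rows, random.Random()))
```

The text a browser displays is unchanged by the transformation, which `visible_text` can be used to confirm; only the markup positions of the cells differ between calls.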
11. A system for preventing automated crawler access to data from a network-based data source, comprising:
- a transcoding proxy for automatically permutating data retrieved from the data source, to render the data uninterpretable by the crawler, while allowing a browser to render data retrieved from the data source;
wherein the transcoding proxy utilizes a transcoding technique that dynamically changes a form content; and
wherein the transcoding technique changes the form content by adding one or more content inserts that are substantially imperceptible to a browser user, in order to render selected terms in the page content difficult to be searched automatically by the crawler.
Dependent claims: 12, 13, 14
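As a hedged illustration of the content inserts described in claim 11, the following Python sketch splits each occurrence of a selected term with an empty HTML comment, which a browser does not display. The choice of an HTML comment as the insert, and the helper names `obscure_term` and `rendered_text`, are assumptions for this example rather than details taken from the claim. The sketch also assumes the input is plain text destined for a text node (not tag or attribute content) and that the term has at least two characters.

```python
import random
import re

# Assumed insert: an HTML comment, which browsers do not render.
INSERT = "<!-- -->"


def obscure_term(text, term, rng):
    """Break every occurrence of `term` with an invisible insert.

    Hypothetical helper: `term` must have at least two characters so the
    insert can fall strictly inside the word.
    """
    def split(match):
        word = match.group(0)
        cut = rng.randint(1, len(word) - 1)  # never at either end of the word
        return word[:cut] + INSERT + word[cut:]

    return re.sub(re.escape(term), split, text)


def rendered_text(markup):
    """Approximate what a user sees by dropping HTML comments."""
    return re.sub(r"<!--.*?-->", "", markup)


if __name__ == "__main__":
    page = "Price: 9.99 (Price includes tax)"
    print(obscure_term(page, "Price", random.Random()))
```

A naive substring search for the term in the raw markup no longer matches, while the rendered text is unchanged. Because the split point is drawn at random, the markup can differ on every request.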
15. A method for preventing automated crawler access to data from a network-based data source, comprising the steps of:
- automatically permutating data retrieved from the data source, to prevent a crawler from automatically accessing the data source by rendering the data uninterpretable by the crawler, while allowing a browser to render data retrieved from the data source; and
- utilizing a transcoding technique that dynamically changes a form structure;
wherein the transcoding technique changes the form structure by shifting the position of the data in the form; and
wherein the transcoding technique shifts the position of form data by nesting the data in a table within a table.
Dependent claims: 16
17. A system for preventing automated crawler access to data from a network-based data source, comprising:
- a transcoding proxy for automatically permutating data retrieved from the data source, to render the data uninterpretable by the crawler, while allowing a browser to render data retrieved from the data source;
wherein the transcoding proxy utilizes a transcoding technique that dynamically changes a form structure;
wherein the transcoding technique changes the form structure by shifting the position of the data in the form; and
wherein the transcoding technique shifts the position of form data by nesting the data in a table within a table.
18. A computer program product including a plurality of executable instruction codes stored on a computer readable medium, for preventing automated crawler access to data from a network-based data source, comprising:
- a first set of instruction codes for automatically permutating data retrieved from the data source, to render the data uninterpretable by the crawler, while allowing a browser to render data retrieved from the data source;
- a second set of instruction codes for dynamically changing a form structure;
wherein the second set of instruction codes changes the form structure by shifting the position of the data in the form; and
wherein the second set of instruction codes shifts the position of form data by nesting the data in a table within a table.
Specification