LINK DISCOVERY FROM WEB SCRIPTS
First Claim
1. A computer-implemented method for discovering links in a script, the method comprising:
receiving webpages associated with one or more scripts;
processing the webpages to locate the one or more scripts;
accessing rules corresponding to the one or more scripts;
parsing the one or more scripts based on the rules corresponding to the one or more scripts;
identifying segments of the one or more scripts that satisfy the rules; and
evaluating the identified segments of the one or more scripts to generate links.
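The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the scripts are inline `<script>` elements, that each rule is a regular expression capturing a URL string literal, and that "evaluating" a segment means resolving the captured URL against the page address. The names `ScriptLocator`, `RULES`, and `discover_links` are hypothetical and do not come from the claim.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptLocator(HTMLParser):
    """Collects the text of inline <script> elements found in a webpage."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the last entry.
        if self._in_script:
            self.scripts[-1] += data


# Hypothetical rules (an assumption): each regex matches a link-generating
# segment and captures the URL string literal it contains.
RULES = [
    re.compile(r"""window\.location(?:\.href)?\s*=\s*["']([^"']+)["']"""),
    re.compile(r"""window\.open\(\s*["']([^"']+)["']"""),
]


def discover_links(page_html, base_url, rules=RULES):
    """Locate scripts, find segments that satisfy a rule, and generate links."""
    locator = ScriptLocator()
    locator.feed(page_html)
    locator.close()

    links = []
    for script in locator.scripts:
        for rule in rules:
            for segment in rule.finditer(script):
                # "Evaluation" here is only resolving the literal against the page URL.
                links.append(urljoin(base_url, segment.group(1)))
    return links
```

Calling `discover_links(html, "http://example.com/shop/index.html")` on a page whose script contains `window.location.href = "/products/list.html"` would yield the absolute form of that address. A real rules engine would likely need a JavaScript parser rather than regular expressions.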
Abstract
A computer-implemented method, a computer system, and computer media for discovering links in scripts are provided. The computer system includes a crawler, a rules engine, and an index; together they store in the index the links generated by scripts located in webpages. The crawler traverses a network to locate webpages having scripts. The rules engine parses the located webpages and extracts the scripts based on rules that are satisfied by segments of the extracted scripts. The rules engine then evaluates those segments to generate links and, after validating the links, transmits them to the index for storage.
20 Claims
8. One or more computer-readable media having computer-executable instructions embodied thereon that perform a method for generating an index that stores links discovered in scripts, the method comprising:
crawling a network to locate webpages;
storing in an index metadata corresponding to each located webpage;
parsing the located webpages to identify scripts associated with the located webpages;
retrieving rules that check the identified scripts for variables, functions, or events that generate links;
evaluating the variables, functions, or events to verify the validity of the generated links; and
adding the generated links to the index when the generated links are verified.
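The last two steps of claim 8, verifying generated links and adding them to the index only when verified, might look like the sketch below. The claim does not say how validity is checked, so the check used here (an absolute `http` or `https` URL with a host) is purely an assumption, as are the names `Index`, `add_page`, `add_links`, and `is_valid_link`.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin, urlparse


def is_valid_link(url):
    """Assumed validity check: absolute http(s) URL with a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


@dataclass
class Index:
    """Toy in-memory index holding page metadata and links found in scripts."""

    metadata: dict = field(default_factory=dict)
    links: dict = field(default_factory=dict)

    def add_page(self, url, title):
        # Store metadata for each located webpage (only a title in this sketch).
        self.metadata[url] = {"title": title}
        self.links.setdefault(url, set())

    def add_links(self, page_url, candidates):
        """Add candidates that pass the validity check; return those added."""
        added = []
        for candidate in candidates:
            absolute = urljoin(page_url, candidate)
            if is_valid_link(absolute):
                self.links.setdefault(page_url, set()).add(absolute)
                added.append(absolute)
        return added
```

With this check, a candidate such as `javascript:void(0)` is rejected while `/a.html` is resolved against the page URL and stored. A production system might instead verify a link by requesting it.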
16. A computing system for discovering links in a script of a webpage, the system comprising:
a crawler accessing web pages to identify scripts associated with the webpages, wherein the webpages are HTML pages;
a rules engine that parses the identified scripts and evaluates portions of the identified script based on rules that detect link-generating segments of the identified scripts, wherein the segments of the identified scripts are evaluated based on variables and expressions specified in the identified scripts and matching function patterns located in the segment and the rules to detect links generated by the identified scripts; and
an index to store the detected links and metadata for the detected links.
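Claim 16 describes evaluating segments using the variables and expressions in the script together with matching function patterns. The sketch below shows one simplified reading of that idea: it collects string variable assignments, matches a single hypothetical function pattern (`window.open(...)`), and evaluates string concatenations in the argument. The regular expressions and the names `VAR_ASSIGN`, `OPEN_CALL`, `evaluate_concat`, and `links_from_script` are assumptions for illustration. The evaluator handles only `+` concatenation of known variables and quoted literals.

```python
import re

# Assumed patterns: simple string assignments and the argument of window.open(...).
VAR_ASSIGN = re.compile(r"""(?:var|let|const)\s+(\w+)\s*=\s*["']([^"']*)["']\s*;""")
OPEN_CALL = re.compile(r"window\.open\(([^)]*)\)")


def evaluate_concat(expression, variables):
    """Evaluate 'a + "lit" + b' expressions; return None if a term is unknown.

    Limitation: splitting on '+' would mis-handle literals that contain '+'.
    """
    parts = []
    for term in expression.split("+"):
        term = term.strip()
        if len(term) >= 2 and term[0] == term[-1] and term[0] in "\"'":
            parts.append(term[1:-1])          # quoted string literal
        elif term in variables:
            parts.append(variables[term])     # previously assigned variable
        else:
            return None                       # cannot be resolved statically
    return "".join(parts)


def links_from_script(script):
    """Return links produced by window.open calls whose arguments resolve."""
    variables = dict(VAR_ASSIGN.findall(script))
    links = []
    for call in OPEN_CALL.finditer(script):
        link = evaluate_concat(call.group(1), variables)
        if link is not None:
            links.append(link)
    return links
```

Segments whose arguments depend on values unknown at analysis time are skipped in this sketch rather than guessed.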
Specification