Methods and systems for scanning and monitoring content on a network
Abstract
The invention addresses methods and systems for scanning and monitoring network locations and resources such as websites and webpages. The method scans webpages for content and for compliance with predefined standards, and reports the results in a non-technical format. The method can also scan and monitor any website on the Internet, categorize its content, and validate its compliance with application, presentation, or content standards, or any combination of these.
203 Citations
40 Claims
1. A method for scanning and monitoring content on a computer network, the method comprising:
scanning the network to identify network resources on which the relevant content appears and the location of said resources on the network;
scanning each of the identified network locations to determine any network address information, said network address information identifying the computer or computer system related to that network location;
resolving each of said identified network locations by network address information into one or more network addresses;
profiling said network resources by classifying the content available at said resource;
acquiring said relevant content, said acquiring including copying said network resource and said content to another computer system;
analyzing said network resource by breaking up the content at that network resource into one or more constituent elements;
identifying and analyzing any links on a network resource to any other network resource and categorizing said links into local, neighbor, or remote links based on their relationship with the current resource being analyzed;
identifying broken links, in which the network resource linked to is not available;
keyword scanning of content on said network resource for predetermined keywords or phrases and determining whether said keywords are present on the network resource being analyzed;
analyzing content for language patterns to obtain instances of information such as street addresses, phone numbers, email addresses and other pattern-defined items;
fingerprinting content on said network resource to obtain a quantitative fingerprint for at least one of said constituent elements of said content; and
fingerprinting said network location to obtain a single quantitative measurement for all of said network resources and content at said network location.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
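Two of the steps in claim 1, link categorization and fingerprinting, can be illustrated with a minimal Python sketch. The claim does not define how local, neighbor, and remote links are distinguished, nor how a fingerprint is computed, so the rules below are assumptions made for illustration only: a link to the same host is treated as local, a link to a different host sharing the last two DNS labels as a neighbor, and everything else as remote; a SHA-256 digest stands in for the "quantitative fingerprint". The function names are hypothetical.

```python
import hashlib
from typing import Iterable
from urllib.parse import urljoin, urlparse


def classify_link(current_url: str, href: str) -> str:
    """Return 'local', 'neighbor', or 'remote' for a link found on current_url.

    Assumed rule (not taken from the claims): same host is local, a different
    host sharing the last two DNS labels is a neighbor, anything else is remote.
    """
    current_host = (urlparse(current_url).hostname or "").lower()
    # Resolve relative links against the page they were found on.
    target_host = (urlparse(urljoin(current_url, href)).hostname or "").lower()
    if target_host == current_host:
        return "local"
    # Naive same-site check; a real system would consult the Public Suffix List.
    if current_host.split(".")[-2:] == target_host.split(".")[-2:]:
        return "neighbor"
    return "remote"


def fingerprint_element(content: bytes) -> str:
    """Fingerprint one constituent element (assumption: a SHA-256 digest)."""
    return hashlib.sha256(content).hexdigest()


def fingerprint_location(element_fingerprints: Iterable[str]) -> str:
    """Combine per-element fingerprints into one order-independent value."""
    combined = "".join(sorted(element_fingerprints)).encode("ascii")
    return hashlib.sha256(combined).hexdigest()
```

Sorting the element fingerprints before hashing means the location-level value does not depend on the order in which resources were crawled, which is one plausible way to obtain a single measurement per location.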
19. A method for scanning and monitoring a website on the Internet, comprising:
scanning web pages at said site;
summarizing of the website by identifying the content, name, title, number of predetermined keyword fits and categories of at least one webpage on said site; and
summarizing technical information about said site, including identifying the vendors information and IP addresses of any web servers on which said website runs.
Dependent claims: 20, 21, 22
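The page-level summary in claim 19 (title plus a count of predetermined keyword matches) might look roughly like the following Python sketch. It uses only the standard-library HTML parser; the function name `summarize_page` and the shape of the returned dictionary are hypothetical, and the server vendor and IP address lookup is left out because it requires network access.

```python
from html.parser import HTMLParser


class _PageText(HTMLParser):
    """Collect the <title> text and all other text nodes of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        else:
            self.text_parts.append(data)


def summarize_page(html: str, keywords: list) -> dict:
    """Return the page title and a per-keyword count of matches.

    Matching is a case-insensitive substring count, so 'loan' also matches
    'loans'. Text inside <script> or <style> is not filtered out here.
    """
    parser = _PageText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts).lower()
    hits = {kw: text.count(kw.lower()) for kw in keywords}
    return {"title": parser.title.strip(), "keyword_hits": hits}
```

A site-level summary could then aggregate these per-page dictionaries, for example by summing the keyword counts across all scanned pages.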
23. The method for scanning and monitoring one or more websites associated with an enterprise, comprising:
detecting the presence of forbidden content on any of said websites;
validating the presence of required content on one or more webpages on one or more of said websites; and
inventorying and profiling any pages on any of said websites, wherein said inventorying comprises archiving and storing all of the webpages on all of the websites associated with the organization, wherein said forbidden content and said required content are determined prior to the initiation of the scanning of said websites.
Dependent claim: 24
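The forbidden-content and required-content checks of claim 23 can be sketched as a simple comparison of each page's text against two lists fixed before the scan starts. This is only an illustration: the function name `check_pages`, the input mapping of URL to page text, and the use of case-insensitive substring matching are all assumptions, not details from the claims.

```python
def check_pages(pages: dict, forbidden: list, required: list) -> dict:
    """Report forbidden terms found and required terms missing, per page.

    pages maps a URL to the page's text. Both term lists are supplied by
    the caller before scanning begins. Matching is case-insensitive.
    """
    report = {}
    for url, text in pages.items():
        lowered = text.lower()
        report[url] = {
            "forbidden_found": [t for t in forbidden if t.lower() in lowered],
            "required_missing": [t for t in required if t.lower() not in lowered],
        }
    return report
```

For example, an enterprise might require a privacy notice on every page and forbid a discontinued product name; a page is compliant when both of its lists in the report are empty.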
25. A system for scanning content on a network comprising:
a tasking interface;
a tasking database;
traffic acquisition modules;
local data storage;
analysis infrastructure modules;
an event database; and
a reporting analysis interface;
wherein said tasking interface accepting input relating to said scanning, wherein the tasking database contains information regarding the scheduling and configuration of the traffic acquisition modules to acquire data traffic from the network, wherein the data acquired by the traffic acquisition modules is stored in the raw data storage, wherein the analysis infrastructure modules analyze content stored in the raw data storage, wherein the output of the content analysis infrastructure modules is stored in the event database with information regarding action to be taken based on content that is detected.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 33
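The data flow recited in claim 25 (tasking, acquisition, raw storage, analysis, event database, reporting) can be pictured with a small in-memory Python sketch. Every class, field, and method name here is hypothetical, the databases are plain lists and dictionaries, and the analysis is reduced to a keyword check; the sketch shows only how the components might hand data to one another.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    """One entry in the tasking database: what to scan and what to look for."""
    url: str
    keywords: list


@dataclass
class Event:
    """One entry in the event database: a detection plus the action to take."""
    url: str
    keyword: str
    action: str


@dataclass
class Scanner:
    fetch: Callable[[str], str]                     # stands in for a traffic acquisition module
    tasks: list = field(default_factory=list)       # tasking database
    raw_store: dict = field(default_factory=dict)   # raw data storage
    events: list = field(default_factory=list)      # event database

    def submit(self, task: Task) -> None:
        """Tasking interface: accept input describing a scan."""
        self.tasks.append(task)

    def acquire(self) -> None:
        """Fetch each tasked URL and keep the raw content."""
        for task in self.tasks:
            self.raw_store[task.url] = self.fetch(task.url)

    def analyze(self) -> None:
        """Analysis module: record an event for each keyword that is present."""
        for task in self.tasks:
            content = self.raw_store.get(task.url, "").lower()
            for kw in task.keywords:
                if kw.lower() in content:
                    self.events.append(Event(task.url, kw, action="review"))

    def report(self) -> list:
        """Reporting interface: render events as readable lines."""
        return [f"{e.url}: found '{e.keyword}' -> {e.action}" for e in self.events]
```

Passing `fetch` in as a callable keeps the sketch testable without network access; a real acquisition module would perform HTTP requests on a schedule read from the tasking database.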
34. A method of scanning a target network resource on a network, comprising:
determining a network address of the network resource;
scanning the network for network resources related to said target network resource; and
analyzing content of any network resource found at an address related to the network address of the target network resource.
Dependent claims: 35, 36, 37, 38, 39, 40
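Claim 34 does not say what makes one address "related" to another. One possible reading, assumed here purely for illustration, is that two IP addresses are related when they fall in the same subnet. The Python sketch below applies that reading with the standard `ipaddress` module; the function name `related_addresses` and the default /24 prefix are hypothetical, and name resolution of the target is omitted because it needs network access.

```python
import ipaddress


def related_addresses(target: str, candidates: list, prefix: int = 24) -> list:
    """Return the candidates that share the target's subnet.

    Assumption: 'related' means 'inside the same /prefix network as the
    target'. The target itself is excluded from the result.
    """
    # strict=False lets a host address define its enclosing network.
    network = ipaddress.ip_network(f"{target}/{prefix}", strict=False)
    return [
        c for c in candidates
        if c != target and ipaddress.ip_address(c) in network
    ]
```

The resources found at the returned addresses would then be fetched and analyzed in the same way as the target resource.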
Specification