System and method for analyzing web content
First Claim
1. A method of classifying web content, implemented on one or more computer processors, the method comprising:
using at least one of the processors, receiving content of at least one web page;
using at least one of the processors, determining a first property of a plurality of properties associated with the web page at least in part by identifying a keyword in the content of the web page, the first property comprising static content associated with the web page,
using at least one of the processors, determining a second property of the plurality of properties at least in part by executing active content associated with the webpage in a sandbox environment, the second property comprising the active content;
using at least one of the processors, storing the first property and the second property in a database of web page properties;
using at least one of the processors, comparing at least one definition from a definitions database to the first and second properties, wherein a particular definition contains a logical expression relating the first property and the second property;
using at least one of the processors, associating the web page with at least one category, the category associated with the particular definition, wherein said category is indicative of the active content associated with the web page;
wherein the logical expression comprises at least one term comprising a relationship of at least one web page property to at least one other value; and
wherein the at least one other value comprises a constant value matching at least a portion of the content of the web page.
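The comparison step of claim 1 can be illustrated with a small sketch. It is not the patented implementation, only one possible reading: a definition is modeled as a conjunction of terms, each term relating one web page property to a constant value. The names Term, Definition, classify, the operator names "eq" and "contains", the property names and the category labels are all hypothetical and do not appear in the claims.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical relationship operators that a term may use.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "contains": lambda prop_value, constant: constant in prop_value,
}


@dataclass(frozen=True)
class Term:
    """One term: a relationship of a page property to a constant value."""
    prop: str
    op: str
    constant: Any

    def matches(self, props: Dict[str, Any]) -> bool:
        # A term about a property the page does not have never matches.
        if self.prop not in props:
            return False
        return OPS[self.op](props[self.prop], self.constant)


@dataclass(frozen=True)
class Definition:
    """A definition: every term must hold (logical AND, an assumption)."""
    category: str
    terms: Tuple[Term, ...]

    def matches(self, props: Dict[str, Any]) -> bool:
        return all(term.matches(props) for term in self.terms)


def classify(props: Dict[str, Any], definitions: List[Definition]) -> List[str]:
    """Return the category of every definition whose expression holds."""
    return [d.category for d in definitions if d.matches(props)]


# Illustrative data: one static property and one sandbox-derived property.
page_props = {"keyword": "free download", "sandbox_behavior": "writes_file"}
definitions = [
    Definition(
        "suspicious-active-content",
        (
            Term("keyword", "contains", "download"),
            Term("sandbox_behavior", "eq", "writes_file"),
        ),
    ),
    Definition("benign-script", (Term("sandbox_behavior", "eq", "none"),)),
]
categories = classify(page_props, definitions)
print(categories)
```

Here the first definition relates both the static and the active property, as the claim requires of the particular definition. A fuller expression language (OR, NOT, nesting) would need a small tree of terms instead of a flat tuple.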
Abstract
A system and method are provided for identifying active content in websites on a network. One embodiment includes a method of classifying web content. In one embodiment, the classifications are indicative of active and/or malicious content. The method includes identifying properties associated with the web page based at least partly on the content of the web page and storing said properties in a database of web page properties. The method further includes comparing at least one definition to properties stored in the database of web page properties and identifying the web page with at least one definition based on comparing said definition with said stored properties. The method further includes identifying the web page with at least one category associated with the at least one definition, wherein said category is indicative of active content associated with the web page. Other embodiments include systems configured to perform such methods.
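The store-then-compare flow described in the abstract could be sketched as follows, assuming an in-memory SQLite table as the database of web page properties. The table layout, the DEFINITIONS and CATEGORIES mappings, the property names and the example URL are all assumptions made for illustration; the source does not specify a schema or a storage engine.

```python
import sqlite3
from typing import Dict, List


def open_property_db() -> sqlite3.Connection:
    """Create an in-memory database of web page properties (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_properties ("
        " url TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (url, name))"
    )
    return conn


def store_properties(conn: sqlite3.Connection, url: str, props: Dict[str, str]) -> None:
    """Store each identified property of the page as one row."""
    conn.executemany(
        "INSERT OR REPLACE INTO page_properties (url, name, value) VALUES (?, ?, ?)",
        [(url, name, value) for name, value in props.items()],
    )
    conn.commit()


def load_properties(conn: sqlite3.Connection, url: str) -> Dict[str, str]:
    """Read back all stored properties of one page."""
    cursor = conn.execute(
        "SELECT name, value FROM page_properties WHERE url = ?", (url,)
    )
    return dict(cursor.fetchall())


# Hypothetical definitions: each lists the property values it requires.
DEFINITIONS = {"script-present": {"has_script": "yes"}}
# Hypothetical mapping from a definition to its category.
CATEGORIES = {"script-present": "active-content"}


def categorize(conn: sqlite3.Connection, url: str) -> List[str]:
    """Compare definitions to the stored properties and return the categories."""
    props = load_properties(conn, url)
    matched = [
        name
        for name, required in DEFINITIONS.items()
        if all(props.get(key) == value for key, value in required.items())
    ]
    return sorted({CATEGORIES[name] for name in matched})


conn = open_property_db()
store_properties(conn, "http://example.test/a", {"has_script": "yes", "keyword": "login"})
store_properties(conn, "http://example.test/b", {"has_script": "no"})
result_a = categorize(conn, "http://example.test/a")
result_b = categorize(conn, "http://example.test/b")
```

Keeping the properties in a database, rather than matching against the raw page, means definitions can be added or changed later and re-applied without fetching the page again.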
22 Claims
13. A system for classifying web content, the system comprising:
a database configured to store properties associated with web pages;
one or more processing devices configured, individually or in combination, to:
determine a first property of a plurality of properties of a web page, at least in part by identifying a keyword in the content of the web page, the first property comprising static content associated with the web page;
determine a second property of the plurality of properties, at least in part by executing active content associated with the webpage in a sandbox environment, the second property comprising the active content;
store said plurality of properties in said database of web page properties;
compare at least one definition from a definitions database to the first and second properties, wherein a particular definition contains a logical expression relating the first property and the second property;
associate the web page with at least one category, the category associated with the particular definition, wherein said category is indicative of active content associated with the web page;
wherein the logical expression comprises at least one term comprising a relationship of at least one web page property to at least one other value; and
wherein the at least one other value comprises a constant value.
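The two property-determining elements of the system claim might be wired together roughly as below. This is a sketch under stated assumptions, not the claimed system: the sandbox is represented only by a caller-supplied function, and the stub_sandbox used here does not execute anything. It is a placeholder that inspects the script text so the example stays runnable; a real sandbox would run the active content in isolation and report what it did. All function, class and property names are hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Optional


class PageSplitter(HTMLParser):
    """Separates page text (static content) from inline <script> bodies."""

    def __init__(self) -> None:
        super().__init__()
        self.text_parts: List[str] = []
        self.scripts: List[str] = []
        self._script_buf: Optional[List[str]] = None  # set while inside <script>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._script_buf = []

    def handle_endtag(self, tag):
        if tag == "script" and self._script_buf is not None:
            self.scripts.append("".join(self._script_buf))
            self._script_buf = None

    def handle_data(self, data):
        if self._script_buf is not None:
            self._script_buf.append(data)
        else:
            self.text_parts.append(data)


def collect_properties(
    html: str,
    keywords: List[str],
    run_in_sandbox: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Determine a keyword-based property and a sandbox-based property."""
    splitter = PageSplitter()
    splitter.feed(html)
    splitter.close()
    text = " ".join(splitter.text_parts).lower()
    return {
        # First property: keywords identified in the page content.
        "keywords": [k for k in keywords if k.lower() in text],
        # Second property: whatever the sandbox hook reports per script.
        "active_results": [run_in_sandbox(s) for s in splitter.scripts],
    }


def stub_sandbox(script: str) -> str:
    """Placeholder only: does NOT execute the script (assumption for the demo)."""
    return "calls_eval" if "eval(" in script else "no_observation"


html = "<html><body><p>Free Download here</p><script>eval('x')</script></body></html>"
props = collect_properties(html, ["download", "casino"], stub_sandbox)
```

Passing the sandbox in as a function keeps the property collection independent of how isolation is achieved, so the placeholder could be swapped for a real isolated runtime without changing the rest of the sketch.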
Specification