Resisting the spread of unwanted code and data
First Claim
1. A method performed by a computer system for processing an electronic file, the method comprising the steps, performed by the computer system, of:
identifying a portion of content data in the electronic file;
determining, by the computer system, if the identified portion of content data is passive content data having a fixed purpose or active content data having an associated function;
if the identified portion of content data is determined to be passive content data, then:
determining a file type or protocol of the portion of passive content data; and
determining whether the portion of passive content data is to be re-generated by determining if the passive content data conforms to a predetermined data format comprising a set of rules corresponding to the file type or protocol;
if the identified portion of content data is determined to be active content data, then analyzing the portion of active content data to determine whether the portion of active content data is known good and therefore is to be re-generated; and
re-generating the portion of content data to create a re-generated electronic file, if the portion of content data is determined to be re-generated,
wherein said step of analyzing a portion of active content data comprises:
(a) generating a hash for the portion of active content data, including normalizing the portion of active content data and generating a hash for the normalized portion of active content data;
(b) determining if the generated hash is present in a hash database of hashed normalized known good active content data; and
(c) determining that the portion of active content data is to be re-generated if it is determined in (b) that the generated hash is present in the hash database of normalized known good active content data,
wherein the method resists spread of unwanted code and data without scanning the electronic file for the unwanted code and data.
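Steps (a) to (c) of the claim can be illustrated with a minimal sketch. The claim does not say how active content is normalized or which hash function is used, so the whitespace-collapsing `normalize` function and the choice of SHA-256 below are assumptions made purely for illustration, and all function names are hypothetical.

```python
import hashlib
from typing import Iterable, Set


def normalize(active: str) -> str:
    # ASSUMPTION: normalization collapses every run of whitespace to a single
    # space. The claim only requires that some normalization takes place.
    return " ".join(active.split())


def digest(active: str) -> str:
    # Step (a): normalize the active content, then hash the normalized form.
    # ASSUMPTION: SHA-256; the claim does not name a hash function.
    return hashlib.sha256(normalize(active).encode("utf-8")).hexdigest()


def build_known_good(samples: Iterable[str]) -> Set[str]:
    # Stand-in for the hash database of hashed, normalized known good content.
    return {digest(sample) for sample in samples}


def is_to_be_regenerated(active: str, known_good: Set[str]) -> bool:
    # Steps (b) and (c): the portion is to be re-generated only if its hash
    # is present in the known good database.
    return digest(active) in known_good
```

Because the lookup is made against a list of known good content, nothing in this sketch inspects the file for signatures of unwanted code, which matches the final "wherein" clause of the claim. Normalizing before hashing means that two copies of the same macro that differ only in layout map to the same database entry.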
1 Assignment
Abstract
A method of processing an electronic file by identifying portions of content data in the electronic file and determining if each portion of content data is passive content data having a fixed purpose or active content data having an associated function. If a portion is passive content data, then a determination is made as to whether the portion of passive content data is to be re-generated. If a portion is active content data, then the portion is analysed to determine whether the portion of active content data is to be re-generated. A re-generated electronic file is then created from the portions of content data which are determined to be re-generated.
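The overall flow described in the abstract might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Portion` type, the `FORMAT_RULES` table with its single printable-text rule, and the `active_ok` callback are all hypothetical names introduced here, and copying the bytes stands in for a real re-generation step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Portion:
    kind: str       # "passive" or "active"
    file_type: str  # consulted only for passive portions
    data: bytes


# ASSUMPTION: one toy rule set per file type. A real predetermined data
# format would encode the full rules of the file type or protocol.
FORMAT_RULES: Dict[str, Callable[[bytes], bool]] = {
    "text": lambda d: all(32 <= b < 127 or b in (9, 10, 13) for b in d),
}


def passive_conforms(portion: Portion) -> bool:
    # A passive portion is kept only if rules exist for its file type
    # and the data satisfies them.
    rule = FORMAT_RULES.get(portion.file_type)
    return rule is not None and rule(portion.data)


def regenerate(portions: List[Portion],
               active_ok: Callable[[bytes], bool]) -> bytes:
    # Build the re-generated file from the portions that pass their check.
    kept = []
    for portion in portions:
        if portion.kind == "passive":
            keep = passive_conforms(portion)
        else:
            keep = active_ok(portion.data)
        if keep:
            kept.append(bytes(portion.data))  # copy stands in for re-generation
    return b"".join(kept)
```

Portions that fail their check are simply left out of the output, so the re-generated file contains only content that was positively accepted.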
295 Citations
16 Claims
1. A method performed by a computer system for processing an electronic file, the method comprising the steps, performed by the computer system, of:
identifying a portion of content data in the electronic file;
determining, by the computer system, if the identified portion of content data is passive content data having a fixed purpose or active content data having an associated function;
if the identified portion of content data is determined to be passive content data, then:
determining a file type or protocol of the portion of passive content data; and
determining whether the portion of passive content data is to be re-generated by determining if the passive content data conforms to a predetermined data format comprising a set of rules corresponding to the file type or protocol;
if the identified portion of content data is determined to be active content data, then analyzing the portion of active content data to determine whether the portion of active content data is known good and therefore is to be re-generated; and
re-generating the portion of content data to create a re-generated electronic file, if the portion of content data is determined to be re-generated,
wherein said step of analyzing a portion of active content data comprises:
(a) generating a hash for the portion of active content data, including normalizing the portion of active content data and generating a hash for the normalized portion of active content data;
(b) determining if the generated hash is present in a hash database of hashed normalized known good active content data; and
(c) determining that the portion of active content data is to be re-generated if it is determined in (b) that the generated hash is present in the hash database of normalized known good active content data,
wherein the method resists spread of unwanted code and data without scanning the electronic file for the unwanted code and data.
Dependent claims: 2 to 15.
16. An apparatus for processing an electronic file, said apparatus comprising:
identifying means for identifying a portion of content data in the electronic file;
content determining means for determining if the identified portion of content data is passive content data having a fixed purpose or active content data having an associated function;
portion determining means for determining a file type or protocol of a portion of passive content data, and for determining whether the portion of passive content data is to be re-generated by determining if the passive content data conforms to a predetermined data format comprising a set of rules corresponding to the file type or protocol;
analyzing means for analyzing a portion of active content data to determine whether a portion of active content data is known good and therefore is to be re-generated, if the identified portion of content data is determined to be active content data;
re-generating means for re-generating the portion of content data to create a re-generated electronic file, if the portion of content data is determined to be re-generated;
a hash database of hashed normalized known good active content data;
active content normalizing means for normalizing the active content data;
hash generating means for generating a hash for the portion of the normalized active content data; and
hash determination means for determining if the generated hash is present in a hash database of normalized known good active content data; and
wherein the analyzing means determines that the portion of active content data is to be re-generated if it is determined that the generated hash is present in the hash database of normalized known good active content data, and
wherein the apparatus is arranged to resist spread of unwanted code and data without scanning the electronic file for the unwanted code and data.
Specification