Method of treating whitespace during virus detection
Abstract
A method is provided for detecting computer viruses that infect text-based files. In accordance with a preferred embodiment, a collection of virus signatures reflecting sequences of characters or instructions known to be found in such viruses is maintained on a computer system. A virus detection program is also maintained for the purpose of comparing the contents of computer files to the virus signatures. Upon execution of the virus detection program, whitespace within text-based files is transformed such that each sequence of whitespace characters is replaced by a single whitespace character. Virus signatures of viruses known to infect text files are similarly transformed. A transformed text-based file is then searched for at least one of said virus signatures. The user is alerted to a possible virus infection if any of the virus signatures are found in a file. In another preferred embodiment, an additional collection of at least one virus signature containing sequences of characters or instructions known to be found in viruses that infect executable computer files is maintained on the computer system. A transformed text-based file is searched for at least one of the additional virus signatures, which are not transformed before the search.
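The transformation and search described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `collapse_whitespace` and `scan_text_file` are hypothetical, the whitespace set is borrowed from the list recited in the claims, and the choice of a space as the single replacement character is an assumption.

```python
import re

# Assumed whitespace set (taken from the list recited in the claims):
# space, tab, vertical tab, line feed, form feed, carriage return, null.
_WHITESPACE_RUN = re.compile(rb"[ \t\v\n\f\r\x00]+")


def collapse_whitespace(data: bytes) -> bytes:
    """Replace each run of whitespace characters with a single space."""
    return _WHITESPACE_RUN.sub(b" ", data)


def scan_text_file(contents: bytes, text_signatures, exe_signatures=()):
    """Return every signature found in the transformed file contents.

    text_signatures are transformed with the same rule as the file;
    exe_signatures (the second embodiment) are searched untransformed.
    """
    transformed = collapse_whitespace(contents)
    hits = [sig for sig in text_signatures
            if collapse_whitespace(sig) in transformed]
    hits += [sig for sig in exe_signatures if sig in transformed]
    return hits


# Hypothetical macro-style file and signature that differ only in whitespace.
contents = b'Sub  AutoOpen()\r\n\tKill   "*.*"\r\nEnd Sub'
signature = b'Kill\t"*.*"'
found = scan_text_file(contents, [signature])
```

Because the file and the signature pass through the same rule, a signature that differs from the infected text only in the amount or kind of whitespace still matches, whereas a direct byte-for-byte search for the untransformed signature would miss it.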
23 Claims
-
1. A method of searching a text-based computer file for a computer virus known to infect text-based files using a stored sequence of computer-readable characters associated with the computer virus, comprising the steps of:
-
transforming whitespace within the text-based file in accordance with a whitespace transformation rule to form a transformed text-based file;
transforming whitespace within the stored sequence of computer-readable characters in accordance with said whitespace transformation rule to form a transformed sequence of computer-readable characters; and
searching said transformed text-based file for at least one occurrence of said transformed sequence of computer-readable characters, wherein the computer virus is detected upon an identification of at least one such occurrence.
examining a predetermined number of characters in the computer file; and
determining whether a percentage of the examined characters that are printable characters exceeds a predetermined percentage.
-
-
5. The method of claim 4, wherein said predetermined percentage is greater than or equal to 90 percent.
-
6. The method of claim 4, wherein printable characters comprise ASCII character codes in the decimal range of 8-13 and 32-126.
-
7. The method of claim 4, wherein said predetermined number of characters is greater than or equal to 100.
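The text-file test recited in claims 4 through 7 can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method itself: the names `is_printable` and `looks_like_text` are hypothetical, the comparison is treated as "90 percent or more", and the handling of empty files or files shorter than the sample size is a choice made here, since the claims do not address it.

```python
def is_printable(byte: int) -> bool:
    """Printable per claim 6: ASCII codes 8-13 and 32-126 (decimal)."""
    return 8 <= byte <= 13 or 32 <= byte <= 126


def looks_like_text(data: bytes, sample_size: int = 100,
                    threshold: float = 0.90) -> bool:
    """Examine the first sample_size bytes and compare the printable share.

    Assumptions: an empty file is treated as non-text, and a file shorter
    than sample_size is judged on the bytes it has.
    """
    sample = data[:sample_size]
    if not sample:
        return False
    printable = sum(1 for b in sample if is_printable(b))
    return printable / len(sample) >= threshold


macro_source = b"Sub AutoOpen()\r\n" * 10   # hypothetical text-based file
binary_blob = bytes(range(128, 228))        # 100 bytes, none printable
```

With these inputs, `looks_like_text(macro_source)` is true and `looks_like_text(binary_blob)` is false, so only the first file would go through the whitespace transformation.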
-
8. The method of claim 3, wherein said single whitespace character is a space character.
-
9. The method of claim 8, wherein said whitespace sequence comprises at least one from the group consisting of:
- space, tab, vertical tab, line feed, form feed, carriage return, and null characters.
-
10. The method of claim 1, said whitespace comprising at least one whitespace sequence, wherein said whitespace transformation rule is designed to transform said at least one whitespace sequence into a common predetermined non-whitespace sequence.
-
11. The method of claim 10, wherein said at least one whitespace sequence comprises at least one from the group consisting of:
- space, tab, vertical tab, line feed, form feed, carriage return, and null.
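The variant in claims 10 and 11, where each whitespace sequence becomes a common non-whitespace sequence, might look like the sketch below. The marker byte `0x01` and the name `mark_whitespace` are hypothetical choices for illustration; the claims only require that the replacement be a predetermined non-whitespace sequence applied to both the file and the stored signature.

```python
import re

# Whitespace set as listed in claim 11.
_WHITESPACE_RUN = re.compile(rb"[ \t\v\n\f\r\x00]+")

# Hypothetical marker: any fixed byte sequence outside the whitespace set.
MARKER = b"\x01"


def mark_whitespace(data: bytes, marker: bytes = MARKER) -> bytes:
    """Replace each whitespace run with the same non-whitespace marker."""
    # A callable replacement avoids backslash-escape processing of marker.
    return _WHITESPACE_RUN.sub(lambda _match: marker, data)


file_view = mark_whitespace(b'Kill \t "*.*"')
signature_view = mark_whitespace(b'Kill\r\n"*.*"')
```

Both views reduce to the same byte string, so the search step can compare them directly, exactly as in the single-space variant.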
-
12. A method of searching for a virus in a computer file that includes whitespace, the method comprising the steps of:
-
storing at least one virus profile;
determining whether the computer file is a text file;
if the computer file is a text file, reformatting the contents of the computer file to convert a sequence of whitespace characters into a single code; and
comparing the contents of the computer file with said at least one virus profile.
examining a predetermined number of characters in the file; and
determining the percentage of the examined characters that are printable characters;
wherein said computer file is determined to be a text file if 90% or more of the predetermined number of characters are printable characters.
-
-
18. The method of claim 17 wherein printable characters comprise ASCII character codes in the decimal range of 8-13 and 32-126.
-
19. A method of searching a computer file for a computer virus comprising the steps of:
-
storing a virus profile comprising a sequence of computer-readable characters associated with a computer virus;
determining whether the computer file is a text-based file;
transforming whitespace within the computer file if the computer file is a text-based file; and
searching said computer file for said virus profile.
identifying a sequence of at least one whitespace character, said sequence containing only non-printable, computer-readable characters; and
replacing said sequence of at least one whitespace character with a code.
-
-
22. The method of claim 21 wherein the code is a single whitespace character.
-
23. The method of claim 21 wherein the code is a single non-whitespace character.
Specification