Method and system for identifying network addresses associated with suspect network destinations

US 7,590,707 B2
Filed: 08/07/2006
Issued: 09/15/2009
Est. Priority Date: 08/07/2006
Status: Active Grant

Abstract

A method and system for identifying network addresses associated with suspect network destinations is described. One embodiment receives a target Uniform Resource Locator (URL) to be analyzed; segments the target URL into a set of component parts; classifies each component part in the set of component parts as a primary domain, a subdomain, or a page; hashes each component part in the set of component parts to produce a hash value for that component part; compares the hash values of the set of component parts from the target URL with hash values stored in a database, the hash values stored in the database having been obtained by segmenting, classifying, and hashing, in the same manner as the target URL, each of a set of URLs known to be associated with suspect network destinations; computes a score that indicates the extent to which the hash values of the set of component parts from the target URL match hash values stored in the database; and takes corrective action when the score satisfies a predetermined criterion. In one embodiment, taking corrective action includes notifying a user that the target URL is believed to be associated with a suspect network destination. In another embodiment, taking corrective action includes blocking a network connection between a computer and the network destination associated with the target URL.
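The segment, classify, and hash steps named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the rule that the primary domain is the last two host labels, the treatment of each path segment as a separate page part, and the use of SHA-256 (the text does not name a hash function).

```python
import hashlib
from urllib.parse import urlsplit


def segment_and_classify(url):
    """Split a URL into ordered (classification, text) component parts.

    Assumption: the primary domain is the last two host labels. A real
    system would need a public-suffix list for hosts such as example.co.uk.
    """
    # urlsplit only fills in the host when a "//" prefix is present.
    parts = urlsplit(url if "//" in url else "//" + url)
    labels = [label for label in (parts.hostname or "").split(".") if label]
    primary = ".".join(labels[-2:])
    subdomains = labels[:-2]
    pages = [segment for segment in parts.path.split("/") if segment]

    result = [("primary_domain", primary)]
    # Subdomains are listed from the one nearest the primary domain outward.
    result += [("subdomain", s) for s in reversed(subdomains)]
    result += [("page", p) for p in pages]
    return result


def hash_part(text):
    # SHA-256 is an illustrative choice, not one stated in the source.
    return hashlib.sha256(text.lower().encode("utf-8")).hexdigest()


def hash_url(url):
    """Return (classification, hash value) pairs for every component part."""
    return [(kind, hash_part(text)) for kind, text in segment_and_classify(url)]
```

Each hash keeps the classification of the part it came from, so a later comparison step can tell a primary-domain match from a page match.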

20 Claims

1. A method for identifying a network address associated with a suspect network destination, the method comprising:
- collecting a set of Uniform Resource Locators (URLs), each URL in the set of URLs being associated with a suspect network destination;
  
  segmenting each URL in the set of URLs into a set of component parts;
  
  for each URL in the set of URLs, classifying each component part in the set of component parts from that URL as one of a primary domain, a subdomain, and a page;
  
  for each URL in the set of URLs, hashing each component part in the set of component parts from that URL to produce a hash value for that component part;
  
  storing in a database the hash values of the component parts of the URLs in the set of URLs;
  
  receiving a target URL to be analyzed;
  
  segmenting the target URL into a set of component parts;
  
  classifying each component part in the set of component parts from the target URL as one of a primary domain, a subdomain, and a page;
  
  hashing each component part in the set of component parts from the target URL to produce a hash value for that component part;
  
  comparing the hash values of the set of component parts from the target URL with the hash values stored in the database;
  
  computing a score that indicates the extent to which the hash values of the set of component parts from the target URL match hash values stored in the database; and
  
  taking corrective action, when the score satisfies a predetermined criterion, and wherein the predetermined criterion is that the score exceed a predetermined threshold.
- Dependent Claims (2, 3, 4, 5)
  - 2. The method of claim 1, wherein a suspect network destination is a network destination that is associated with pestware.
  - 3. The method of claim 1, wherein taking corrective action includes notifying a user that the target URL is believed to be associated with a suspect network destination.
  - 4. The method of claim 1, wherein taking corrective action includes preventing a connection between a computer and a network destination associated with the target URL.
  - 5. The method of claim 1, wherein the comparing is performed for hash values of component parts classified as primary domains, subdomains, and pages, in that order.
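The flow of claim 1, with the comparison order of claim 5, might be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes the URLs have already been segmented and classified into `(classification, text)` pairs, uses SHA-256 as a stand-in hash, defines the score as the fraction of matching parts, and models corrective action as a returned label. The names `build_database`, `score`, `check`, and the default threshold of 0.5 are hypothetical.

```python
import hashlib

# Comparison order taken from claim 5.
ORDER = ("primary_domain", "subdomain", "page")


def h(text):
    # Illustrative hash choice; the claims do not name a specific function.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_database(suspect_urls):
    """Store hashes of every component part of the known-suspect URLs.

    suspect_urls: list of URLs, each given as a list of
    (classification, text) pairs.
    """
    db = {kind: set() for kind in ORDER}
    for parts in suspect_urls:
        for kind, text in parts:
            db[kind].add(h(text))
    return db


def score(target_parts, db):
    """Fraction of the target's component hashes found in the database."""
    hashed = [(kind, h(text)) for kind, text in target_parts]
    matches = 0
    for kind in ORDER:  # primary domains first, then subdomains, then pages
        matches += sum(1 for k, value in hashed if k == kind and value in db[kind])
    return matches / len(hashed) if hashed else 0.0


def check(target_parts, db, threshold=0.5):
    # The score must exceed the threshold, as in the last step of claim 1.
    return "block" if score(target_parts, db) > threshold else "allow"
```

A caller could map the `"block"` label to either of the actions in claims 3 and 4: notifying the user or preventing the connection.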

6. A method for identifying a network address associated with a suspect network destination, the method comprising:
- receiving a target Uniform Resource Locator (URL) to be analyzed;
  
  segmenting the target URL into a set of component parts;
  
  classifying each component part in the set of component parts from the target URL as one of a primary domain, a subdomain, and a page;
  
  hashing each component part in the set of component parts from the target URL to produce a hash value for that component part, the hash value having a classification that coincides with the classifying of that component part;
  
  comparing the hash values of the set of component parts from the target URL with hash values stored in a database, the hash values stored in the database having been obtained by segmenting, classifying, and hashing, in the same manner as the target URL, each of a set of URLs known to be associated with suspect network destinations;
  
  computing a score that indicates the extent to which the hash values of the set of component parts from the target URL match hash values stored in the database; and
  
  taking corrective action, when the score satisfies a predetermined criterion, and wherein the predetermined criterion is that the score exceed a predetermined threshold.
- Dependent Claims (7, 8, 9, 10, 11, 12, 13, 14, 15)
  - 7. The method of claim 6, wherein computing the score includes:
    - assigning a partial score to each match between a hash value of a component part in the set of component parts from the target URL and a hash value stored in the database, the partial score being weighted based on the classification of the matching hash values; and
    - combining the partial scores from the target URL to produce the score.
  - 8. The method of claim 7, wherein a match that occurs in an incorrect position within an ordered sequence of hash values as determined by the database is weighted less heavily than a match that occurs in a correct position within the ordered sequence of hash values.
  - 9. The method of claim 7, wherein primary-domain matches are weighted more heavily than page matches.
  - 10. The method of claim 7, wherein page matches are weighted more heavily than primary-domain matches.
  - 11. The method of claim 7, wherein, in assigning the partial score, how heavily a classification is weighted is configurable by a user.
  - 12. The method of claim 6, wherein taking corrective action includes notifying a user that the target URL is believed to be associated with a suspect network destination.
  - 13. The method of claim 6, wherein taking corrective action includes preventing a connection between a computer and a network destination associated with the target URL.
  - 14. The method of claim 6, wherein the predetermined threshold is adjustable by a user.
  - 15. The method of claim 6, wherein the predetermined criterion is that a hash value of a primary domain in the target URL matches a primary-domain hash value in the database.
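The weighted scoring of claims 7 through 11 and the adjustable threshold of claim 14 could look roughly like the sketch below. All numeric weights, the out-of-position factor, the threshold, and the names `ScoringConfig`, `weighted_score`, and `exceeds_threshold` are hypothetical. The sketch also assumes that "correct position" means the same index in one stored ordered sequence of hashes, which is only one possible reading of claim 8.

```python
from dataclasses import dataclass, field


@dataclass
class ScoringConfig:
    # Hypothetical defaults; claim 11 makes the weights user-configurable.
    weights: dict = field(default_factory=lambda: {
        "primary_domain": 3.0, "subdomain": 2.0, "page": 1.0})
    wrong_position_factor: float = 0.5  # claim 8: out-of-position matches count less
    threshold: float = 3.5              # claim 14: adjustable by a user


def weighted_score(target, stored, config):
    """Sum the partial scores for a target against one stored sequence.

    target and stored are ordered lists of (classification, hash_value).
    A match at the same index as in the stored sequence earns the full
    weight for its classification; a match found elsewhere in the stored
    sequence earns a reduced weight.
    """
    total = 0.0
    for index, (kind, value) in enumerate(target):
        if index < len(stored) and stored[index] == (kind, value):
            total += config.weights[kind]
        elif (kind, value) in stored:
            total += config.weights[kind] * config.wrong_position_factor
    return total


def exceeds_threshold(target, stored, config):
    # Strictly greater than, matching "the score exceed a predetermined threshold".
    return weighted_score(target, stored, config) > config.threshold
```

Swapping the weight values covers both claim 9 (primary-domain matches weighted more heavily) and claim 10 (page matches weighted more heavily) without changing the scoring code.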

16. A system for identifying a network address associated with a suspect network destination, the system comprising:
- a segmentation module configured to segment a target Uniform Resource Locator (URL) into a set of component parts;
  
  a classification module configured to classify each component part in the set of component parts as one of a primary domain, a subdomain, and a page;
  
  a hashing module configured to compute a hash value for each component part in the set of component parts;
  
  a database containing hash values obtained from a set of URLs known to be associated with suspect network destinations, each URL in the set of URLs having been segmented, classified, and hashed in a manner analogous to the target URL;
  
  a comparison module configured to:
  
  compare the hash values of the component parts in the set of component parts with hash values stored in the database; and
  
  compute a score that indicates the extent to which the hash values of the component parts in the set of component parts match hash values stored in the database; and
  
  a security module configured to take corrective action when the score satisfies a predetermined criterion, and wherein the predetermined criterion is that the score exceed a predetermined threshold.
- Dependent Claims (17, 18, 19, 20)
  - 17. The system of claim 16, wherein the database includes a primary-domain hash table containing a plurality of entries, each entry including a hash value associated with a primary domain and a pointer to a control structure, the control structure containing at least one of a pointer to a subdomain hash table and a pointer to a flat list of hash values associated with one or more pages, the subdomain hash table containing at least one pointer to a hash value associated with a subdomain.
  - 18. The system of claim 17, wherein the comparison module is configured to compare the hash values of the component parts in the set of component parts with the hash values stored in the database by traversing the database from the primary-domain hash table to a subdomain hash table to a flat list of hash values associated with pages, in that order.
  - 19. The system of claim 16, wherein the security module is configured to take corrective action by alerting a user that the target URL is believed to be associated with a suspect network destination.
  - 20. The system of claim 16, wherein the security module is configured to take corrective action by blocking a connection between a computer and a network destination associated with the target URL.
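The database layout of claim 17 and the traversal order of claim 18 can be approximated with the sketch below. It is a loose model under stated assumptions: Python dictionaries and sets stand in for the hash tables and pointers, SHA-256 stands in for the unnamed hash function, and the names `ControlStructure`, `SuspectDatabase`, `add`, and `lookup` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


def h(text: str) -> str:
    # Illustrative hash choice; the claims do not name a specific function.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class ControlStructure:
    subdomains: Set[str] = field(default_factory=set)  # subdomain hash table
    pages: List[str] = field(default_factory=list)     # flat list of page hashes


class SuspectDatabase:
    def __init__(self) -> None:
        # Primary-domain hash table: primary-domain hash -> control structure.
        self.primary_domains: Dict[str, ControlStructure] = {}

    def add(self, primary, subdomains=(), pages=()):
        control = self.primary_domains.setdefault(h(primary), ControlStructure())
        for name in subdomains:
            control.subdomains.add(h(name))
        for page in pages:
            page_hash = h(page)
            if page_hash not in control.pages:
                control.pages.append(page_hash)

    def lookup(self, primary, subdomains=(), pages=()):
        """Count matches per classification, traversing in claim 18's order."""
        result = {"primary_domain": 0, "subdomain": 0, "page": 0}
        control = self.primary_domains.get(h(primary))
        if control is None:
            return result  # no primary-domain match, so nothing below it is searched
        result["primary_domain"] = 1
        result["subdomain"] = sum(1 for s in subdomains if h(s) in control.subdomains)
        result["page"] = sum(1 for p in pages if h(p) in control.pages)
        return result
```

One consequence of this layout is that subdomain and page hashes are only searched under a primary domain that already matched, which keeps each lookup small.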

Current Assignee: Carbonite LLC (Open Text Corporation)
Original Assignee: Webroot Software Incorporated (Open Text Corporation)
Inventors: Shifman, Craig Mitchell; McCloy, Harry Murphey III
Primary Examiner(s): NGUYEN, PHUOC H
Application Number: US11/462,781
Publication Number: US 20080034073A1
Time in Patent Office: 1,135 Days
Field of Search: 709/217, 709/223, 709/238, 726/22-25
US Class Current: 709/217
CPC Class Codes:
G06F 2221/2119   Authenticating web pages, e...
H04L 63/0236   Filtering by address, proto...
H04L 63/168   above the transport layer
