Document object model (DOM) based page uniqueness detection

  • US 8,489,605 B2
  • Filed: 06/23/2011
  • Issued: 07/16/2013
  • Est. Priority Date: 06/30/2010
  • Status: Expired due to Fees
First Claim

1. A non-transitory computer system comprising:

  • a host system in communication with at least one client system over a network;

    a page based unique ID generation application for execution on the host system, the page based unique ID generation application including logic for implementing a method comprising:

    receiving a hypertext markup language (HTML) page at a computer;

    identifying HTML page elements in response to the receiving, the HTML page elements comprising parent nodes, the parent nodes comprising child nodes;

    processing each of the HTML page elements, the processing comprising:

    grouping the child nodes by parent node into a group of child nodes;

    detecting patterns in the group of child nodes in response to the grouping;

    reducing the group of child nodes to text strings in response to the detecting; and

    storing the text strings as text values in the parent nodes; and

    generating a unique identifier (ID) of the HTML page in response to the processing;

    wherein the HTML page is a Web 2.0 page, the Web 2.0 page comprising content, the content being generated dynamically; and filtering HTML page elements in response to the identifying, the filtering removing the child nodes and the parent nodes that meet filter criteria, the filter criteria comprising:

    extensible markup language path language instructions;

    regular expression (regex) instructions; and

    a list of HTML nodes.
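
The steps recited in the claim (filter, group child nodes by parent, detect patterns, reduce to text strings stored in the parent, generate an ID) can be illustrated with a minimal sketch. This is not the patented implementation; it only shows one possible reading. Everything specific here is an assumption made for illustration: the function names (filter_tree, reduce_node, page_id), the use of Python's ElementTree on well-formed markup, treating "pattern detection" as collapsing runs of sibling nodes with identical structure, ignoring text content in the reduced strings, applying the regex criterion to element text, and hashing with SHA-256.

```python
import hashlib
import re
import xml.etree.ElementTree as ET


def filter_tree(root, xpaths=(), text_regexes=(), tag_names=()):
    """Remove elements meeting any filter criterion (assumed semantics)."""
    patterns = [re.compile(p) for p in text_regexes]
    doomed = set()
    # Criterion 1: XPath expressions (ElementTree supports a limited subset).
    for xp in xpaths:
        doomed.update(id(e) for e in root.findall(xp))
    # Criteria 2 and 3: regex on element text, and a list of node names.
    for elem in root.iter():
        text = elem.text or ""
        if elem.tag in tag_names or any(p.search(text) for p in patterns):
            doomed.add(id(elem))
    # Detach every marked element from its parent (the root itself is kept).
    for parent in list(root.iter()):
        for child in list(parent):
            if id(child) in doomed:
                parent.remove(child)


def reduce_node(elem):
    """Reduce elem's children to one string and store it as elem's text."""
    # Group: the children of one parent are processed together.
    child_strings = [reduce_node(child) for child in elem]
    # Pattern detection (assumption): a run of structurally identical
    # siblings, e.g. repeated <li> items, counts as a single occurrence.
    collapsed = []
    for s in child_strings:
        if not collapsed or collapsed[-1] != s:
            collapsed.append(s)
    # Replace the child nodes with a text value on the parent.
    for child in list(elem):
        elem.remove(child)
    elem.text = f"{elem.tag}({''.join(collapsed)})"
    return elem.text


def page_id(markup, **filters):
    """Return a hex digest identifying the page's reduced structure."""
    root = ET.fromstring(markup)
    filter_tree(root, **filters)
    return hashlib.sha256(reduce_node(root).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    page_a = ("<html><body><ul><li>a</li><li>b</li></ul>"
              "<script>x()</script></body></html>")
    page_b = "<html><body><ul><li>c</li><li>d</li><li>e</li></ul></body></html>"
    # Same ID despite different list contents and lengths.
    print(page_id(page_a, tag_names={"script"}) == page_id(page_b))
```

Under these assumptions, two pages that differ only in dynamically generated list items or in filtered-out nodes map to the same ID, while a structural difference that survives filtering changes it. Real-world HTML is often not well-formed XML, so a practical version would need a tolerant HTML parser in place of ElementTree.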
