Methods and systems for uniquely identifying digital content for eDiscovery

  • US 9,880,983 B2
  • Filed: 06/02/2014
  • Issued: 01/30/2018
  • Est. Priority Date: 06/04/2013
  • Status: Active Grant
First Claim

1. A method of enabling changes in website content to be detected, the method comprising:

  • receiving an address;

    accessing at a first time period, by a computerized system comprising at least one computing device, a web page corresponding to the address;

    identifying, by the system, HTML web page text of the web page accessed at the first time period;

    identifying, by the system, items of content linked to by the web page accessed at the first time period;

    storing the identified HTML web page text accessed at the first time period;

    accessing and storing the items of content linked to by the web page accessed at the first time period;

    calculating, by the system, a first hash value corresponding to the identified HTML web page text accessed at the first time period, wherein calculating the first hash value does not include in the calculation of the first hash value the items of content linked to by the web page accessed at the first time period;

    calculating a first set of hash values for respective items of content linked to by the web page accessed at the first time period;

    calculating a first aggregated hash value based on the first hash value and the first set of hash values for respective items of content linked to by the web page accessed at the first time period;

    storing the first hash value, the first set of hash values, and the first aggregated hash value in association with a date and time corresponding to the first time period and in association with a first identifier;

    accessing, at a second time period, the web page corresponding to the address;

    identifying, by the system, HTML web page text of the web page accessed at the second time period;

    identifying, by the system, items of content linked to by the web page accessed at the second time period;

    storing the identified HTML web page text accessed at the second time period;

    accessing and storing the items of content linked to by the web page accessed at the second time period;

    calculating, by the system, a second hash value corresponding to the identified HTML web page text accessed at the second time period;

    calculating a second set of hash values for respective items of content linked to by the web page accessed at the second time period;

    calculating a second aggregated hash value based on the second hash value and the second set of hash values for respective items of content linked to by the web page accessed at the second time period;

    storing the second hash value, the second set of hash values, and the second aggregated hash value in association with a date and time corresponding to the second time period;

    using the first hash value, corresponding to the identified HTML web page text accessed at the first time period, and the second hash value, corresponding to the identified HTML web page text accessed at the second time period, detecting whether the webpage HTML text has changed, and, if the webpage HTML text has changed, providing a first visual indication indicating changes in the webpage HTML text;

    using the first set of hash values for respective items of content linked to by the web page accessed at the first time period, and the second set of hash values for respective items of content linked to by the web page accessed at the second time period, detecting if the respective items of content linked to by the web page have changed and providing a second visual indication indicating changes in the content linked to by the web page and indicating which content linked to by the web page has changed;

    using the first aggregated hash value, calculated based on the first hash value and the first set of hash values for respective items of content linked to by the web page accessed at the first time period, the second aggregated hash value, calculated based on the second hash value and the second set of hash values for respective items of content linked to by the web page accessed at the second time period, to detect whether the webpage or respective items of content linked to by the web page have changed, and providing a corresponding third visual indication.
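
The three levels of hashing recited in the claim (page text, each linked item, and an aggregate of both) can be illustrated with a minimal sketch. This is not the patented implementation: the claim names no hash algorithm or aggregation scheme, so SHA-256 and the "hash of the sorted per-item hashes" aggregate below are assumptions, and the names `Snapshot`, `make_snapshot`, and `compare` are hypothetical. Fetching the page and its linked items is left out; the sketch takes already-retrieved content as input.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


def sha256_hex(data: bytes) -> str:
    """Hex digest of the given bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Snapshot:
    """Hash values stored for one access of the page (hypothetical type)."""
    captured_at: datetime
    page_hash: str               # hash of the HTML text only
    item_hashes: Dict[str, str]  # linked-item URL -> hash of that item
    aggregated_hash: str         # hash over page_hash and all item hashes


def make_snapshot(html_text: str,
                  linked_items: Dict[str, bytes],
                  captured_at: Optional[datetime] = None) -> Snapshot:
    # The page hash covers the HTML text alone, not the linked content.
    page_hash = sha256_hex(html_text.encode("utf-8"))
    # One hash per linked item, keyed by its URL.
    item_hashes = {url: sha256_hex(body) for url, body in linked_items.items()}
    # Assumed aggregation: hash the page hash followed by the item hashes
    # in sorted URL order, so the result does not depend on dict ordering.
    parts = [page_hash] + [f"{url}={item_hashes[url]}" for url in sorted(item_hashes)]
    aggregated_hash = sha256_hex("\n".join(parts).encode("utf-8"))
    return Snapshot(captured_at or datetime.now(timezone.utc),
                    page_hash, item_hashes, aggregated_hash)


def compare(first: Snapshot, second: Snapshot) -> dict:
    """Report the three kinds of change described in the claim."""
    all_urls = first.item_hashes.keys() | second.item_hashes.keys()
    changed_items = sorted(
        url for url in all_urls
        if first.item_hashes.get(url) != second.item_hashes.get(url)
    )
    return {
        "html_changed": first.page_hash != second.page_hash,
        "changed_items": changed_items,
        "anything_changed": first.aggregated_hash != second.aggregated_hash,
    }


# Example: the HTML text is unchanged, but one linked item differs.
first = make_snapshot("<html>v1</html>", {"a.png": b"AAA", "b.css": b"BBB"})
second = make_snapshot("<html>v1</html>", {"a.png": b"AAA", "b.css": b"CCC"})
result = compare(first, second)
```

In this example `result["html_changed"]` is `False`, `result["changed_items"]` is `["b.css"]`, and `result["anything_changed"]` is `True`. Comparing the aggregated hash first gives a single quick check; the per-page and per-item hashes then show where the change occurred.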
