File deduplication and scan reduction in a virtualization environment

  • US 9,424,058 B1
  • Filed: 09/23/2013
  • Issued: 08/23/2016
  • Est. Priority Date: 09/23/2013
  • Status: Active Grant
First Claim

1. A computer implemented method for enabling file deduplication and scan reduction across multiple virtual machines in a virtualization environment of a single host computer, the method comprising the steps of:

  • creating a virtual machine template based upon which to create multiple virtual machines on the host computer, wherein the virtual machine template comprises a description of a virtual machine in a known good state having a file system containing at least some files to deduplicate across the multiple virtual machines;

    for each specific file in the file system of the virtual machine template to deduplicate across the multiple virtual machines, deduplicating the specific file by:

    generating a hash of content of the specific file;

    storing the generated hash locally on the virtual machine template in association with the specific file; and

    moving the content of the specific file from the virtual machine template to a central file store residing independently of the virtual machine template and the multiple virtual machines;

    creating multiple virtual machines by cloning the virtual machine template, wherein each one of the multiple virtual machines cloned from the virtual machine template contains a copy of the file system of the virtual machine template and a copy of the generated hashes of the content of the deduplicated files, the copy of the hashes being stored locally on the specific virtual machine in association with the corresponding deduplicated files;

    monitoring file access operations on each one of the multiple virtual machines cloned from the virtual machine template;

    on each one of the multiple virtual machines cloned from the virtual machine template, in response to detecting an attempt to access a deduplicated file the content of which is in the central file store and is not present on the specific virtual machine, using a corresponding hash stored locally on the specific virtual machine in association with the specific file to retrieve the content of the specific file from the central file store;

    in response to a specific virtual machine updating a deduplicated file the content of which is in the central file store and is not present on the specific virtual machine:

    storing the updated file on the specific virtual machine and not in the central file store; and

    deleting the hash stored in association with the updated file from the specific virtual machine; and

    in response to a specific virtual machine deleting a deduplicated file the content of which is in the central file store and is not present on the specific virtual machine:

    deleting the hash stored in association with the specific file from the specific virtual machine and deleting an entry for the specific file from the file system of the specific virtual machine.
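
The steps recited in the claim can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the patented implementation: in-memory dictionaries stand in for the file systems and the central file store, SHA-256 is assumed as the hash (the claim says only "a hash"), explicit function calls stand in for file access monitoring, and all names (`CentralFileStore`, `FileEntry`, `VirtualMachine`, `deduplicate_template`, `clone_vm`, `read_file`, `write_file`, `delete_file`) are hypothetical.

```python
import copy
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileEntry:
    content: Optional[bytes] = None      # None once content has moved to the central store
    content_hash: Optional[str] = None   # kept locally only while the file is deduplicated


@dataclass
class VirtualMachine:
    # Hypothetical stand-in for a VM file system: path -> entry.
    files: Dict[str, FileEntry] = field(default_factory=dict)


class CentralFileStore:
    """Content-addressed store that lives outside the template and its clones."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()  # assumption: SHA-256 as the hash
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


def deduplicate_template(template: VirtualMachine, store: CentralFileStore) -> None:
    """Hash each file, keep the hash on the template, move the content to the store."""
    for entry in template.files.values():
        if entry.content is not None:
            entry.content_hash = store.put(entry.content)
            entry.content = None


def clone_vm(template: VirtualMachine) -> VirtualMachine:
    """A clone receives a copy of the file system entries and of the stored hashes."""
    return copy.deepcopy(template)


def read_file(vm: VirtualMachine, path: str, store: CentralFileStore) -> bytes:
    """On access, use the locally stored hash to fetch content that is not on the VM."""
    entry = vm.files[path]
    if entry.content is not None:
        return entry.content
    if entry.content_hash is None:
        raise ValueError(f"{path} has neither local content nor a hash")
    return store.get(entry.content_hash)


def write_file(vm: VirtualMachine, path: str, data: bytes) -> None:
    """On update, keep the new content on the VM only and drop the local hash."""
    vm.files[path] = FileEntry(content=data, content_hash=None)


def delete_file(vm: VirtualMachine, path: str) -> None:
    """On delete, remove the local hash and the file system entry; the store is untouched."""
    del vm.files[path]


if __name__ == "__main__":
    store = CentralFileStore()
    template = VirtualMachine({"/bin/tool": FileEntry(content=b"binary")})
    deduplicate_template(template, store)
    vm_a, vm_b = clone_vm(template), clone_vm(template)
    write_file(vm_a, "/bin/tool", b"patched")      # vm_a diverges locally
    print(read_file(vm_a, "/bin/tool", store))
    print(read_file(vm_b, "/bin/tool", store))     # vm_b still reads the shared content
```

In this sketch an update or delete on one clone never changes the central store, so other clones that still hold the hash continue to resolve the original content.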
