Gang migration of virtual machines using cluster-wide deduplication

US 9,823,842 B2
Filed: 05/12/2015
Issued: 11/21/2017
Est. Priority Date: 05/12/2014
Status: Active Grant

Abstract

Gang migration refers to the simultaneous live migration of multiple Virtual Machines (VMs) from one set of physical machines to another in response to events such as load spikes and imminent failures. Gang migration generates a large volume of network traffic and can overload the core network links and switches in a datacenter. In this paper, we present an approach to reduce the network overhead of gang migration using global deduplication (GMGD). GMGD identifies and eliminates the retransmission of duplicate memory pages among VMs running on multiple physical machines in the cluster. The design, implementation and evaluation of a GMGD prototype is described using QEMU/KVM VMs. Evaluations on a 30-node Gigabit Ethernet cluster having 10 GigE core links shows that GMGD can reduce the network traffic on core links by up to 65% and the total migration time of VMs by up to 42% when compared to the default migration technique in QEMU/KVM. Furthermore, GMGD has a smaller adverse performance impact on network-bound applications.
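
As a rough illustration of the idea in the abstract, the sketch below counts how many pages across a set of VMs carry distinct content, which bounds how many pages would need to cross the core links. It is a minimal sketch under stated assumptions, not the GMGD implementation: the 4 KiB page size, the use of SHA-1 as the content hash, and the names `page_hash` and `dedup_summary` are all hypothetical.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes


def page_hash(page: bytes) -> str:
    """Content hash used to detect identical pages (SHA-1 is an assumption)."""
    return hashlib.sha1(page).hexdigest()


def dedup_summary(vm_pages: dict[str, list[bytes]]) -> dict[str, int]:
    """Count total pages versus pages with distinct content across all VMs."""
    owners = defaultdict(list)  # content hash -> [(vm_id, page_index), ...]
    for vm_id, pages in vm_pages.items():
        for index, page in enumerate(pages):
            owners[page_hash(page)].append((vm_id, index))
    total = sum(len(pages) for pages in vm_pages.values())
    return {"total_pages": total, "unique_pages": len(owners)}


# Toy input: two VMs that share a zero page and a common library page.
zero = bytes(PAGE_SIZE)
lib = b"L" * PAGE_SIZE
summary = dedup_summary({
    "vm-a": [zero, lib, b"A" * PAGE_SIZE],
    "vm-b": [zero, lib, b"B" * PAGE_SIZE],
})
print(summary)
```

In this toy input, six pages reduce to four distinct contents, so only four pages would need to be sent once each.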

21 Claims

1. A method for transfer of information comprising a plurality of memory pages or sub-pages to a plurality of servers in a server rack, comprising:
- determining which of the plurality of memory pages or memory sub-pages have respectively unique content with respect to the plurality of memory pages or sub-pages;
  
  transferring a copy of each memory page or sub-page having unique content to the server rack, substantially without transferring a memory page having redundant content;
  
  determining which of the plurality of servers in the server rack require a respective memory page or sub-page having unique content; and
  
  duplicating the respective memory page or sub-page having unique content within the server rack for each of the plurality of servers in the server rack that requires, but did not receive, the copy of the memory page or sub-page having unique content.
  - 2. The method according to claim 1, wherein the transferred copy of a respective memory page or sub-page having unique content is transferred to a respective server in the server rack, and the respective server executes a process to copy the transferred copy of the respective memory page or sub-page for other servers within the server rack that require the transferred copy of the respective memory page or sub-page.
  - 3. The method according to claim 1, wherein the plurality of servers are involved in a gang migration of a plurality of live servers not in the server rack to the plurality of servers in the server rack, whose live functioning is assumed by the plurality of servers in the server rack, each respective live server not in the server rack having at least an associated central processing unit state and a memory state which is transferred to a respective server in the server rack.
  - 4. The method according to claim 1, wherein the plurality of servers are organized in a cluster, running a plurality of virtual machines, which communicate with each other using a communication medium selected from the group consisting of Gigabit Ethernet, 10 GigE, or Infiniband.
  - 5. The method according to claim 1, wherein the plurality of servers implement a plurality of virtual machines, and said determining which of the plurality of servers in the server rack require a respective memory page or memory sub-page having the unique content comprises determining, for each virtual machine, a hash for each memory page or sub-page used by the respective virtual machine.
  - 6. The method according to claim 1, wherein the plurality of servers in the server rack implement a plurality of virtual machines before the transferring, and suppress transmission of memory pages or sub-pages already available in the server rack during a gang migration.
  - 7. The method according to claim 1, wherein said transferring a copy of each memory page or sub-page having unique content to the server rack, substantially without transferring a memory page having redundant content comprises selectively suppressing a transfer of memory pages or sub-pages already stored in the server rack by a process comprising:
    - computing in real time hashes of the memory pages or sub-pages in the server rack;
      
      storing the computed real time hashes in a hash table;
      
      receiving a hash representing a respective memory page or sub-page of a virtual machine to be migrated to the server rack;
      
      comparing the received hash to the computed real time hashes in the hash table and determining a correspondence; and
      
      if the hash does not correspond to a computed real time hash in the hash table, adding the hash of the memory page or sub-page of a virtual machine to be migrated to the server rack to the hash table, and transferring the copy of the respective memory page or sub-page of the virtual machine to be migrated to the server rack; and
      
      if the hash corresponds to a computed real time hash in the hash table, duplicating the memory page or sub-page within the server rack associated with the entry in the hash table, and suppressing the transferring of the copy of the memory page or sub-page associated with the received hash to the server rack.
  - 8. The method according to claim 1, wherein said transferring is prioritized with respect to at least one of a memory page or sub-page dirtying rate and a delta difference for dirtied memory pages or sub-pages.
  - 9. The method according to claim 1, wherein each of a plurality of virtual machines outside of the server rack transfer memory pages or sub-pages to the server rack, in a desynchronized manner to avoid a race condition wherein different copies of the same memory page or sub-page from different virtual machines outside of the server rack are sent to the server rack concurrently.
  - 10. The method according to claim 1, wherein said determining which of the plurality of servers in the server rack require a respective memory page or sub-page have redundant content comprises:
    - implementing a distributed indexing mechanism which computes content hashes on a plurality of memory pages or sub-pages of respective virtual machines executing on the plurality of servers in the server rack; and
      
      responding to a query for a respective memory page or sub-page with a location of a respective memory page or sub-page having identical memory content.
  - 11. The method according to claim 10, wherein the distributed indexing mechanism comprises at least one of a distributed hash table and a centralized indexing server.
  - 12. The method according to claim 1, wherein each memory page or sub-page within the server rack has a unique identifier comprising a respective identification of an associated virtual machine, an identification of a respective server in the server rack, a page or sub-page offset and a content hash.
  - 13. The method according to claim 1, further comprising:
    - initiating a live migration of a respective virtual machine from outside the server rack to the server rack, comprising the transferring a copy of each memory page or sub-page having unique content; and
      
      maintaining a copy of the respective virtual machine outside the server rack until at least the live migration of the respective virtual machine is completed.
  - 14. The method according to claim 1, wherein the transferred copy of each memory page or sub-page having unique content is initially stored in at least one source server rack, having a plurality of source servers, wherein each of the at least one source server rack comprises a deduplication server, the respective deduplication server rack:
    - determining a hash of each memory page or sub-page in the respective source server rack;
      
      storing the determined hashes of the memory pages or sub-pages of each source server rack in a respective source server rack hash table, along with a list of duplicate memory pages or sub-pages; and
      
      controlling a deduplication of the memory pages or sub-pages within the source server rack before said transferring the copy of the memory page having the unique content to the server rack.
  - 15. The method according to claim 14, wherein the deduplication server at a source server rack further receives from the server rack a list of which servers of the plurality of servers in the server rack that require a copy of a respective memory page or sub-page.
  - 16. The method according to claim 15, wherein the deduplication server in a respective source server rack:
    - receives from the server rack a list of which of the plurality of servers in the server rack that require a copy of a respective memory page or sub-page;
      
      retrieves a copy of the respective memory page or sub-page;
      
      sends a copy of the retrieved memory page or sub-page to the server rack; and
      
      marks the memory page or sub-page as having been sent to the server rack in the source server rack hash table.
  - 17. The method according to claim 16, wherein:
    - the list of servers of the plurality of servers in the server rack is sorted in order of most recently changed memory page or sub-page of the respective server, and after a memory page or sub-page is marked as having been sent in the source server rack hash table, references to earlier versions of the sent memory page or sub-page are removed from the list without overwriting the more recent copy of the memory page or sub-page.
  - 18. The method according to claim 1, further comprising:
    - commencing a live gang migration of a plurality of virtual machines, executing on at least one source server rack;
      
      transferring the copy of each memory page or sub-page having unique content of each respective virtual machine, executing on at least one source server rack;
      
      maintaining each respective virtual machine executing on the at least one source server rack until at least one version of each memory page or sub-page of the respective virtual machine having respectively unique content is transferred to the server rack, and then inactivating the respective virtual machine;
      
      transferring versions of memory pages or sub-pages changed subsequent to said determining which of the plurality of memory pages or memory sub-pages have respectively unique content to the server rack; and
      
      activating a virtual machine on the server rack corresponding to the respective virtual machine executing on the source server rack.
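
The hash-table check recited in claim 7 can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the class `RackHashTable`, its method names, the server labels, and the choice of SHA-1 are assumptions introduced here. The table maps a content hash to a server in the rack that already holds the page; an incoming hash either triggers a transfer (unknown content) or an in-rack duplication with the transfer suppressed (known content).

```python
import hashlib


class RackHashTable:
    """Hash table of page contents already present in the target rack (hypothetical)."""

    def __init__(self) -> None:
        self.location: dict[str, str] = {}  # content hash -> server holding the page

    def index_local_page(self, server: str, page: bytes) -> None:
        # Hash a page that is already resident in the rack and record its holder.
        self.location.setdefault(hashlib.sha1(page).hexdigest(), server)

    def handle_incoming_hash(self, digest: str, dest_server: str) -> tuple[str, str]:
        """Decide what to do for a page of a VM that is migrating to dest_server."""
        holder = self.location.get(digest)
        if holder is None:
            # Unknown content: add the hash and ask the source to transfer the page.
            self.location[digest] = dest_server
            return ("transfer", dest_server)
        # Known content: copy it inside the rack and suppress the external transfer.
        return ("duplicate", holder)


table = RackHashTable()
zero_page = bytes(4096)
new_page = b"x" * 4096
table.index_local_page("server-1", zero_page)

first = table.handle_incoming_hash(hashlib.sha1(zero_page).hexdigest(), "server-2")
second = table.handle_incoming_hash(hashlib.sha1(new_page).hexdigest(), "server-2")
third = table.handle_incoming_hash(hashlib.sha1(new_page).hexdigest(), "server-3")
print(first, second, third)
```

The third lookup shows that once a page has been requested for one server, later requests for the same content are served from inside the rack.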

19. A method for gang migration of a plurality of servers to a server rack having a network link external to the server rack and an internal data distribution system for communicating within the server rack, comprising:
- determining unique memory pages which lack content redundancy with respect to other memory pages across the plurality of servers to be gang migrated to the server rack;
  
  initiating a gang migration, wherein only a single copy of each unique memory page is transferred to the server rack during the gang migration, along with a reference to the unique memory page for servers that require but do not receive a copy of the unique memory page; and
  
  after receipt of the single copy of each unique memory page within the server rack, communicating the respective unique memory page to each server that requires but did not receive the copy of the respective unique memory page, to thereby duplicate the respective unique memory page within the server rack after having received a single copy of the unique memory page.
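
One way to picture the single-copy-plus-reference scheme of claim 19 is as a two-phase plan: first decide which server receives the one full copy of each unique page, then resolve the references inside the rack. The sketch below assumes pages are identified by opaque content hashes; the functions `plan_transfers` and `resolve_in_rack` and the in-memory `stores` dictionaries are hypothetical stand-ins for the external link and the rack's internal data distribution system.

```python
def plan_transfers(required: dict[str, list[str]]):
    """Pick one recipient per unique page; every other server gets a reference."""
    first_holder: dict[str, str] = {}  # content hash -> server receiving the full copy
    references: list[tuple[str, str, str]] = []  # (server, content hash, holder)
    for server, hashes in required.items():
        for digest in hashes:
            holder = first_holder.setdefault(digest, server)
            if holder != server:
                references.append((server, digest, holder))
    return first_holder, references


def resolve_in_rack(stores: dict[str, dict[str, bytes]], references) -> None:
    """Duplicate each referenced page from its holder over the internal network."""
    for server, digest, holder in references:
        stores[server][digest] = stores[holder][digest]


required = {"s1": ["h0", "h1"], "s2": ["h0", "h2"], "s3": ["h0", "h1"]}
first_holder, references = plan_transfers(required)

# Phase 1: exactly one copy of each unique page crosses the external link.
stores = {server: {} for server in required}
for digest, holder in first_holder.items():
    stores[holder][digest] = digest.encode()  # placeholder page contents

# Phase 2: references are resolved by copying within the rack.
resolve_in_rack(stores, references)
print(len(first_holder), "external copies;", len(references), "in-rack copies")
```

Here three unique pages are sent externally, and three further copies are produced inside the rack.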

20. A system for transfer of information comprising a plurality of memory pages or sub-pages to a plurality of servers in a server rack, comprising:
- a server rack having a plurality of servers;
  
  an external network communication port configured to communicate memory pages or sub-pages and control information;
  
  an internal network communication network configured to communicate between the plurality of servers;
  
  a deduplication process executing on at least one server, configured to;
  
  determine which of the plurality of memory pages or memory sub-pages have respectively unique content with respect to the plurality of memory pages or sub-pages;
  
  control a transfer of a copy of each memory page or sub-page having unique content to the server rack through the external network communication port, substantially without transferring a memory page having redundant content;
  
  determine which of the plurality of servers in the server rack require a respective memory page or sub-page having unique content; and
  
  duplicate the respective memory page or sub-page having unique content within the server rack through the internal communication network, for each of the plurality of servers in the server rack that requires, but did not receive, the copy of the memory page or sub-page having unique content.
  - 21. The system according to claim 20, further comprising a deduplication server executing the deduplication process outside the server rack, in a source server rack, configured to:
    - determine a hash of each memory page or sub-page in the source server rack;
      
      store the determined hashes of the memory pages or sub-pages in a source server rack hash table, along with a list of duplicate memory pages or sub-pages;
      
      receive from the server rack a list of which servers of the plurality of servers in the server rack require a copy of a respective memory page or sub-page;
      
      send a copy, and after an initial copy is transferred, a reference to a location of the copy of the retrieved memory page or sub-page within the server rack, to each server of the plurality of servers in the server rack that requires a copy of the memory page or sub-page; and
      
      mark the memory page or sub-page as having been sent to the server rack in the source server rack hash table when the copy of the memory page or sub-page is transferred.
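
The bookkeeping that claim 21 assigns to the source-side deduplication server might look roughly like the following. It is a sketch under stated assumptions rather than the claimed system itself: `Entry`, `SourceDedupServer`, the message tuples, and SHA-1 hashing are all hypothetical. Each hash-table entry keeps the page, the list of duplicate (VM, offset) occurrences, and a sent flag; the first target receives a full copy and later targets receive a reference to where that copy lives in the rack.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    page: bytes
    duplicates: list = field(default_factory=list)  # (vm_id, offset) sharing this content
    sent: bool = False                              # set once a full copy has been sent
    rack_location: Optional[str] = None             # server in the rack holding the copy


class SourceDedupServer:
    """Hypothetical per-rack deduplication server on the source side."""

    def __init__(self) -> None:
        self.table: dict[str, Entry] = {}  # content hash -> entry

    def register(self, vm_id: str, offset: int, page: bytes) -> str:
        """Hash a source page and record it, tracking duplicate occurrences."""
        digest = hashlib.sha1(page).hexdigest()
        entry = self.table.setdefault(digest, Entry(page))
        entry.duplicates.append((vm_id, offset))
        return digest

    def send(self, digest: str, targets: list[str]) -> list[tuple]:
        """Build messages for the servers in the rack that require this page."""
        entry = self.table[digest]
        messages = []
        for server in targets:
            if not entry.sent:
                messages.append((server, "copy", entry.page))
                entry.sent = True  # mark as sent in the source hash table
                entry.rack_location = server
            else:
                messages.append((server, "ref", entry.rack_location))
        return messages


dedup = SourceDedupServer()
page = b"p" * 4096
digest = dedup.register("vm-a", 0, page)
dedup.register("vm-b", 7, page)  # same content, recorded as a duplicate
messages = dedup.send(digest, ["s1", "s2"])
print([(server, kind) for server, kind, _ in messages])
```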

Current Assignee: The Research Foundation for The State University of New York (State University of New York)
Original Assignee: The Research Foundation for The State University of New York (State University of New York)
Inventors: Gopalan, Kartik; Deshpande, Umesh
Primary Examiner(s): Rossiter, Sean D

Application Number: US14/709,957
Publication Number: US 20150324236A1
Time in Patent Office: 924 Days
Field of Search: 711161, 711162
US Class Current
CPC Class Codes

G06F 11/1453   using de-duplication of the...

G06F 2009/45562   Creating, deleting, cloning...

G06F 3/06   Digital input from, or digi...

G06F 3/0619   in relation to data integri...

G06F 3/0641   De-duplication techniques

G06F 3/065   Replication mechanisms

G06F 3/067   Distributed or networked st...

G06F 9/455   Emulation; Interpretation; ...

G06F 9/45558   Hypervisor-specific managem...

G06F 9/5027   the resource being a machin...

G06F 9/5088   involving task migration
