VMware vMotion: how it works in the background while moving a VM from one ESXi host to another

- May 13, 2018

Though vMotion and Storage vMotion each serve a different purpose, they share a large amount of technology, so these two vSphere functions fit well together in one chapter. Both technologies are proactive: vMotion migrates virtual machines between running ESX hosts, while Storage vMotion migrates a virtual machine's data between fully functional datastores. Neither feature can be used once the source or the destination is offline.

VMware vMotion and Storage vMotion effectively protect against planned downtime because they bridge maintenance windows on hosts and datastores without any interruption of service. However, contrary to a popular assumption, they are not high-availability tools: they cannot help once a host or datastore has already failed.

vMotion

Most readers will already have heard of VMware vMotion, the live migration function within VMware vSphere, so this introduction will be brief. vMotion is the tool with which running virtual machines can be migrated from one ESX host to another without any interruption to the virtual machines themselves or to the services they provide. Before vMotion, such a move was only possible as a cold migration with the VM powered off. With vSphere 5.x, the earlier limits on simultaneous migrations per ESXi host and per datastore were raised.

Figure 1: VMware vMotion and VMware Storage vMotion make it possible to move a running VM between hosts and between datastores, respectively.

The vMotion process has become so refined that, even in trade-show tests involving hundreds of thousands of virtual machines, vMotion migrations never lost a VM or disrupted a VM’s services. However, not every virtual machine is well suited for vMotion. This point is discussed later in the chapter.

Functionality

On closer inspection, the way vMotion works is both simple and ingenious. It keeps systems available during a traditionally problematic time: maintenance of the host. vMotion also shows how important it is to separate the hardware from both the operating system and the applications.

Let us look at the details of the vMotion process from the point of view of a virtual machine.

Figure 3: The vMotion procedure from the point of view of the virtual machine

The first step is to ensure that the source VM can be operated on the chosen destination server.
Then a second VM process is started on the target system and the resources are reserved.
Next a system memory checkpoint is created. This means all changes to the source VM are written to an extra memory area.
The contents of the system memory recorded at the checkpoint are transferred to the target VM.
The checkpoint/checkpoint-restore process is repeated until only a very small set of changes remains to be transferred to the target VM’s memory.
The CPU of the source VM is stopped.
The last modifications to the main memory are transferred to the target VM in milliseconds.
The vMotion process is ended and a reverse ARP packet is sent to the physical switch (important: Notify Switches must be activated in the properties of the virtual switch). Hard disk access is taken over by the target ESX.
The source VM is shut down. This means the VM process on the source ESX is deleted.
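
The memory-related steps above can be sketched in a few lines of Python. This is only a toy model and not VMware's implementation: the function name live_migrate, the page dictionaries, the writes_per_round workload and the stop_threshold parameter are all assumptions made for illustration. It shows the idea of copying memory while the guest keeps running, tracking the pages that were changed in the meantime, and pausing the source only for the small remainder.

```python
def live_migrate(source_pages, writes_per_round, stop_threshold):
    """Toy model of iterative pre-copy (hypothetical, not VMware code).

    source_pages:     dict page_number -> bytes, the source VM's memory
                      (modified in place by the simulated guest writes)
    writes_per_round: list of dicts page_number -> bytes, the guest writes
                      that happen while each copy round is in flight
    stop_threshold:   pause the source once this few dirty pages remain
    """
    target_pages = {}
    dirty = set(source_pages)      # first round: every page must be sent
    rounds = 0
    workload = iter(writes_per_round)

    while len(dirty) > stop_threshold:
        # Checkpoint: freeze the list of pages to send in this round.
        snapshot = {page: source_pages[page] for page in dirty}
        dirty = set()

        # The guest keeps running, so writes during the transfer
        # mark pages dirty again.
        for page, data in next(workload, {}).items():
            source_pages[page] = data
            dirty.add(page)

        # Checkpoint restore on the target side.
        target_pages.update(snapshot)
        rounds += 1

    # Source CPU is stopped here: no further writes can occur, so the
    # last dirty pages are sent once and the target is complete.
    for page in dirty:
        target_pages[page] = source_pages[page]

    return target_pages, rounds


memory = {page: b"v0" for page in range(8)}
guest_writes = [
    {0: b"a", 1: b"b", 2: b"c", 3: b"d"},
    {0: b"e", 1: b"f"},
    {0: b"g"},
]
migrated, rounds = live_migrate(memory, guest_writes, stop_threshold=1)
print(rounds, migrated == memory)
```

The loop only terminates quickly when the guest dirties fewer pages per round than the previous round had to send, which is the same convergence condition that makes the real pre-copy phase shrink from pass to pass.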

One additional comment about what the vMotion checkpoints record:

all devices and their status
CPU registers
main memory contents
a serialization of the status for transmission over the network
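
To make the list above more concrete, here is a minimal sketch of what such a checkpoint could look like as a data structure. The real vMotion checkpoint format is not described here, so everything in this example is an assumption: the Checkpoint class, its field names and the use of JSON as the wire format are purely illustrative.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Checkpoint:
    """Hypothetical container for the state listed above."""
    devices: dict        # device name -> device status
    cpu_registers: dict  # register name -> value
    memory: list         # page contents, hex-encoded for transport

    def serialize(self) -> bytes:
        # Turn the whole state into bytes that can be sent over the network.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def deserialize(cls, payload: bytes) -> "Checkpoint":
        # Rebuild the checkpoint on the receiving side.
        return cls(**json.loads(payload.decode("utf-8")))


checkpoint = Checkpoint(
    devices={"vnic0": "link-up"},
    cpu_registers={"rip": 4096},
    memory=["00ff", "a1b2"],
)
restored = Checkpoint.deserialize(checkpoint.serialize())
print(restored == checkpoint)
```

The point of the sketch is only that device state, CPU registers and memory travel together as one serialized unit, so the target can resume exactly where the source stopped.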

As you can see, vMotion is mostly concerned with transferring the main memory contents from one ESX server to another. Once the process is finished, a final notification tells the physical network about the new interface over which the VM is reachable. The guest system, of course, does not notice anything.

The following table shows an example of how the memory transfer can be calculated.

Pre-copy iteration	Main memory to be transferred	Time needed for the transfer	Memory changed during the transfer
1	2,048 MB	16 seconds	512 MB
2	512 MB	4 seconds	128 MB
3	128 MB	1 second	32 MB
4	32 MB	0.25 seconds	8 MB
5	8 MB	~0.06 seconds	vMotion cutoff, since the residual transfer is this short

Table: Main memory copy during vMotion

As you can see in Table 1.1, the main memory is copied in several successive passes until the remainder is small enough that the CPU can be stopped without causing a system crash.
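
The numbers in the table can be reproduced with a short calculation. The sketch below assumes a constant transfer rate of 128 MB/s (2,048 MB in 16 seconds) and a constant memory change rate of 32 MB/s (512 MB in 16 seconds), both derived from the first table row. The function name precopy_schedule and the cutoff of 0.1 seconds are assumptions for illustration; the actual threshold vMotion uses is not stated here.

```python
def precopy_schedule(memory_mb, bandwidth_mb_s, dirty_rate_mb_s, cutoff_s):
    """Return rows of (iteration, MB to send, seconds, MB dirtied).

    The last row has None in the final column: it is the pass that is
    short enough to run with the source CPU stopped.
    """
    if dirty_rate_mb_s >= bandwidth_mb_s:
        # Memory changes faster than it can be sent: the passes never shrink.
        raise ValueError("pre-copy cannot converge")

    rows = []
    to_send = float(memory_mb)
    iteration = 1
    while True:
        seconds = to_send / bandwidth_mb_s
        if seconds <= cutoff_s:
            rows.append((iteration, to_send, seconds, None))
            return rows
        # Memory that changes while this pass is on the wire
        # becomes the payload of the next pass.
        dirtied = seconds * dirty_rate_mb_s
        rows.append((iteration, to_send, seconds, dirtied))
        to_send = dirtied
        iteration += 1


for row in precopy_schedule(2048, 128, 32, cutoff_s=0.1):
    print(row)
```

With these rates every pass is a quarter of the previous one (32 / 128), which is why the table drops from 2,048 MB to 8 MB in only four pre-copy passes.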

vMotion is made up of several components, each responsible for managing a part of the process. vCenter performs the initial configuration check and starts the process via the vpxa and hostd components, which start a pseudo-VM as a container on the target host (see Figure 2). The vMotion module then starts the actual vMotion process and controls the data transfer.

Figure 2: Components of VMware vMotion – VMkernel adapter for vMotion

vCenter validates and starts the process, but it is not involved in the actual data transfer. An active vMotion process can therefore run to completion even if vCenter crashes. After such a crash, vCenter may still list the source VM in its database and not yet know about the target VM in its new location. If this happens, it helps to restart the management agent or to disconnect and reconnect the ESX host in vCenter.

The vMotion interface must be configured by an administrator and is bound to a VMkernel port. This means vMotion data transfer must be enabled in the settings of that VMkernel port.
