vSphere
⇚==================================================================================================⇛
- February 06, 2013
VMware vSphere Metro Storage Cluster (vMSC):
VMware vSphere Metro Storage Cluster (vMSC) is not a product but a certified configuration in which a vSphere cluster spans geographical locations. This could be spread across a campus, a metropolitan area, or a larger area with sites up to 200 km apart.
vMSC was introduced with vSphere 5.0.
It relies on a stretched storage solution, such as NetApp MetroCluster, and a stretched layer 2 VLAN. The storage must be treated as a single storage solution that spans both sites. The storage is synchronously replicated between the sites so that both sites are always in sync and there is zero data loss in the event of a failure. The storage solution must allow the datastores/LUNs to be accessed from either location.
This brings the functionality of a local VMware vSphere cluster to hosts spread across two locations so that VMware HA, DRS and vMotion can be performed across the sites as if all the hosts were local. But is this a better solution than SRM? They are different solutions aimed at resolving different problems. vMSC is targeted at disaster avoidance whereas SRM is targeted at disaster recovery.
vMSC achieves disaster avoidance by allowing you to move workloads off failing components without outages.
SRM achieves disaster recovery by automating recovery plans to bring workloads back online in a controlled manner following a disaster.
SRM can be used in a planned migration to move workloads from one site to another site, for example when maintenance is required at the primary site; however an outage is always required to move the workloads with SRM. vMSC can restart workloads at the secondary site when the other site fails using VMware HA but there is little control over the order the failed workloads restart.
⇚==================================================================================================⇛
- January 07, 2013
vSphere VM Memory Statistics:
There still seems to be some confusion over what the Active Memory statistic of a virtual machine within vSphere actually represents. See the example below.
You can see that this virtual machine is configured with 4096 MB of memory and the Active Guest Memory is being reported as 696 MB. So what is this Active Guest Memory? If I look within the operating system of this virtual machine I can see that at this point in time it is using 2.48 GB of memory, see below.
VMware defines Active Memory as the “Amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages”. The overhead of monitoring every memory page access would be too much for the hypervisor, so random sampling is used to estimate the active memory. So, how recent is “recently”? This depends on where you are looking at the statistic; in vCenter it is refreshed every 20 seconds, so it shows the amount of memory that has been touched in the last 20 seconds.
If we look at the memory charts in vCenter for this virtual machine we can see that the active memory is fairly static at between 750 MB and 1,000 MB.
You could assume from this that the virtual machine is only using about 1 GB of memory; however, we saw the operating system reporting that about 2.5 GB was in use. What we don’t know from the graph is whether the same memory pages were touched in each sampling period. It could be that in one 20-second period 750 MB of memory was accessed and in the next 20 seconds a different 750 MB of memory was accessed. For this reason, this statistic is not a good value to use when right-sizing the virtual machine.
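To make that point concrete, here is a toy Python simulation (not VMware’s actual sampling algorithm; the page size and working-set figures are made up) showing how a guest that touches a different ~750 MB region every 20 seconds reports a fairly static active memory value even though far more memory is used over time.

```python
import random

# Toy simulation only -- not VMware's sampling algorithm. A guest with
# 4096 MB touches a different ~750 MB region of pages during every
# 20-second interval. The per-interval "active" figure stays flat even
# though the total amount of memory touched keeps growing.
PAGE_MB = 4                      # model memory as 1024 pages of 4 MB
TOTAL_PAGES = 1024               # 4096 MB in total
ACTIVE_PAGES_PER_INTERVAL = 188  # roughly 750 MB per 20-second sample

touched_overall = set()
for interval in range(8):
    touched_now = set(random.sample(range(TOTAL_PAGES), ACTIVE_PAGES_PER_INTERVAL))
    touched_overall |= touched_now
    print(f"interval {interval}: active ~{len(touched_now) * PAGE_MB} MB, "
          f"touched so far ~{len(touched_overall) * PAGE_MB} MB")
```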
Consumed Memory
What I notice for Windows servers is that the Consumed Host Memory is almost always equivalent to the amount of memory configured on the virtual machine. Taking the example above, where the virtual machine is configured with 4096 MB of memory, the consumed memory is reported as 4144 MB. This is higher than the configured memory because it also includes the memory overhead used by the hypervisor to run the virtual machine. My understanding of why Windows servers appear to consume all of the memory they are allocated, even though the operating system in this case reports only 2.5 GB in use, is that Windows zeroes out all of its memory pages at boot, and the hypervisor therefore has to back all of those pages for the virtual machine.
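If you want to pull these counters for all virtual machines programmatically, a minimal pyVmomi sketch along the following lines should work; the vCenter address, credentials and certificate handling are placeholders, and the property names reflect my understanding of the vSphere API (guestMemoryUsage for active, hostMemoryUsage for consumed).

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder connection details -- adjust for your environment.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)

view = si.content.viewManager.CreateContainerView(
    si.content.rootFolder, [vim.VirtualMachine], True)
for vm in view.view:
    qs = vm.summary.quickStats
    print(f"{vm.name}: configured {vm.summary.config.memorySizeMB} MB, "
          f"active {qs.guestMemoryUsage} MB, consumed {qs.hostMemoryUsage} MB")

Disconnect(si)
```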
⇚==================================================================================================⇛
-January 03, 2013
RDM Path Selection Policy (PSP) with Microsoft Clustering
Prior to vSphere 5.5 the Round Robin Path Selection Policy (VMW_PSP_RR) is not supported for the shared disks of a Microsoft cluster. You may find that the ESXi multipathing claim rules are set so that when the RDMs are discovered the PSP is automatically set to Round Robin, so you will want to change this on the shared RDMs (a scripted sketch is included at the end of this entry). You will probably want to keep the Round Robin PSP for the other non-shared disks, such as LUNs used for VMFS volumes; therefore you probably do not want to change the default claim rules, but rather change the PSP on the individual shared devices.
From vSphere 5.5 the Round Robin PSP is supported for the cluster's shared disks.
There is also a NetApp Knowledge Base article that states that ALUA should not be enabled on the igroup when using Microsoft clustering with shared RDMs prior to vSphere 5.5. See https://kb.netapp.com/support/index?page=content&id=2013316. It offers three solutions:
- Disable ALUA on the igroup for the ESXi hosts with Microsoft Windows Clustered servers.
- Use dedicated initiators for the shared clustered RDMs with ALUA disabled and different initiators for the other LUNs such as VMFS volumes and non-shared RDMs.
- Use iSCSI within the Windows Servers for the shared disks.
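For completeness, here is a hedged pyVmomi sketch of changing the PSP on the shared RDM devices only, leaving the default claim rules untouched. The device identifiers, host name and chosen PSP (VMW_PSP_FIXED) are placeholders; use the PSP your storage vendor recommends, and verify the SetMultipathLunPolicy call against your vSphere API version before relying on it.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Hypothetical NAA identifiers of the shared cluster RDM LUNs.
SHARED_RDM_DEVICES = {"naa.60a98000aaaabbbb1111", "naa.60a98000aaaabbbb2222"}

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
host = si.content.searchIndex.FindByDnsName(dnsName="esxi01.example.com",
                                            vmSearch=False)
storage = host.configManager.storageSystem

# Map ScsiLun keys to canonical names so we can match on the NAA identifier.
canonical = {lun.key: lun.canonicalName for lun in storage.storageDeviceInfo.scsiLun}

for mp_lun in storage.storageDeviceInfo.multipathInfo.lun:
    if canonical.get(mp_lun.lun) in SHARED_RDM_DEVICES:
        # VMW_PSP_FIXED is only an example; use the PSP your vendor recommends.
        policy = vim.host.MultipathInfo.LogicalUnitPolicy(policy="VMW_PSP_FIXED")
        storage.SetMultipathLunPolicy(lunId=mp_lun.id, policy=policy)
        print("Changed PSP on", canonical[mp_lun.lun])
```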
⇚==================================================================================================⇛
HA Deepdive
Introduction to vSphere High Availability
Availability has traditionally been one of the most important aspects when providing services. When providing services on a shared platform like VMware vSphere, the impact of downtime exponentially grows as many services run on a single physical machine. As such VMware engineered a feature called VMware vSphere High Availability. VMware vSphere High Availability, hereafter simply referred to as HA, provides a simple and cost effective solution to increase availability for any application running in a virtual machine regardless of its operating system. It is configured using a couple of simple steps through vCenter Server (vCenter) and as such provides a uniform and simple interface. HA enables you to create a cluster out of multiple ESXi hosts. This will allow you to protect virtual machines and their workloads. In the event of a failure of one of the hosts in the cluster, impacted virtual machines are automatically restarted on other ESXi hosts within that same VMware vSphere Cluster (cluster).
Figure 2 - OS Level HA just a single click away
Unlike many other clustering solutions, HA is a simple solution to implement and literally enabled within 5 clicks. On top of that, HA is widely adopted and used in all kinds of situations. However, HA is not a 1:1 replacement for solutions like Microsoft Clustering Services / Windows Server Failover Clustering (WSFC). The main difference between WSFC and HA is that WSFC was designed to protect stateful cluster-aware applications, while HA was designed to protect any virtual machine regardless of the type of workload within it, although it can also be extended to the application layer through the use of VM and Application Monitoring.
In the case of HA, a fail-over incurs downtime as the virtual machine is literally restarted on one of the remaining hosts in the cluster, whereas WSFC transitions the service to one of the remaining nodes in the cluster when a failure occurs. Contrary to what many believe, WSFC does not guarantee that there is no downtime during a transition. On top of that, your application needs to be cluster-aware and stateful in order to get the most out of this mechanism, which limits the number of workloads that could really benefit from this type of clustering.
One might ask why you would want to use HA when a virtual machine is restarted and service is temporarily lost. The answer is simple: not all virtual machines (or services) need 99.999% uptime. For many services, the type of availability HA provides is more than sufficient. On top of that, many applications were never designed to run on top of a WSFC cluster. This means that there is no guarantee of availability or data consistency if an application is clustered with WSFC but is not cluster-aware.
In addition, WSFC clustering can be complex and requires special skills and training. One example is managing patches and updates/upgrades in a WSFC environment; this could even lead to more downtime if not operated correctly and definitely complicates operational procedures. HA however reduces complexity, costs (associated with downtime and MSCS), resource overhead and unplanned downtime for minimal additional costs. It is important to note that HA, contrary to WSFC, does not require any changes to the guest as HA is provided on the hypervisor level. Also, VM Monitoring does not require any additional software or OS modifications except for VMware Tools, which should be installed anyway as a best practice. In case even higher availability is required, VMware also provides a level of application awareness through Application Monitoring, which has been leveraged by partners like Symantec to enable application level resiliency and could be used by in-house development teams to increase resiliency for their application.
HA has proven itself over and over again and is widely adopted within the industry; if you are not using it today, hopefully you will be convinced after reading this section of the book.
vSphere 6.0
Before we dive into the main constructs of HA and describe all the choices one has to make when configuring HA, we will first briefly touch on what’s new in vSphere 6.0 and describe the basic requirements and steps needed to enable HA. This book covers all the released versions of what is known within VMware as “Fault Domain Manager” (FDM) which was introduced with vSphere 5.0. We will call out the differences in behavior in the different versions where applicable, our baseline however is vSphere 6.0.
What’s New in 6.0?
Compared to vSphere 5.5, the changes introduced with vSphere 6.0 for HA appear to be minor. However, some of the new functionality will make life much easier for many of you. Although the list is relatively short, from an engineering point of view many of these items took an enormous effort, as they required changes to the deep fundamentals of the HA architecture.
- Support for Virtual Volumes – With Virtual Volumes a new type of storage entity is introduced in vSphere 6.0. This has also resulted in some changes in the HA architecture to accommodate for this new way of storing virtual machines
- Support for Virtual SAN – This was actually introduced with vSphere 5.5, but as it is new to many of you and led to changes in the architecture we decided to include it in this update
- VM Component Protection – This allows HA to respond to a scenario where the connection to the virtual machine’s datastore is impacted temporarily or permanently
- HA “Response for Datastore with All Paths Down”
- HA “Response for Datastore with Permanent Device Loss”
- Increased host scale – Cluster limit has grown from 32 to 64 hosts
- Increased VM scale – Cluster limit has grown from 4000 VMs to 8000 VMs per cluster
- Secure RPC – Secures the VM/App monitoring channel
- Full IPv6 support
- Registration of “HA Disabled” VMs on hosts after failure
What is required for HA to Work?
Each feature or product has very specific requirements and HA is no different. Knowing the requirements of HA is part of the basics we have to cover before diving into some of the more complex concepts. For those who are completely new to HA, we will also show you how to configure it.
Prerequisites
Before enabling HA, it is highly recommended to validate that the environment meets all the prerequisites. We have also included recommendations from an infrastructure perspective that will enhance resiliency.
Requirements:
- Minimum of two ESXi hosts
- Minimum of 5GB memory per host to install ESXi and enable HA
- VMware vCenter Server
- Shared Storage for virtual machines
- Pingable gateway or other reliable address
Recommendations:
- Redundant Management Network (not a requirement, but highly recommended)
- 8GB of memory or more per host
- Multiple shared datastores
Firewall Requirements
The following table contains the ports that are used by HA for communication. If your environment contains firewalls external to the host, ensure these ports are opened for HA to function correctly. HA will open the required ports on the ESX or ESXi firewall.
Port | Protocol | Direction |
---|---|---|
8182 | UDP | Inbound |
8182 | TCP | Inbound |
8182 | UDP | Outbound |
8182 | TCP | Outbound |
Configuring vSphere High Availability
HA can be configured with the default settings within a couple of clicks. The following steps will show you how to create a cluster and enable HA, including VM Monitoring, using the vSphere Web Client; a scripted equivalent follows the steps. Each of the settings and the design decisions associated with these steps will be described in more depth in the following chapters.
- Click “Hosts & Clusters” under Inventories on the Home tab.
- Right-click the Datacenter in the Inventory tree and click New Cluster.
- Give the new cluster an appropriate name. We recommend at a minimum including the location of the cluster and a sequence number, e.g. ams-hadrs-001.
- Select Turn On vSphere HA.
- Ensure “Enable host monitoring” and “Enable admission control” are selected.
- Select “Percentage of cluster resources…” under Policy and specify a percentage.
- Enable VM Monitoring Status by selecting “VM and Application Monitoring”.
- Click “OK” to complete the creation of the cluster.
Figure 3 - Ready to complete the New Cluster Wizard
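The wizard steps above can also be scripted. The following is a minimal pyVmomi sketch under the assumption that the first child of the root folder is your datacenter; the hostnames, credentials, cluster name and 25% failover percentages are placeholders.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Placeholder credentials and names; assumes the first child of the root
# folder is the datacenter you want to create the cluster in.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
datacenter = si.content.rootFolder.childEntity[0]

das = vim.cluster.DasConfigInfo(
    enabled=True,                       # Turn On vSphere HA
    hostMonitoring="enabled",           # Enable host monitoring
    vmMonitoring="vmAndAppMonitoring",  # VM and Application Monitoring
    admissionControlEnabled=True,       # Enable admission control
    admissionControlPolicy=vim.cluster.FailoverResourcesAdmissionControlPolicy(
        cpuFailoverResourcesPercent=25,      # example percentages only
        memoryFailoverResourcesPercent=25))

spec = vim.cluster.ConfigSpecEx(dasConfig=das)
cluster = datacenter.hostFolder.CreateClusterEx(name="ams-hadrs-001", spec=spec)
print("Created cluster:", cluster.name)
```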
When the HA cluster has been created, the ESXi hosts can be added to the cluster simply by right clicking the host and selecting “Move To”, if they were already added to vCenter, or by right clicking the cluster and selecting “Add Host”.
When an ESXi host is added to the newly-created cluster, the HA agent will be loaded and configured. Once this has completed, HA will enable protection of the workloads running on this ESXi host.
As we have clearly demonstrated, HA is a simple clustering solution that will allow you to protect virtual machines against host failure and operating system failure in literally minutes. Understanding the architecture of HA will enable you to reach that extra 9 when it comes to availability. The following chapters will discuss the architecture and fundamental concepts of HA. We will also discuss all decision-making moments to ensure you will configure HA in such a way that it meets the requirements of your or your customer’s environment.
Components of High Availability
Now that we know what the prerequisites are and how to configure HA, the next step is to describe the components that form HA. Keep in mind that this is still a “high level” overview. There is more under the covers that we will explain in the following chapters. The following diagram depicts a two-host cluster and shows the key HA components.
As you can clearly see, there are three major components that form the foundation for HA as of vSphere 6.0:
- FDM
- HOSTD
- vCenter
The first and probably the most important component that forms HA is FDM (Fault Domain Manager). This is the HA agent.
The FDM Agent is responsible for many tasks such as communicating host resource information, virtual machine states and HA properties to other hosts in the cluster. FDM also handles heartbeat mechanisms, virtual machine placement, virtual machine restarts, logging and much more. We are not going to discuss all of this in-depth separately as we feel that this will complicate things too much.
FDM, in our opinion, is one of the most important agents on an ESXi host, when HA is enabled, of course, and we are assuming this is the case. The engineers recognized this importance and added an extra level of resiliency to HA. FDM uses a single-process agent. However, FDM spawns a watchdog process. In the unlikely event of an agent failure, the watchdog functionality will pick up on this and restart the agent to ensure HA functionality remains without anyone ever noticing it failed. The agent is also resilient to network interruptions and “all paths down” (APD) conditions. Inter-host communication automatically uses another communication path (if the host is configured with redundant management networks) in the case of a network failure.
HA has no dependency on DNS as it works with IP addresses only. This is one of the major improvements that FDM brought. This does not mean that ESXi hosts need to be registered with their IP addresses in vCenter; it is still a best practice to register ESXi hosts by their fully qualified domain name (FQDN) in vCenter. Although HA does not depend on DNS, remember that other services may depend on it. On top of that, monitoring and troubleshooting will be much easier when hosts are correctly registered within vCenter and have a valid FQDN.
Basic design principle: Although HA is not dependent on DNS, it is still recommended to register the hosts with their FQDN for ease of operations/management.
vSphere HA also has a standardized logging mechanism, where a single log file has been created for all operational log messages; it is called fdm.log. This log file is stored under /var/log/ as depicted in Figure 5.
HOSTD Agent
One of the most crucial agents on a host is HOSTD. This agent is responsible for many of the tasks we take for granted like powering on virtual machines. FDM talks directly to HOSTD and vCenter, so it is not dependent on VPXA, like in previous releases. This is, of course, to avoid any unnecessary overhead and dependencies, making HA more reliable than ever before and enabling HA to respond faster to power-on requests. That ultimately results in higher VM uptime.
When, for whatever reason, HOSTD is unavailable or not yet running after a restart, the host will not participate in any FDM-related processes. FDM relies on HOSTD for information about the virtual machines that are registered to the host, and manages the virtual machines using HOSTD APIs. In short, FDM is dependent on HOSTD and if HOSTD is not operational, FDM halts all functions and waits for HOSTD to become operational.
vCenter
That brings us to our final component, the vCenter Server. vCenter is the core of every vSphere Cluster and is responsible for many tasks these days. For our purposes, the following are the most important and the ones we will discuss in more detail:
- Deploying and configuring HA Agents
- Communication of cluster configuration changes
- Protection of virtual machines
vCenter is responsible for pushing out the FDM agent to the ESXi hosts when applicable. The push of these agents is done in parallel to allow for faster deployment and configuration of multiple hosts in a cluster. vCenter is also responsible for communicating configuration changes in the cluster to the host which is elected as the master. We will discuss this concept of master and slaves in the following chapter. Examples of configuration changes are modification or addition of an advanced setting or the introduction of a new host into the cluster.
HA leverages vCenter to retrieve information about the status of virtual machines and, of course, vCenter is used to display the protection status (Figure 6) of virtual machines. (What “virtual machine protection” actually means will be discussed in chapter 3.) On top of that, vCenter is responsible for the protection and unprotection of virtual machines. This not only applies to user initiated power-offs or power-ons of virtual machines, but also in the case where an ESXi host is disconnected from vCenter at which point vCenter will request the master HA agent to unprotect the affected virtual machines.
Although HA is configured by vCenter and exchanges virtual machine state information with it, vCenter is not involved when HA responds to a failure. It is comforting to know that in the case of a failure of the host running the virtualized vCenter Server, HA takes care of the failure and restarts the vCenter Server on another host, along with all other configured virtual machines from that failed host.
There is a corner case scenario with regards to vCenter failure: if the ESXi hosts are so called “stateless hosts” and Distributed vSwitches are used for the management network, virtual machine restarts will not be attempted until vCenter is restarted. For stateless environments, vCenter and Auto Deploy availability is key as the ESXi hosts literally depend on them.
If vCenter is unavailable, it will not be possible to make changes to the configuration of the cluster. vCenter is the source of truth for the set of virtual machines that are protected, the cluster configuration, the virtual machine-to-host compatibility information, and the host membership. So, while HA, by design, will respond to failures without vCenter, HA relies on vCenter to be available to configure or monitor the cluster.
When a virtual vCenter Server, or the vCenter Server Appliance, has been implemented, we recommend setting the correct HA restart priorities for it. Although vCenter Server is not required to restart virtual machines, there are multiple components that rely on vCenter and, as such, a speedy recovery is desired. When configuring your vCenter virtual machine with a high priority for restarts, remember to include all services on which your vCenter server depends for a successful restart: DNS, MS AD and MS SQL (or any other database server you are using).
Basic design principles:
- In stateless environments, ensure vCenter and Auto Deploy are highly available as recovery time of your virtual machines might be dependent on them.
- Understand the impact of virtualizing vCenter. Ensure it has high priority for restarts and ensure that services which vCenter Server depends on are available: DNS, AD and database.
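Building on the design principles above, a hedged pyVmomi sketch for raising the restart priority of the vCenter virtual machine might look as follows. It assumes `cluster` and `vcenter_vm` are managed objects you have already looked up (for example via a container view) and that the per-VM override is expressed through a dasVmConfigSpec entry.

```python
from pyVmomi import vim

# Assumes `cluster` and `vcenter_vm` are managed objects you have already
# retrieved, e.g. via a container view. Use operation="edit" if an override
# for this VM already exists.
override = vim.cluster.DasVmConfigSpec(
    operation="add",
    info=vim.cluster.DasVmConfigInfo(
        key=vcenter_vm,
        dasSettings=vim.cluster.DasVmSettings(restartPriority="high")))

spec = vim.cluster.ConfigSpecEx(dasVmConfigSpec=[override])
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```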
Fundamental Concepts
Now that you know about the components of HA, it is time to start talking about some of the fundamental concepts of HA clusters:
- Master / Slave agents
- Heartbeating
- Isolated vs Network partitioned
- Virtual Machine Protection
- Component Protection
Everyone who has implemented vSphere knows that multiple hosts can be configured into a cluster. A cluster can best be seen as a collection of resources. These resources can be carved up with the use of vSphere Distributed Resource Scheduler (DRS) into separate pools of resources or used to increase availability by enabling HA.
The HA architecture introduces the concept of master and slave HA agents. Except during network partitions, which are discussed later, there is only one master HA agent in a cluster. Any agent can serve as a master, and all others are considered its slaves. A master agent is in charge of monitoring the health of virtual machines for which it is responsible and restarting any that fail. The slaves are responsible for forwarding information to the master agent and restarting any virtual machines at the direction of the master. The HA agent, regardless of its role as master or slave, also implements the VM/App Monitoring feature, which allows it to restart the virtual machine in the case of an operating system failure or restart services in the case of an application failure.
Master Agent
As stated, one of the primary tasks of the master is to keep track of the state of the virtual machines it is responsible for and to take action when appropriate. In a normal situation there is only a single master in a cluster. We will discuss the scenario where multiple masters can exist in a single cluster in one of the following sections, but for now let’s talk about a cluster with a single master. A master will claim responsibility for a virtual machine by taking “ownership” of the datastore on which the virtual machine’s configuration file is stored.
Basic design principle:
To maximize the chance of restarting virtual machines after a failure we recommend masking datastores on a cluster basis. Although sharing of datastores across clusters will work, it will increase complexity from an administrative perspective.
That is not all, of course. The HA master is also responsible for exchanging state information with vCenter. This means that it will not only receive but also send information to vCenter when required. The HA master is also the host that initiates the restart of virtual machines when a host has failed. You may immediately want to ask what happens when the master is the one that fails, or, more generically, which of the hosts can become the master and when is it elected?
Election
A master is elected by a set of HA agents whenever the agents are not in network contact with a master. A master election thus occurs when HA is first enabled on a cluster and when the host on which the master is running:
- fails,
- becomes network partitioned or isolated,
- is disconnected from vCenter Server,
- is put into maintenance or standby mode,
- or when HA is reconfigured on the host.
The HA master election takes approximately 15 seconds and is conducted using UDP. While HA won’t react to failures during the election, once a master is elected, failures detected before and during the election will be handled. The election process is simple but robust. The host that is participating in the election with the greatest number of connected datastores will be elected master. If two or more hosts have the same number of datastores connected, the one with the highest Managed Object Id will be chosen. This, however, is done lexically, meaning that 99 beats 100 as 9 is larger than 1. For each host, the HA State of the host will be shown on the Summary tab. This includes the role as depicted in the screenshot below where the host is a master host.
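The tie-break is easy to misread, so here is a small illustrative Python snippet (host names and MOIDs are invented) showing why a host with MOID 99 beats one with MOID 100 when both see the same number of datastores.

```python
# Host names and MOIDs below are invented for illustration.
hosts = [
    {"name": "esxi01", "datastores": 6, "moid": "100"},
    {"name": "esxi02", "datastores": 6, "moid": "99"},
    {"name": "esxi03", "datastores": 5, "moid": "101"},
]

# Most connected datastores wins; ties are broken by the lexically
# highest MOID, so "99" beats "100" because "9" > "1".
winner = max(hosts, key=lambda h: (h["datastores"], h["moid"]))
print("Elected master:", winner["name"])  # esxi02
```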
After a master is elected, each slave that has management network connectivity with it will setup a single secure, encrypted, TCP connection to the master. This secure connection is SSL-based. One thing to stress here though is that slaves do not communicate with each other after the master has been elected unless a re-election of the master needs to take place.
Figure 7 - Master Agent
As stated earlier, when a master is elected it will try to acquire ownership of all of the datastores it can directly access or access by proxying requests to one of the slaves connected to it using the management network. For regular storage architectures it does this by locking a file called “protectedlist” that is stored on the datastores in an existing cluster. The master will also attempt to take ownership of any datastores it discovers along the way, and it will periodically retry any it could not take ownership of previously.
The naming format and location of this file is as follows:
/<root of datastore>/.vSphere-HA/<cluster-specific-directory>/protectedlist
For those wondering, the <cluster-specific-directory> name is derived from the UUID of the vCenter Server, the number part of the cluster’s Managed Object ID, a random 8-character string and the name of the host running vCenter Server.
The master uses this protectedlist file to store the inventory and keeps track of which virtual machines are protected by HA. Calling it an inventory might be slightly overstating it: it is a list of protected virtual machines that also includes information about each virtual machine's CPU reservation and memory overhead. The master distributes this inventory across all datastores in use by the virtual machines in the cluster. The next screenshot shows an example of this file on one of the datastores.
Figure 8 - Protectedlist file
Now that we know the master locks a file on the datastore and that this file stores inventory details, what happens when the master is isolated or fails? If the master fails, the answer is simple: the lock will expire and the new master will relock the file if the datastore is accessible to it.
In the case of isolation, this scenario is slightly different, although the result is similar. The master will release the lock it has on the file on the datastore to ensure that when a new master is elected it can determine the set of virtual machines that are protected by HA by reading the file. If, by any chance, a master should fail right at the moment that it became isolated, the restart of the virtual machines will be delayed until a new master has been elected. In a scenario like this, accuracy and the fact that virtual machines are restarted is more important than a short delay.
Let’s assume for a second that your master has just failed. What will happen and how do the slaves know that the master has failed? HA uses a point-to-point network heartbeat mechanism. If the slaves have received no network heartbeats from the master, the slaves will try to elect a new master. This new master will read the required information and will initiate the restart of the virtual machines within roughly 10 seconds.
Restarting virtual machines is not the only responsibility of the master. It is also responsible for monitoring the state of the slave hosts and reporting this state to vCenter Server. If a slave fails or becomes isolated from the management network, the master will determine which virtual machines must be restarted. When virtual machines need to be restarted, the master is also responsible for determining the placement of those virtual machines. It uses a placement engine that will try to distribute the virtual machines to be restarted evenly across all available hosts.
All of these responsibilities are really important, but without a mechanism to detect a slave has failed, the master would be useless. Just like the slaves receive heartbeats from the master, the master receives heartbeats from the slaves so it knows they are alive.
Slaves
A slave has substantially fewer responsibilities than a master: a slave monitors the state of the virtual machines it is running and informs the master about any changes to this state.
The slave also monitors the health of the master by monitoring heartbeats. If the master becomes unavailable, the slaves initiate and participate in the election process. Last but not least, the slaves send heartbeats to the master so that the master can detect outages. Like the master to slave communication, all slave to master communication is point to point. HA does not use multicast.
Files for both Slave and Master
Before explaining the details it is important to understand that both Virtual SAN and Virtual Volumes have introduced changes to the location and the usage of files. For specifics on these two different storage architectures we refer you to those respective sections in the book.
Both the master and slave use files not only to store state, but also as a communication mechanism. We’ve already seen the protectedlist file (Figure 8) used by the master to store the list of protected virtual machines. We will now discuss the files that are created by both the master and the slaves. Remote files are files stored on a shared datastore and local files are files that are stored in a location only directly accessible to that host.
Remote Files
The set of powered on virtual machines is stored in a per-host “poweron” file. It should be noted that, because a master also hosts virtual machines, it also creates a “poweron” file.
The naming scheme for this file is as follows:
host-<number>-poweron
Tracking virtual machine power-on state is not the only thing the “poweron” file is used for. This file is also used by a slave to inform the master that it is isolated from the management network: the top line of the file will contain either a 0 or a 1. A 0 (zero) means not isolated and a 1 (one) means isolated. The master will inform vCenter about the isolation of the host.
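As a simple illustration, the sketch below reads a host’s “poweron” file from a datastore mount and reports the isolation flag. The path and directory name are placeholders, and the assumption that the remaining lines describe the powered-on virtual machines follows from the description above rather than from any published file-format specification.

```python
# Placeholder path -- the datastore name and cluster directory will differ
# in your environment.
poweron_path = "/vmfs/volumes/datastore01/.vSphere-HA/FDM-cluster-dir/host-123-poweron"

with open(poweron_path) as f:
    lines = f.read().splitlines()

# Top line: 0 = not isolated, 1 = isolated. The rest of the file describes
# the powered-on virtual machines (assumed from the description above).
isolated = lines[0].strip() == "1"
print("isolated:", isolated, "| remaining entries:", len(lines) - 1)
```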
Local Files
As mentioned before, when HA is configured on a host, the host will store specific information about its cluster locally.
Figure 10 - Locally stored files
Each host, including the master, will store data locally. The locally stored data is important state information, namely the VM-to-host compatibility matrix, the cluster configuration, and the host membership list. Updates to this information are sent to the master by vCenter and propagated by the master to the slaves. Although we expect that most of you will never touch these files – and we highly recommend against modifying them – we do want to explain how they are used:
- clusterconfig This file is not human-readable. It contains the configuration details of the cluster.
- vmmetadata This file is not human-readable. It contains the actual compatibility info matrix for every HA protected virtual machine and lists all the hosts with which it is compatible plus a vm/host dictionary
- fdm.cfg This file contains the configuration settings around logging. For instance, the level of logging and syslog details are stored in here.
- hostlist A list of hosts participating in the cluster, including hostname, IP addresses, MAC addresses and heartbeat datastores.
Heartbeating
We mentioned it a couple of times already in this chapter, and it is an important mechanism that deserves its own section: heartbeating. Heartbeating is the mechanism used by HA to validate whether a host is alive. HA has two different heartbeat mechanisms, which allow it to determine what has happened to a host when it is no longer responding. Let’s discuss traditional network heartbeating first.
Network Heartbeating
Network heartbeating is used by HA to determine if an ESXi host is alive. Each slave sends a heartbeat to its master and the master sends a heartbeat to each of the slaves; this is point-to-point communication. These heartbeats are sent by default every second.
When a slave isn’t receiving any heartbeats from the master, it will try to determine whether it is Isolated – we will discuss “states” in more detail later on in this chapter.
Basic design principle: Network heartbeating is key for determining the state of a host. Ensure the management network is highly resilient to enable proper state determination.
Datastore Heartbeating
Datastore heartbeating adds an extra level of resiliency and prevents unnecessary restart attempts from occurring as it allows vSphere HA to determine whether a host is isolated from the network or is completely unavailable. How does this work?
Datastore heartbeating enables a master to more accurately determine the state of a host that is not reachable via the management network. The datastore heartbeat mechanism is used when the master has lost network connectivity with the slaves; it is then used to validate whether a host has failed or is merely isolated/network partitioned. Isolation will be validated through the “poweron” file which, as mentioned earlier, will be updated by the host when it is isolated. Without the “poweron” file, there is no way for the master to validate isolation. Let that be clear! Based on the results of checks of both files, the master will determine the appropriate action to take. If the master determines that a host has failed (no datastore heartbeats), the master will restart the failed host’s virtual machines. If the master determines that the slave is Isolated or Partitioned, it will only take action when appropriate, meaning that the master will only initiate restarts when virtual machines are down or have been powered off / shut down by a triggered isolation response.
By default, HA selects 2 heartbeat datastores – it will select datastores that are available on all hosts, or as many as possible. Although it is possible to configure an advanced setting (das.heartbeatDsPerHost) to allow for more datastores for datastore heartbeating we do not recommend configuring this option as the default should be sufficient for most scenarios, except for stretched cluster environments where it is recommended to have two in each site manually selected.
The selection process gives preference to VMFS over NFS datastores, and seeks to choose datastores that are backed by different LUNs or NFS servers when possible. If desired, you can also select the heartbeat datastores yourself. We, however, recommend letting vCenter deal with this operational “burden,” as vCenter uses a selection algorithm to pick heartbeat datastores that are presented to all hosts. This is, however, not a guarantee that vCenter can select datastores which are connected to all hosts. It should also be noted that vCenter is not site-aware. In scenarios where hosts are geographically dispersed, it is recommended to manually select heartbeat datastores to ensure each site has at least one site-local heartbeat datastore.
Basic design principle: In a metro-cluster / geographically dispersed cluster we recommend setting the minimum number of heartbeat datastores to four. It is recommended to manually select site local datastores, two for each site.
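A hedged pyVmomi sketch of applying that recommendation, raising das.heartbeatDsPerHost to four and pinning the selection to site-local datastores, could look like this. The cluster and datastore objects are assumed to be looked up already, and the property names (hBDatastoreCandidatePolicy, heartbeatDatastore) reflect my reading of the ClusterDasConfigInfo API.

```python
from pyVmomi import vim

# Assumes `cluster` and the four site-local datastore objects have already
# been looked up. Property names reflect my reading of ClusterDasConfigInfo.
das = vim.cluster.DasConfigInfo(
    option=[vim.option.OptionValue(key="das.heartbeatDsPerHost", value="4")],
    hBDatastoreCandidatePolicy="userSelectedDs",
    heartbeatDatastore=[ds_site_a_1, ds_site_a_2, ds_site_b_1, ds_site_b_2])

spec = vim.cluster.ConfigSpecEx(dasConfig=das)
cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```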
The question now arises: what, exactly, is this datastore heartbeating and which datastore is used for this heartbeating? Let’s answer which datastore is used for datastore heartbeating first as we can simply show that with a screenshot, see below. vSphere displays extensive details around the “Cluster Status” on the Cluster’s Monitor tab. This for instance shows you which datastores are being used for heartbeating and which hosts are using which specific datastore(s). In addition, it displays how many virtual machines are protected and how many hosts are connected to the master.
Figure 12 - Validating
In block based storage environments HA leverages an existing VMFS file system mechanism. The datastore heartbeat mechanism uses a so called “heartbeat region” which is updated as long as the file is open. On VMFS datastores, HA will simply check whether the heartbeat region has been updated. In order to update a datastore heartbeat region, a host needs to have at least one open file on the volume. HA ensures there is at least one file open on this volume by creating a file specifically for datastore heartbeating. In other words, a per-host file is created on the designated heartbeating datastores, as shown below. The naming scheme for this file is as follows:
host-<number>-hb
Figure 13 - Datastores used
On NFS datastores, each host will write to its heartbeat file once every 5 seconds, ensuring that the master will be able to check host state. The master will simply validate this by checking that the time-stamp of the file changed.
Realize that in the case of a converged network environment, the effectiveness of datastore heartbeating will vary depending on the type of failure. For instance, a NIC failure could impact both network and datastore heartbeating. If, for whatever reason, the datastore or NFS share becomes unavailable or is removed from the cluster, HA will detect this and select a new datastore or NFS share to use for the heartbeating mechanism.
Basic design principle
Datastore heartbeating adds a new level of resiliency but is not the be-all end-all. In converged networking environments, the use of datastore heartbeating adds little value due to the fact that a NIC failure may result in both the network and storage becoming unavailable.
Isolated versus Partitioned
We’ve already briefly touched on it, and it is time to take a closer look. When it comes to network failures, there are two different states that can exist. What are these exactly, and when is a host Partitioned rather than Isolated? Before we explain this, we want to point out that there is a difference between the state as reported by the master and the state as observed by an administrator, and that each has its own characteristics.
First, consider the administrator’s perspective. Two hosts are considered partitioned if they are operational but cannot reach each other over the management network. Further, a host is isolated if it does not observe any HA management traffic on the management network and it can’t ping the configured isolation addresses. It is possible for multiple hosts to be isolated at the same time. We call a set of hosts that are partitioned but can communicate with each other a “management network partition”. Network partitions involving more than two partitions are possible but not likely.
Now, consider the HA perspective. When any HA agent is not in network contact with a master, they will elect a new master. So, when a network partition exists, a master election will occur so that a host failure or network isolation within this partition will result in appropriate action on the impacted virtual machine(s). The screenshot below shows possible ways in which an Isolation or a Partition can occur.
If a cluster is partitioned in multiple segments, each partition will elect its own master, meaning that if you have 4 partitions your cluster will have 4 masters. When the network partition is corrected, one of the four masters will take over the role and be responsible for the cluster again. This will be done using the election algorithm (most connected datastores, highest lexical number). It should be noted that a master could claim responsibility for a virtual machine that lives in a different partition. If this occurs and the virtual machine happens to fail, the master will be notified through the datastore communication mechanism.
In the HA architecture, whether a host is partitioned is determined by the master reporting the condition. So, in the above example, the master on host ESXi-01 will report ESXi-03 and ESXi-04 partitioned while the master on host ESXi-03 will report ESXi-01 and ESXi-02 partitioned. When a partition occurs, vCenter reports the perspective of one master.
A master reports a host as partitioned or isolated when it can’t communicate with the host over the management network but it can observe the host’s datastore heartbeats via the heartbeat datastores. The master cannot on its own differentiate between these two states – a host is reported as isolated only if the host informs the master via the datastores that it is isolated.
This still leaves open the question of how the master differentiates between a Failed, Partitioned, or Isolated host.
When the master stops receiving network heartbeats from a slave, it will check for host “liveness” for the next 15 seconds. Before the host is declared failed, the master will validate whether it has actually failed by doing additional liveness checks. First, the master will validate whether the host is still heartbeating to the datastore. Second, the master will ping the management IP address of the host. If both checks are negative, the host will be declared Failed. This doesn’t necessarily mean the host has PSOD’ed; it could be that the network is unavailable, including the storage network, which would make the host Isolated from an administrator’s perspective but Failed from an HA perspective. As you can imagine, there are various combinations possible. The following table depicts these combinations, including the resulting “state”.
State | Network Heartbeat | Storage Heartbeat | Host Liveness Ping | Isolation Criteria Met |
---|---|---|---|---|
Running | Yes | N/A | N/A | N/A |
Isolated | No | Yes | No | Yes |
Partitioned | No | Yes | No | No |
Failed | No | No | No | N/A |
FDM Agent Down | N/A | N/A | Yes | N/A |
HA will trigger an action based on the state of the host. When the host is marked as Failed, a restart of the virtual machines will be initiated. When the host is marked as Isolated, the master might initiate the restarts.
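The decision logic in the table can be summarized in a few lines of illustrative Python; this is only a restatement of the table, not FDM’s actual code.

```python
# Restatement of the table above, not FDM's actual code.
def classify_host(network_hb: bool, datastore_hb: bool,
                  liveness_ping: bool, isolation_flag: bool) -> str:
    if network_hb:
        return "Running"
    if liveness_ping:
        return "FDM Agent Down"
    if datastore_hb:
        return "Isolated" if isolation_flag else "Partitioned"
    return "Failed"

print(classify_host(False, True, False, True))    # Isolated
print(classify_host(False, True, False, False))   # Partitioned
print(classify_host(False, False, False, False))  # Failed
```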
The one thing to keep in mind when it comes to isolation response is that a virtual machine will only be shut down or powered off when the isolated host knows there is a master out there that has taken ownership for the virtual machine or when the isolated host loses access to the home datastore of the virtual machine.
For example, if a host is isolated and runs two virtual machines, stored on separate datastores, the host will validate if it can access each of the home datastores of those virtual machines. If it can, the host will validate whether a master owns these datastores. If no master owns the datastores, the isolation response will not be triggered and restarts will not be initiated. If the host does not have access to the datastore, for instance, during an “All Paths Down” condition, HA will trigger the isolation response to ensure the “original” virtual machine is powered down and will be safely restarted. This to avoid so-called “split-brain” scenarios.
To reiterate, as this is a very important aspect of HA and how it handles network isolations, the remaining hosts in the cluster will only be requested to restart virtual machines when the master has detected that either the host has failed or has become isolated and the isolation response was triggered.
Virtual Machine Protection
Virtual machine protection happens on several layers but is ultimately the responsibility of vCenter. We have explained this briefly but want to expand on it a bit more to make sure everyone understands the dependency on vCenter when it comes to protecting virtual machines. We do want to stress that this only applies to protecting virtual machines; virtual machine restarts in no way require vCenter to be available at the time.
When the state of a virtual machine changes, vCenter will direct the master to enable or disable HA protection for that virtual machine. Protection, however, is only guaranteed when the master has committed the change of state to disk. The reason for this, of course, is that a failure of the master would result in the loss of any state changes that exist only in memory. As pointed out earlier, this state is distributed across the datastores and stored in the “protectedlist” file.
When the power state change of a virtual machine has been committed to disk, the master will inform vCenter Server so that the change in status is visible both for the user in vCenter and for other processes like monitoring tools.
To clarify the process, we have created a workflow diagram of the protection of a virtual machine from the point it is powered on through vCenter:
Figure 15 - Virtual Machine protection workflow
But what about “unprotection?” When a virtual machine is powered off, it must be removed from the protectedlist. We have documented this workflow in the following diagram for the situation where the power off is invoked from vCenter.
Figure 16 - Virtual Machine Unprotection workflow
Restarting Virtual Machines
In the previous chapter, we have described most of the lower level fundamental concepts of HA. We have shown you that multiple mechanisms increase resiliency and reliability of HA. Reliability of HA in this case mostly refers to restarting (or resetting) virtual machines, as that remains HA’s primary task.
HA will respond when the state of a host has changed, or, better said, when the state of one or more virtual machines has changed. There are multiple scenarios in which HA will respond to a virtual machine failure, the most common of which are listed below:
- Failed host
- Isolated host
- Failed guest operating system
Depending on the type of failure, but also depending on the role of the host, the process will differ slightly. Changing the process results in slightly different recovery timelines. There are many different scenarios and there is no point in covering all of them, so we will try to describe the most common scenario and include timelines where possible.
Before we dive into the different failure scenarios, we want to explain how restart priority and retries work.
Restart Priority and Order
HA can take the configured priority of the virtual machine into account when restarting VMs. However, it is good to know that Agent VMs take precedence during the restart procedure as the “regular” virtual machines may rely on them. A good example of an agent virtual machine is a virtual storage appliance.
Prioritization is done by each host and not globally. Each host that has been requested to initiate restart attempts will attempt to restart all top priority virtual machines before attempting to start any other virtual machines. If the restart of a top priority virtual machine fails, it will be retried after a delay. In the meantime, however, HA will continue powering on the remaining virtual machines. Keep in mind that some virtual machines might be dependent on the agent virtual machines. You should document which virtual machines are dependent on which agent virtual machines and document the process to start up these services in the right order in the case the automatic restart of an agent virtual machine fails.
Basic design principle: Virtual machines can be dependent on the availability of agent virtual machines or other virtual machines. Although HA will do its best to ensure all virtual machines are started in the correct order, this is not guaranteed. Document the proper recovery process.
Besides agent virtual machines, HA also prioritizes FT secondary machines. We have listed the full order in which virtual machines will be restarted below (a small ordering sketch follows the list):
- Agent virtual machines
- FT secondary virtual machines
- Virtual Machines configured with a restart priority of high
- Virtual Machines configured with a medium restart priority
- Virtual Machines configured with a low restart priority
It should be noted that HA will not place any virtual machines on a host if the required number of agent virtual machines are not running on the host at the time placement is done.
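The ordering sketch below, with invented virtual machine names, simply shows the bucket order described above being applied before placement.

```python
# Invented VM names; the buckets follow the order listed above.
RESTART_ORDER = {"agent": 0, "ft-secondary": 1, "high": 2, "medium": 3, "low": 4}

vms = [
    {"name": "app01", "bucket": "medium"},
    {"name": "vsa01", "bucket": "agent"},
    {"name": "db01", "bucket": "high"},
    {"name": "test01", "bucket": "low"},
]

for vm in sorted(vms, key=lambda v: RESTART_ORDER[v["bucket"]]):
    print("restart:", vm["name"])
# restart order: vsa01, db01, app01, test01
```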
Now that we have briefly touched on it, we would also like to address “restart retries” and parallelization of restarts as that more or less dictates how long it could take before all virtual machines of a failed or isolated host are restarted.
Restart Retries
The number of retries is configurable as of vCenter 2.5 U4 with the advanced option “das.maxvmrestartcount”. The default value is 5. Note that the initial restart is included.
HA will try to start the virtual machine on one of your hosts in the affected cluster; if this is unsuccessful on that host, the restart count will be increased by 1. Before we go into the exact timeline, let it be clear that T0 is the point at which the master initiates the first restart attempt. This by itself could be 30 seconds after the virtual machine has failed. The elapsed time between the failure of the virtual machine and the restart, though, will depend on the scenario of the failure, which we will discuss in this chapter.
As said, the default number of restarts is 5. There are specific times associated with each of these attempts. The following bullet list will clarify this concept. The ‘m’ stands for “minutes” in this list.
- T0 – Initial Restart
- T2m – Restart retry 1
- T6m – Restart retry 2
- T14m – Restart retry 3
- T30m – Restart retry 4
As the timeline above shows, a successful power-on attempt could take up to ~30 minutes in the case where multiple power-on attempts are unsuccessful. This is, however, not an exact science. For instance, there is a 2-minute waiting period between the initial restart and the first restart retry. HA will start the 2-minute wait as soon as it has detected that the initial attempt has failed, so in reality T2 could be T2 plus 8 seconds. Another important fact that we want to emphasize is that there is no coordination between masters, so if multiple masters are involved in trying to restart the virtual machine, each will retain its own sequence. Multiple masters could attempt to restart a virtual machine; although only one will succeed, it might change some of the timelines.
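The retry schedule follows a simple doubling pattern, which the short Python helper below reproduces; das.maxvmrestartcount caps the total number of attempts, including the initial restart.

```python
# The wait between attempts doubles, starting at 2 minutes;
# das.maxvmrestartcount (default 5) caps the total number of attempts,
# including the initial restart.
def restart_schedule(max_attempts: int = 5) -> list:
    times, elapsed, wait = [0], 0, 2  # minutes
    for _ in range(max_attempts - 1):
        elapsed += wait
        times.append(elapsed)
        wait *= 2
    return times

print(restart_schedule())  # [0, 2, 6, 14, 30]
```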
What about VMs which are “disabled” for HA or VMs that are powered off? What will happen with those VMs? Before vSphere 6.0 those VMs would be left alone; as of vSphere 6.0 these VMs will be registered on another host after a failure. This allows you to easily power on that VM when needed without having to manually re-register it yourself. Note that HA will not power on the VM, it will just register it for you! (Note that a bug in vSphere 6.0 U2 prevents this from happening, and you need vSphere 6.0 U3 for this functionality to work.)
Let’s give an example to clarify the scenario in which a master fails during a restart sequence:
Cluster: 4 Host (esxi01, esxi02, esxi03, esxi04)
Master: esxi01
The host “esxi02” is running a single virtual machine called “vm01” and it fails. The master, esxi01, will try to restart it but the attempt fails. It will try restarting “vm01” up to 5 times but, unfortunately, on the 4th try, the master also fails. An election occurs and “esxi03” becomes the new master. It will now initiate the restart of “vm01”, and if that restart fails it will retry up to 4 more times, for a total of 5 attempts including its own initial restart.
Be aware, though, that a successful restart might never occur if the restart count is reached and all five restart attempts (the default value) were unsuccessful.
When it comes to restarts, one thing that is very important to realize is that HA will not issue more than 32 concurrent power-on tasks on a given host. To make that more clear, let’s use the example of a two host cluster: if a host fails which contained 33 virtual machines and all of these had the same restart priority, 32 power on attempts would be initiated. The 33rd power on attempt will only be initiated when one of those 32 attempts has completed regardless of success or failure of one of those attempts.
Now, here comes the gotcha. If there are 32 low-priority virtual machines to be powered on and a single high-priority virtual machine, the power on attempt for the low-priority virtual machines will not be issued until the power on attempt for the high priority virtual machine has completed. Let it be absolutely clear that HA does not wait to restart the low-priority virtual machines until the high-priority virtual machines are started, it waits for the issued power on attempt to be reported as “completed”. In theory, this means that if the power on attempt fails, the low-priority virtual machines could be powered on before the high priority virtual machine.
The restart priority however does guarantee that when a placement is done, the higher priority virtual machines get first right to any available resources.
Basic design principle: Configuring restart priority of a virtual machine is not a guarantee that virtual machines will actually be restarted in this order. Ensure proper operational procedures are in place for restarting services or virtual machines in the appropriate order in the event of a failure.
Now that we know how virtual machine restart priority and restart retries are handled, it is time to look at the different scenarios.
- Failed host
- Failure of a master
- Failure of a slave
- Isolated host and response
Failed Host
When discussing a failed host scenario, a distinction needs to be made between the failure of a master and the failure of a slave. We want to emphasize this because the time it takes before a restart attempt is initiated differs between these two scenarios. Although the majority of you probably won’t notice the time difference, it is important to call out. Let’s start with the most common failure, that of a host failing, but note that failures generally occur infrequently. In most environments, hardware failures are very uncommon to begin with. Just in case it happens, it doesn’t hurt to understand the process and its associated timelines.
The Failure of a Slave
The failure of a slave host is a fairly complex scenario. Part of this complexity comes from the introduction of a new heartbeat mechanism. Actually, there are two different scenarios: one where heartbeat datastores are configured and one where heartbeat datastores are not configured. Keeping in mind that this is an actual failure of the host, the timeline is as follows:
- T0 – Slave failure.
- T3s – Master begins monitoring datastore heartbeats for 15 seconds.
- T10s – The host is declared unreachable and the master will ping the management network of the failed host. This is a continuous ping for 5 seconds.
- T15s – If no heartbeat datastores are configured, the host will be declared dead.
- T18s – If heartbeat datastores are configured, the host will be declared dead.
The master monitors the network heartbeats of a slave. When the slave fails, these heartbeats will no longer be received by the master. We have defined this as T0. After 3 seconds (T3s), the master will start monitoring for datastore heartbeats and it will do this for 15 seconds. On the 10th second (T10s), when no network or datastore heartbeats have been detected, the host will be declared as “unreachable”. The master will also start pinging the management network of the failed host at the 10th second and it will do so for 5 seconds. If no heartbeat datastores were configured, the host will be declared “dead” at the 15th second (T15s) and virtual machine restarts will be initiated by the master. If heartbeat datastores have been configured, the host will be declared dead at the 18th second (T18s) and restarts will be initiated. We realize that this can be confusing and hope the timeline depicted in the diagram below makes it easier to digest.
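For quick reference, the detection timeline above can be captured as a small table in Python; the values are simply those listed in the bullets.

```python
# Values taken directly from the bullets above (seconds after the failure).
SLAVE_FAILURE_TIMELINE = [
    (0, "Slave stops sending network heartbeats"),
    (3, "Master starts monitoring datastore heartbeats (for 15 seconds)"),
    (10, "Host declared unreachable; master pings its management IP for 5 seconds"),
    (15, "Host declared dead if no heartbeat datastores are configured"),
    (18, "Host declared dead if heartbeat datastores are configured"),
]

for seconds, event in SLAVE_FAILURE_TIMELINE:
    print(f"T+{seconds:>2}s  {event}")
```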
The master filters the virtual machines it thinks have failed before initiating restarts, using the protectedlist. The on-disk state can be obtained by only one master at a time, since it requires opening the protectedlist file in exclusive mode. If there is a network partition, multiple masters could try to restart the same virtual machine, as vCenter Server also provides the necessary details for a restart. As an example, one master may have locked a virtual machine’s home datastore and have access to the protectedlist, while the other master is in contact with vCenter Server and as such is aware of the current desired protected state. In this scenario, the master which does not own the home datastore of the virtual machine could restart the virtual machine based on the information provided by vCenter Server.
This change in behavior was introduced to avoid the scenario where a restart of a virtual machine would fail due to insufficient resources in the partition which was responsible for the virtual machine. With this change, there is less chance of such a situation occurring as the master in the other partition would be using the information provided by vCenter Server to initiate the restart.
That leaves us with the question of what happens in the case of the failure of a master.
The Failure of a Master
In the case of a master failure, the process and the associated timeline are slightly different. The reason being that there needs to be a master before any restart can be initiated. This means that an election will need to take place amongst the slaves. The timeline is as follows:
- T0 – Master failure.
- T10s – Master election process initiated.
- T25s – New master elected and reads the protectedlist.
- T35s – New master initiates restarts for all virtual machines on the protectedlist which are not running.
Slaves receive network heartbeats from their master. If the master fails, let’s define this as T0 (T zero), the slaves detect this when the network heartbeats cease to be received. As every cluster needs a master, the slaves will initiate an election at T10s. The election process takes 15s to complete, which brings us to T25s. At T25s, the new master reads the protectedlist. This list contains all the virtual machines, which are protected by HA. At T35s, the master initiates the restart of all virtual machines that are protected but not currently running. The timeline depicted in the diagram below hopefully clarifies the process.
Figure 19 - Restart timeline master failure
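The two host-failure timelines above differ only in a handful of fixed offsets. The following sketch is simply our summary of the timings described in this section (it is not code from the product) and returns the point at which restarts would be initiated for each scenario:

```python
# Conceptual summary of the detection timelines described above (seconds after T0).
def restart_initiation_time(failed_role, heartbeat_datastores_configured=True):
    if failed_role == "slave":
        # Master declares the slave dead at T15s (no heartbeat datastores)
        # or T18s (heartbeat datastores configured) and initiates restarts.
        return 18 if heartbeat_datastores_configured else 15
    if failed_role == "master":
        # T10s: election starts, T25s: new master elected and reads the
        # protectedlist, T35s: restarts are initiated.
        return 35
    raise ValueError("failed_role must be 'slave' or 'master'")

print(restart_initiation_time("slave", heartbeat_datastores_configured=False))  # 15
print(restart_initiation_time("slave"))                                          # 18
print(restart_initiation_time("master"))                                         # 35
```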
Besides the failure of a host, there is another reason for restarting virtual machines: an isolation event.
Isolation Response and Detection
Before we will discuss the timeline and the process around the restart of virtual machines after an isolation event, we will discuss Isolation Response and Isolation Detection. One of the first decisions that will need to be made when configuring HA is the “Isolation Response”.
Isolation Response
The Isolation Response (or Host Isolation as it is called in vSphere 6.0) refers to the action that HA takes for its virtual machines when the host has lost its connection with the network and the remaining nodes in the cluster. This does not necessarily mean that the whole network is down; it could just be the management network ports of this specific host. Today there are two isolation responses: “Power off” and “Shut down”. In previous versions (pre vSphere 6.0) there was also an isolation response called "leave powered on"; this has been renamed to "Disabled", as "leave powered on" means that there is no response to an isolation event.
The isolation response feature answers the question, “what should a host do with the virtual machines it manages when it detects that it is isolated from the network?” Let’s discuss these three options more in-depth:
- Disabled (default) – When isolation occurs on the host, the state of the virtual machines remains unchanged.
- Power off – When isolation occurs, all virtual machines are powered off. It is a hard stop, or to put it bluntly, the “virtual” power cable of the virtual machine will be pulled out!
- Shut down – When isolation occurs, all virtual machines running on the host will be shut down using a guest-initiated shutdown through VMware Tools. If this is not successful within 5 minutes, a “power off” will be executed. This time out value can be adjusted by setting the advanced option das.isolationShutdownTimeout. If VMware Tools is not installed, a “power off” will be initiated immediately.
This setting can be changed on the cluster settings under virtual machine options.
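As a quick illustration of the decision flow described in the list above, here is a minimal Python sketch. It is not VMware code; the function and VM name are hypothetical, and the 300-second default reflects the 5-minute das.isolationShutdownTimeout mentioned earlier.

```python
# Sketch of the isolation response decision flow described above.
def apply_isolation_response(vm, response, tools_installed,
                             shutdown_timeout=300):  # das.isolationShutdownTimeout default: 5 minutes
    if response == "disabled":
        return "left powered on"          # state of the virtual machine remains unchanged
    if response == "power off":
        return "powered off (hard stop)"  # the "virtual" power cable is pulled
    if response == "shut down":
        if not tools_installed:
            return "powered off immediately (no VMware Tools)"
        # Guest-initiated shutdown; fall back to a power off after the timeout.
        return f"guest shutdown attempted, power off after {shutdown_timeout}s if it hangs"
    raise ValueError(f"unknown isolation response: {response}")

print(apply_isolation_response("vm01", "shut down", tools_installed=True))
```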
The default setting for the isolation response has changed multiple times over the last couple of years and this has caused some confusion.
- Up to ESXi3.5 U2 / vCenter 2.5 U2 the default isolation response was “Power off”
- With ESXi3.5 U3 / vCenter 2.5 U3 this was changed to “Leave powered on”
- With vSphere 4.0 it was changed to “Shut down”.
- With vSphere 5.0 it has been changed to “Leave powered on”.
- With vSphere 6.0 the "leave powered on" setting is now renamed to "Disabled".
Keep in mind that these changes are only applicable to newly created clusters. When creating a new cluster, it may be required to change the default isolation response based on the configuration of existing clusters and/or your customer’s requirements, constraints and expectations. When upgrading an existing cluster, it might be wise to apply the latest default values. You might wonder why the default has changed once again. There was a lot of feedback from customers that “Disabled” was the desired default value.
Basic design principle: Before upgrading an environment to later versions, ensure you validate the best practices and default settings. Document them, including justification, to ensure all people involved understand your reasons.
The question remains, which setting should be used? The obvious answer applies here: it depends. We prefer “Disabled” because it eliminates the chance of a false positive and its associated down time. One of the problems that people have experienced in the past is that HA triggered its isolation response when the full management network went down, resulting in the power off (or shutdown) of every single virtual machine and none being restarted. This problem has been mitigated. HA will validate whether virtual machine restarts can be attempted – there is no reason to incur any down time unless absolutely necessary. It does this by validating that a master owns the datastore the virtual machine is stored on. Of course, the isolated host can only validate this if it has access to the datastores. In a converged network environment with iSCSI storage, for instance, it would be impossible to validate this during a full isolation as the validation would fail due to the inaccessible datastore from the perspective of the isolated host.
We feel that changing the isolation response is most useful in environments where a failure of the management network is likely correlated with a failure of the virtual machine network(s). If the failure of the management network won’t likely correspond with the failure of the virtual machine networks, isolation response would cause unnecessary downtime as the virtual machines can continue to run without management network connectivity to the host.
A second use for power off/shutdown is in scenarios where the virtual machine retains access to the virtual machine network but loses access to its storage; leaving the virtual machine powered on could result in two virtual machines on the network with the same IP address.
It is still difficult to decide which isolation response should be used. The following table was created to provide some more guidelines.
Likelihood that host will retain access to VM datastore | Likelihood VMs will retain access to VM network | Recommended Isolation Policy | Rationale |
---|---|---|---|
Likely | Likely | Disabled | Virtual machine is running fine, no reason to power it off |
Likely | Unlikely | Either Disabled or Shut down | Choose Shut down to allow HA to restart virtual machines on hosts that are not isolated and hence are likely to have access to storage |
Unlikely | Likely | Power Off | Use Power Off to avoid having two instances of the same virtual machine on the virtual machine network |
Unlikely | Unlikely | Disabled or Power Off | Disabled if the virtual machine can recover from the network/datastore outage if it is not restarted because of the isolation, and Power Off if it likely can’t. |
The question that we haven’t answered yet is how HA knows which virtual machines have been powered-off due to the triggered isolation response and why the isolation response is more reliable than with previous versions of HA. Previously, HA did not care and would always try to restart the virtual machines according to the last known state of the host. That is no longer the case. Before the isolation response is triggered, the isolated host will verify whether a master is responsible for the virtual machine.
As mentioned earlier, it does this by validating if a master owns the home datastore of the virtual machine. When isolation response is triggered, the isolated host removes the virtual machines which are powered off or shutdown from the “poweron” file. The master will recognize that the virtual machines have disappeared and initiate a restart. On top of that, when the isolation response is triggered, it will create a per-virtual machine file under a “poweredoff” directory which indicates for the master that this virtual machine was powered down as a result of a triggered isolation response. This information will be read by the master node when it initiates the restart attempt in order to guarantee that only virtual machines that were powered off / shut down by HA will be restarted by HA.
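A simplified model of this bookkeeping may help. The sketch below is our own illustration (the data structures are placeholders, not the on-disk file format): the isolated host updates its poweron list and drops a per-VM marker in a poweredoff directory, and the master only restarts the VMs that HA itself powered down.

```python
# Simplified model of the poweron / poweredoff bookkeeping described above.
# Data structures are illustrative only; the real files live on the datastore.
host_poweron_list = {"vm01", "vm02", "vm03"}   # VMs the host reports as running
poweredoff_markers = set()                     # per-VM files written on isolation response

def trigger_isolation_response(vms_to_stop):
    """Isolated host powers off/shuts down VMs and records that HA did it."""
    for vm in vms_to_stop:
        host_poweron_list.discard(vm)          # VM disappears from the poweron file
        poweredoff_markers.add(vm)             # marker tells the master HA powered it off

def master_restart_candidates(previously_running):
    """Master restarts only VMs that HA powered down, not VMs an admin stopped."""
    disappeared = previously_running - host_poweron_list
    return disappeared & poweredoff_markers

before = set(host_poweron_list)
trigger_isolation_response({"vm01", "vm02"})
print(master_restart_candidates(before))       # {'vm01', 'vm02'}
```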
This is, however, only one part of the increased reliability of HA. Reliability has also been improved with respect to “isolation detection,” which will be described in the following section.
Isolation Detection
We have explained what the options are to respond to an isolation event and what happens when the selected response is triggered. However, we have not extensively discussed how isolation is detected. The mechanism is fairly straightforward and works with heartbeats, as earlier explained. There are, however, two scenarios again, and the process and associated timelines differ for each of them:
- Isolation of a slave
- Isolation of a master
Before we explain the differences in process between both scenarios, we want to make sure it is clear that a change in state will result in the isolation response not being triggered in either scenario. In other words, if a single ping is successful or the host observes election traffic and is elected a master or slave, the isolation response will not be triggered, which is exactly what you want, as avoiding down time is at least as important as recovering from down time. When a host has declared itself isolated and observes election traffic, it will declare itself no longer isolated.
Isolation of a Slave
HA triggers a master election process before it will declare a host is isolated. In the below timeline, “s” refers to seconds.
- T0 – Isolation of the host (slave)
- T10s – Slave enters “election state”
- T25s – Slave elects itself as master
- T25s – Slave pings “isolation addresses”
- T30s – Slave declares itself isolated
- T60s – Slave “triggers” isolation response
When the isolation response is triggered, HA creates a “power-off” file for any virtual machine it powers off whose home datastore is accessible. Next it powers off (or shuts down) the virtual machine and updates the host’s poweron file. The power-off file is used to record that HA powered off the virtual machine and so HA should restart it. These power-off files are deleted when a virtual machine is powered back on or HA is disabled. The screenshot below shows such a power-off file, which in this case is stored on a VVol.
Figure 22 - VVol based poweroff file
After the completion of this sequence, the master will learn the slave was isolated through the “poweron” file as mentioned earlier, and will restart virtual machines based on the information provided by the slave.
Isolation of a Master
In the case of the isolation of a master, this timeline is a bit less complicated because there is no need to go through an election process. In this timeline, “s” refers to seconds.
- T0 – Isolation of the host (master)
- T0 – Master pings “isolation addresses”
- T5s – Master declares itself isolated
- T35s – Master “triggers” isolation response
Additional Checks
Before a host declares itself isolated, it will ping the default isolation address, which is the gateway specified for the management network, and it will continue to ping the address until it becomes unisolated. HA gives you the option to define one or multiple additional isolation addresses using an advanced setting. This advanced setting is called das.isolationaddress and can be used to reduce the chances of a false positive. We recommend setting an additional isolation address. If a secondary management network is configured, it will more than likely be on a different subnet, and the additional isolation address should be part of that subnet. If required, you can configure up to 10 additional isolation addresses.
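The check itself is conceptually just “can I reach any of my isolation addresses?”. Below is a minimal sketch of that idea; the IP addresses are placeholders, the ping flags shown are the Linux variants, and this is of course not how the FDM agent is actually implemented.

```python
# Sketch of an isolation check against the configured isolation addresses.
# Addresses are placeholders; ping flags shown are the Linux variants.
import subprocess

def is_reachable(address, timeout_s=1):
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), address],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def host_appears_isolated(isolation_addresses):
    # A single successful ping is enough to conclude the host is NOT isolated.
    return not any(is_reachable(addr) for addr in isolation_addresses)

# Default gateway of the management network plus das.isolationaddress0-style extras.
addresses = ["192.168.1.1", "192.168.2.1"]
print(host_appears_isolated(addresses))
```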
Selecting an Additional Isolation Address
A question asked by many people is which address should be specified for this additional isolation verification. We generally recommend an isolation address close to the hosts to avoid too many network hops and an address that would correlate with the liveness of the virtual machine network. In many cases, the most logical choice is the physical switch to which the host is directly connected. Basically, use the gateway for whatever subnet your management network is on. Another usual suspect would be a router or any other reliable and pingable device on the same subnet. However, when you are using IP-based shared storage like NFS or iSCSI, the IP-address of the storage device can also be a good choice.
Basic design principle: Select a reliable secondary isolation address. Try to minimize the number of “hops” between the host and this address.
Isolation Policy Delay
For those who want to increase the time it takes before HA executes the isolation response, an advanced setting is available. This setting is called “das.config.fdm.isolationPolicyDelaySec” and allows changing the number of seconds to wait before the isolation policy is executed. The minimum value is 30; if set to a value less than 30, the delay will be 30 seconds. We do not recommend changing this advanced setting unless there is a specific requirement to do so. In almost all scenarios 30 seconds should suffice.
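The 30-second floor can be expressed in a single line; the snippet below just illustrates the clamping behavior described above.

```python
# das.config.fdm.isolationPolicyDelaySec: values below 30 are treated as 30.
def effective_isolation_delay(configured_seconds):
    return max(30, configured_seconds)

print(effective_isolation_delay(10))   # 30
print(effective_isolation_delay(45))   # 45
```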
Restarting Virtual Machines
The most important procedure has not yet been explained: restarting virtual machines. We have dedicated a full section to this concept.
We have explained the difference in behavior from a timing perspective for restarting virtual machines in the case of both master node and slave node failures. For now, let’s assume that a slave node has failed. When the master node declares the slave node as Partitioned or Isolated, it determines which virtual machines were running on that host using the information it previously read from the host’s “poweron” file. These files are read asynchronously, approximately every 30s. If the host was not Partitioned or Isolated before the failure, the master uses cached data to determine the virtual machines that were last running on the host before the failure occurred.
Before it will initiate the restart attempts, though, the master will first validate that the virtual machine should be restarted. This validation uses the protection information vCenter Server provides to each master, or if the master is not in contact with vCenter Server, the information saved in the protectedlist files. If the master is not in contact with vCenter Server or has not locked the file, the virtual machine is filtered out. At this point, all virtual machines having a restart priority of “disabled” are also filtered out.
Now that HA knows which virtual machines it should restart, it is time to decide where the virtual machines are placed. HA will take multiple things into account:
- CPU and memory reservation, including the memory overhead of the virtual machine
- Unreserved capacity of the hosts in the cluster
- Restart priority of the virtual machine relative to the other virtual machines that need to be restarted
- Virtual-machine-to-host compatibility set
- The number of dvPorts required by a virtual machine and the number available on the candidate hosts
- The maximum number of vCPUs and virtual machines that can be run on a given host
- Restart latency
- Whether the active hosts are running the required number of agent virtual machines.
Restart latency refers to the amount of time it takes to initiate virtual machine restarts. This means that virtual machine restarts will be distributed by the master across multiple hosts to avoid a boot storm, and thus a delay, on a single host.
If a placement is found, the master will send each target host the set of virtual machines it needs to restart. If this list exceeds 32 virtual machines, HA will limit the number of concurrent power on attempts to 32. If a virtual machine successfully powers on, the node on which the virtual machine was powered on will inform the master of the change in power state. The master will then remove the virtual machine from the restart list.
If a placement cannot be found, the master will place the virtual machine on a “pending placement list” and will retry placement of the virtual machine when one of the following conditions changes:
- A new virtual-machine-to-host compatibility list is provided by vCenter.
- A host reports that its unreserved capacity has increased.
- A host (re)joins the cluster (For instance, when a host is taken out of maintenance mode, a host is added to a cluster, etc.)
- A new failure is detected and virtual machines have to be failed over.
- A failure occurred when failing over a virtual machine.
But what about DRS? Wouldn’t DRS be able to help during the placement of virtual machines when all else fails? It does. The master node will report to vCenter the set of virtual machines that were not placed due to insufficient resources, as is the case today. If DRS is enabled, this information will be used in an attempt to have DRS make capacity available.
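To tie the placement discussion together, here is a greatly simplified placement sketch. It only models two of the criteria listed above (compatibility and unreserved capacity) plus the pending placement list; the host and VM data are illustrative, not taken from a real cluster, and this is not the actual placement algorithm.

```python
# Greatly simplified placement sketch: pick, per VM, a compatible host with
# enough unreserved capacity; otherwise park the VM on the pending placement list.
hosts = {
    "esxi-01": {"unreserved_mb": 8192},
    "esxi-02": {"unreserved_mb": 2048},
}

vms = [  # already sorted so higher restart priority is placed first
    {"name": "db01",  "required_mb": 6144, "compatible": {"esxi-01", "esxi-02"}},
    {"name": "web01", "required_mb": 4096, "compatible": {"esxi-01", "esxi-02"}},
]

pending_placement = []

for vm in vms:
    candidates = [h for h in vm["compatible"]
                  if hosts[h]["unreserved_mb"] >= vm["required_mb"]]
    if candidates:
        target = max(candidates, key=lambda h: hosts[h]["unreserved_mb"])
        hosts[target]["unreserved_mb"] -= vm["required_mb"]
        print(f"{vm['name']} -> {target}")
    else:
        pending_placement.append(vm["name"])  # retried when one of the conditions changes

print("pending:", pending_placement)
```

Note how web01 ends up on the pending placement list even though the cluster as a whole has enough free capacity: the resources are fragmented across hosts, which is exactly the situation where HA asks DRS for help.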
Component Protection
vSphere 6.0 introduces a new vSphere HA feature called VM Component Protection (VMCP). VMCP allows you to protect virtual machines against the failure of your storage system. There are two types of failures VMCP will respond to: Permanent Device Loss (PDL) and All Paths Down (APD). Before we look at some of the details, we want to point out that enabling VMCP is extremely easy. It can be enabled with a single tick box, as shown in the screenshot below.
Figure 25 - Virtual Machine Component Protection
As stated, there are two scenarios HA can respond to: PDL and APD. Let’s look at those two scenarios a bit closer. With vSphere 5.0 a feature was introduced as an advanced option that would allow vSphere HA to restart VMs impacted by a PDL condition.
A PDL condition is a condition that is communicated by the array controller to ESXi via a SCSI sense code. This condition indicates that a device (LUN) has become unavailable and is likely permanently unavailable. An example scenario in which this condition would be communicated by the array would be when a LUN is set offline. This condition is used during a failure scenario to ensure ESXi takes appropriate action when access to a LUN is revoked. It should be noted that when a full storage failure occurs it is impossible to generate the PDL condition as there is no communication possible between the array and the ESXi host. This state will be identified by the ESXi host as an APD condition.
Although the functionality itself worked as advertised, enabling and managing it was cumbersome and error prone. It was required to set the option “disk.terminateVMOnPDLDefault” manually. With vSphere 6.0 a simple option in the Web Client is introduced which allows you to specify what the response should be to a PDL sense code.
Figure 26 - Enabling Virtual Machine Component Protection
The two options provided are “Issue Events” and “Power off and restart VMs”. Note that “Power off and restart VMs” does exactly that, your VM process is killed and the VM is restarted on a host which still has access to the storage device.
Until now it was not possible for vSphere to respond to an APD scenario. APD is the situation where the storage device is inaccessible but for unknown reasons. In most cases where this occurs it is typically related to a storage network problem. With vSphere 5.1 changes were introduced to the way APD scenarios were handled by the hypervisor. This mechanism is leveraged by HA to allow for a response.
When an APD occurs, a timer starts. After 140 seconds the APD is officially declared and the device is marked as APD timeout. When the 140 seconds have passed, HA will start counting. The HA timeout is 3 minutes by default, as shown in Figure 24. When the 3 minutes have passed, HA will take the action defined. There are again two options: “Issue Events” and “Power off and restart VMs”.
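The arithmetic behind these two timers, using the default values, is simply:

```python
# APD timing arithmetic as described above (default values).
APD_TIMEOUT_S = 140          # device marked "APD timeout" after 140 seconds
VMCP_APD_DELAY_S = 3 * 60    # HA then waits a further 3 minutes by default

total = APD_TIMEOUT_S + VMCP_APD_DELAY_S
print(total, "seconds")      # 320 seconds
print(divmod(total, 60))     # (5, 20) -> roughly 5 minutes and 20 seconds
```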
You can also specify how aggressively HA needs to try to restart VMs that are impacted by an APD. Note that aggressive / conservative refers to the likelihood of HA being able to restart VMs. When set to “conservative” HA will only restart the VM that is impacted by the APD if it knows another host can restart it. In the case of “aggressive” HA will try to restart the VM even if it doesn’t know the state of the other hosts, which could lead to a situation where your VM is not restarted as there is no host that has access to the datastore the VM is located on.
It is also good to know that if the APD is lifted and access to the storage is restored during the approximately 5 minutes and 20 seconds it would take before the VM restart is initiated, HA will not do anything unless you explicitly configure it to do so. This is where the “Response for APD recovery after APD timeout” setting comes into play. If there is a desire to do so, you can have the VM restarted even when the host has recovered from the APD scenario during the 3-minute (default value) grace period.
Basic design principle: Without access to shared storage a virtual machine becomes useless. It is highly recommended to configure VMCP to act on both PDL and APD scenarios. We recommend setting both to “Power off and restart VMs” but leaving the “Response for APD recovery after APD timeout” disabled so that VMs are not rebooted unnecessarily.
vSphere HA nuggets
Prior to vSphere 5.5, HA did nothing with VM to VM Affinity or Anti-Affinity rules. Typically for people using “affinity” rules this was not an issue, but those using “anti-affinity” rules did see this as an issue. They created these rules to ensure specific virtual machines would never be running on the same host, but vSphere HA would simply ignore the rule when a failure had occurred and just place the VMs “randomly”. With vSphere 5.5 this has changed! vSphere HA is now “anti-affinity” aware. In order to ensure anti-affinity rules are respected you can set an advanced setting or, as of vSphere 6.0, configure this in the vSphere Web Client.
das.respectVmVmAntiAffinityRules - Values: "false" (default) and "true"
Note that this also means that when you have configured anti-affinity rules, have this advanced setting set to “true”, and there somehow aren’t sufficient hosts available to respect these rules, the rules will still be respected and it could result in HA not restarting a VM. Make sure you understand this potential impact when configuring this setting and these rules.
With vSphere 6.0, support for respecting VM to Host affinity rules has been included. This is enabled through the use of an advanced setting called “das.respectVmHostSoftAffinityRules”. When this advanced setting is configured, vSphere HA will try to respect the rules when it can. If there are any hosts in the cluster which belong to the same VM-Host group, then HA will restart the respective VM on one of those hosts. As this is a “should rule”, HA has the ability to ignore the rule when needed. If there is a scenario where none of the hosts in the VM-Host should rule are available, HA will restart the VM on any other host in the cluster.
das.respectVmHostSoftAffinityRules - Values: "false" (default) and "true"
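The difference between the two behaviors can be sketched as follows. This is our own illustration of the rule handling described above (the host names and data structures are hypothetical), not the actual FDM logic: strictly respecting VM-VM anti-affinity can leave no candidate host at all, while a VM-Host “should” rule simply falls back to any other host.

```python
# Sketch of the rule handling described above (not the actual FDM logic).
def candidate_hosts(all_hosts, vm, respect_anti_affinity, vm_host_group=None):
    hosts = set(all_hosts)

    # das.respectVmVmAntiAffinityRules = true: hosts already running an
    # anti-affinity peer are excluded, even if that leaves nothing.
    if respect_anti_affinity:
        hosts -= vm["hosts_running_anti_affinity_peers"]

    # das.respectVmHostSoftAffinityRules: a "should" rule, so fall back to
    # any remaining host if no member of the VM-Host group is available.
    if vm_host_group:
        preferred = hosts & vm_host_group
        if preferred:
            return preferred
    return hosts

vm = {"hosts_running_anti_affinity_peers": {"esxi-01", "esxi-02"}}
print(candidate_hosts({"esxi-01", "esxi-02"}, vm, respect_anti_affinity=True))
# set() -> with only these two hosts left, the VM would not be restarted
```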
VMware vSAN and Virtual Volumes specifics
In the last couple of sections we have discussed the ins and outs of HA, all of it based on VMFS-based or NFS-based storage. With the introduction of VMware vSAN and Virtual Volumes come changes to some of the concepts discussed.
HA and vSAN
vSAN is VMware’s approach to Software Defined Storage. We are not going to explain the ins and outs of vSAN, but do want to provide a basic understanding for those who have never done anything with it. vSAN leverages host local storage and creates a shared data store out of it.
vSAN requires a minimum of 3 hosts and each of those 3 hosts will need to have 1 SSD for caching and 1 capacity device (can be SSD or HDD). Only the capacity devices will contribute to the available capacity of the datastore. If you have 1TB worth of capacity devices per host then with three hosts the total size of your datastore will be 3TB.
Having said that, with vSAN 6.1 VMware introduced a "2-node" option. This 2-node option is actually 2 regular vSAN nodes with a third "witness" node.
The big differentiator between most storage systems and vSAN is that with vSAN the availability of a virtual machine is defined on a per virtual disk or per virtual machine basis. This is called “Failures To Tolerate” and can be configured to any value between 0 (zero) and 3. When configured to 0, the virtual machine will have only 1 copy of its virtual disks, which means that if a host fails where the virtual disks are stored, the virtual machine is lost. As such, all virtual machines are deployed by default with Failures To Tolerate (FTT) set to 1. A virtual disk is what vSAN refers to as an object. An object, when FTT is configured as 1 or higher, has multiple components. In the diagram below we demonstrate the FTT=1 scenario; the virtual disk in this case has 2 "data components" and a "witness component". The witness is used as a "quorum" mechanism.
As the diagram above depicts, a virtual machine can be running on the first host while its storage components are on the remaining hosts in the cluster. As you can imagine from an HA point of view this changes things as access to the network is not only critical for HA to function correctly but also for vSAN. When it comes to networking note that when vSAN is configured in a cluster HA will use the same network for its communications (heartbeating etc). On top of that, it is good to know that VMware highly recommends 10GbE to be used for vSAN.
Basic design principle: 10GbE is highly recommended for vSAN. As vSphere HA also leverages the vSAN network and the availability of VMs depends on network connectivity, ensure that at a minimum two 10GbE ports and two physical switches are used for resiliency.
The reason that HA uses the same network as vSAN is simple: it is to avoid network partition scenarios where HA communication is separated from vSAN and the state of the cluster is unclear. Note that you will need to ensure that there is a pingable isolation address on the vSAN network, and this isolation address will need to be configured as such through the use of the advanced setting “das.isolationAddress0”. We also recommend disabling the use of the default isolation address through the advanced setting “das.useDefaultIsolationAddress” (set to false). If you leave the isolation address set to the default gateway of the management network, then HA will use the management network to verify the isolation. There could be a scenario where only the vSAN network is isolated; in that particular situation VMs will not be powered off (or shut down) if the isolation address is not part of the vSAN network.
When an isolation does occur, the isolation response is triggered as explained in earlier chapters. For vSAN the recommendation is simple: configure the isolation response to “Power Off, then fail over”. This is the safest option. vSAN can be compared to the “converged network with IP based storage” example we provided. It is very easy to reach a situation where a host is isolated and all virtual machines remain running but are restarted on another host because the connection to the vSAN datastore is lost.
Basic design principle: Configure your Isolation Address and your Isolation Policy accordingly. We recommend selecting “power off” as the Isolation Policy and a reliable, pingable device as the isolation address.
What about things like heartbeat datastores and the folder structure that exists on a VMFS datastore; has any of that changed with vSAN? Yes it has. First of all, in a vSAN-only environment the concept of Heartbeat Datastores is not used at all. The reason for this is straightforward: as HA and vSAN share the same network, it is safe to assume that when the HA heartbeat is lost because of a network failure, so is access to the vSAN datastore. Only in an environment where there is also traditional storage will heartbeat datastores be configured, leveraging those traditional datastores as heartbeat datastores. Note that we do not feel there is a reason to introduce traditional storage just to provide HA this functionality; HA and vSAN work perfectly fine without heartbeat datastores. If you do have traditional storage, however, we recommend implementing heartbeat datastores as it can help HA with identifying the type of issue that has occurred.
Normally HA metadata is stored in the root of the datastore; for vSAN this is different, as the metadata is stored in the VM’s namespace object. The protectedlist is held in memory and updated automatically when VMs are powered on or off.
Now you may wonder, what happens when there is an isolation? How does HA know where to restart the VM that is impacted? Let’s take a look at a partition scenario.
In this scenario a network problem has caused a cluster partition. Where a VM is restarted is determined by which partition owns the virtual machine files. Within a vSAN cluster this is fairly straightforward. There are two partitions, one of which is running the VM with its VMDK, while the other partition has a VMDK replica and a witness. Guess what happens? Right, vSAN uses the witness to see which partition has quorum and, based on that result, one of the two partitions will win. In this case, Partition 2 has more than 50% of the components of this object and as such is the winner. This means that the VM will be restarted on either “esxi-03″ or “esxi-04″ by vSphere HA. Note that the VM in Partition 1 will be powered off only if you have configured the isolation response to do so. We would like to stress that this is highly recommended! (Isolation response –> power off)
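The quorum decision itself boils down to simple counting. The sketch below illustrates the “more than 50% of the components wins” rule described above, with illustrative component counts for an FTT=1 object; it is a conceptual model, not vSAN code.

```python
# Sketch of the quorum decision described above: the partition holding more
# than 50% of an object's components (data components + witness) wins.
def winning_partition(components_per_partition):
    total = sum(components_per_partition.values())
    for partition, count in components_per_partition.items():
        if count * 2 > total:          # strictly more than 50%
            return partition
    return None                        # no quorum: the object is inaccessible everywhere

# FTT=1 object: one data component in partition 1, one data component plus
# the witness in partition 2 (illustrative numbers).
print(winning_partition({"partition-1": 1, "partition-2": 2}))  # partition-2
```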
One final thing which is different for vSAN is how a partition is handled in a stretched cluster configuration. In a regular stretched cluster configuration using VMFS/NFS based storage, VMs impacted by APD or PDL will be killed by HA. With vSAN this is slightly different. Unfortunately, HA VMCP in 6.0 is not supported with vSAN; vSAN has its own mechanism. vSAN recognizes when a VM running on a group of hosts has no access to any of its components. When this is the case, vSAN will simply kill the impacted VM. You can disable this behavior, although we do not recommend doing so, by setting the advanced host setting VSAN.AutoTerminateGhostVm to 0.
HA and Virtual Volumes
Let us start by describing what Virtual Volumes is and what value it brings for an administrator. Virtual Volumes was developed to make your life (as a vSphere admin) and that of the storage administrator easier. This is done by providing a framework that enables the vSphere administrator to assign policies to virtual machines or virtual disks. In these policies, capabilities of the storage array can be defined. These capabilities can be things like snapshotting, deduplication, RAID level, thin/thick provisioning, etc. What is offered to the vSphere administrator is up to the storage administrator, and of course up to what the storage system can offer to begin with. In the screenshot below we show an example of some of the capabilities Nimble exposes through policy.
When a virtual machine is deployed and a policy is assigned, the storage system will enable certain functionality of the array based on what was specified in the policy. So there is no longer a need to assign capabilities to a LUN which holds many VMs; instead you have per-VM or even per-VMDK level control. So how does this work? Well, let’s take a look at an architectural diagram first.
The diagram shows a couple of components which are important in the VVol architecture. Let’s list them out:
- Protocol Endpoints aka PE
- Virtual Datastore and a Storage Container
- Vendor Provider / VASA
- Policies
- Virtual Volumes
Let’s take a look at these components in the above order. Protocol Endpoints, what are they?
Protocol Endpoints are literally the access points to your storage system. All IO to virtual volumes is proxied through a Protocol Endpoint, and you can have 1 or more of these per storage system, if your storage system supports having multiple of course. (Implementations of different vendors will vary.) PEs are compatible with different protocols (FC, FCoE, iSCSI, NFS) and, if you ask me, with Virtual Volumes that whole protocol discussion will come to an end. You could see a Protocol Endpoint as a “mount point” or a device, and yes, they will count towards your maximum number of devices per host (256). (Virtual Volumes themselves won’t count towards that!)
Next up is the Storage Container. This is the place where you store your virtual machines, or better said where your virtual volumes end up. The Storage Container is a storage system logical construct and is represented within vSphere as a “virtual datastore”. You need 1 per storage system, but you can have many when desired. To this Storage Container you can apply capabilities. So if you like your virtual volumes to be able to use array based snapshots then the storage administrator will need to assign that capability to the storage container. Note that a storage administrator can grow a storage container without even informing you. A storage container isn’t formatted with VMFS or anything like that, so you don’t need to increase the volume in order to use the space.
But how does vSphere know which container is capable of doing what? In order to discover a storage container and its capabilities, we need to be able to talk to the storage system first. This is done through the vSphere APIs for Storage Awareness. You simply point vSphere to the Vendor Provider and the Vendor Provider will report to vSphere what’s available; this includes both the storage containers and the capabilities they possess. Note that a single Vendor Provider can be managing multiple storage systems, which in turn can have multiple storage containers with many capabilities. These Vendor Providers can also come in different flavours: for some storage systems it is part of their software, but for others it will come as a virtual appliance that sits on top of vSphere.
Now that vSphere knows which systems there are and what containers are available with which capabilities, you can start creating policies. These policies can be a combination of capabilities and will ultimately be assigned to virtual machines or even virtual disks. You can imagine that in some cases you would like Quality of Service enabled to ensure performance for a VM, while in other cases it isn’t as relevant but you need to have a snapshot every hour. All of this is enabled through these policies. No longer will you be maintaining that spreadsheet with all your LUNs and which data services were enabled and what not; you simply assign a policy. (Yes, a proper naming scheme will be helpful when defining policies.) When requirements change for a VM you don’t move the VM around; you change the policy and the storage system will do what is required in order to make the VM (and its disks) compliant again with the policy. Not the VM really, but the Virtual Volumes.
Okay, those are the basics. Now what about Virtual Volumes and vSphere HA? What changes when you are running Virtual Volumes, and what do you need to keep in mind when it comes to HA?
First of all, let me mention this: in some cases storage vendors have designed a solution where the "vendor provider" isn't implemented in an HA fashion (VMware allows for Active/Active, Active/Standby or just "Active" as in a single instance). Make sure to validate what kind of implementation your storage vendor has, as the Vendor Provider needs to be available when powering on VMs. The following quote explains why:
When a Virtual Volume is created, it is not immediately accessible for IO. To Access Virtual Volumes, vSphere needs to issue a “Bind” operation to a VASA Provider (VP), which creates IO access point for a Virtual Volume on a Protocol Endpoint (PE) chosen by a VP. A single PE can be the IO access point for multiple Virtual Volumes. “Unbind” Operation will remove this IO access point for a given Virtual Volume.
That is the "Virtual Volumes" implementation aspect, but of course things have also changed from a vSphere HA point of view. No longer do we have VMFS or NFS datastores to store files on or use for heartbeating. What changes from that perspective. First of all a VM is carved up in different Virtual Volumes:
- VM Configuration
- Virtual Machine Disks
- Swap File
- Snapshot (if there are any)
Besides these different types of objects, when vSphere HA is enabled there is also a volume used by vSphere HA. This volume will contain all the metadata which is normally stored under "//.vSphere-HA//" on regular VMFS. For each Fault Domain a separate folder will be created in this VVol, as shown in the screenshot below.
All VM-related HA files which would normally be under the VM folder, like for instance the power-on file, heartbeat files and the protectedlist, are now stored in the VM Configuration VVol object. Conceptually speaking this is similar to regular VMFS; implementation-wise, however, it is completely different.
The power-off file however, which is used to indicate that a VM has been powered-off due to an isolation event, is not stored under the .vSphere-HA folder any longer, but is stored in the VM config VVol (in the UI exposed as the VVol VM folder) as shown in the screenshot below. The same applies for vSAN, where it is now stored in the VM Namespace object, and for traditional storage (NFS or VMFS) it is stored in the VM folder. This change was made when Virtual Volumes was introduced and done to keep the experience consistent across storage platforms.
And that explains the differences between traditional storage systems using VMFS / NFS and new storage systems leveraging Virtual Volumes or even a full vSAN based solution.
---------------------------------------------------------Subtopic-----------------------------------------------------------------
Adding Resiliency to HA (Network Redundancy)
In the previous chapter we extensively covered Isolation Detection, which triggers the selected Isolation Response, and the impact of a false positive. The Isolation Response enables HA to restart virtual machines when “Power off” or “Shut down” has been selected and the host becomes isolated from the network. However, this also means that it is possible that, without proper redundancy, the Isolation Response may be unnecessarily triggered. This leads to downtime and should be prevented.
To increase resiliency for networking, VMware implemented the concept of NIC teaming in the hypervisor for both VMkernel and virtual machine networking. When discussing HA, this is especially important for the Management Network.
NIC teaming is the process of grouping together several physical NICs into one single logical NIC, which can be used for network fault tolerance and load balancing.
Using this mechanism, it is possible to add redundancy to the Management Network to decrease the chances of an isolation event. This is, of course, also possible for other “Portgroups” but that is not the topic of this chapter or book. Another option is configuring an additional Management Network by enabling the “management network” tick box on another VMkernel port. A little understood fact is that if there are multiple VMkernel networks on the same subnet, HA will use all of them for management traffic, even if only one is specified for management traffic!
Although there are many configurations possible and supported, we recommend a simple but highly resilient configuration. We have included the vMotion (VMkernel) network in our example as combining the Management Network and the vMotion network on a single vSwitch is the most commonly used configuration and an industry accepted best practice.
Requirements:
- 2 physical NICs
- VLAN trunking
Recommended:
- 2 physical switches
- If available, enable “link state tracking” to ensure link failures are reported
The vSwitch should be configured as follows:
- vSwitch0: 2 Physical NICs (vmnic0 and vmnic1).
- 2 Portgroups (Management Network and vMotion VMkernel).
- Management Network active on vmnic0 and standby on vmnic1.
- vMotion VMkernel active on vmnic1 and standby on vmnic0.
- Failback set to No.
Each portgroup has a VLAN ID assigned and runs dedicated on its own physical NIC; only in the case of a failure is it switched over to the standby NIC. We highly recommend setting failback to “No” to avoid the chance of an unwanted isolation event, which can occur when a physical switch routes no traffic during boot but the ports are reported as “up”. (NIC Teaming Tab)
Pros: Only 2 NICs in total are needed for the Management Network and vMotion VMkernel, especially useful in blade server environments. Easy to configure.
Cons: Just a single active path for heartbeats.
The following diagram depicts this active/standby scenario:
To increase resiliency, we also recommend implementing the following advanced settings and using NIC ports on different PCI busses – preferably NICs of a different make and model. When using a different make and model, even a driver failure could be mitigated.
Advanced Settings:
das.isolationaddressX =
The isolation address setting is discussed in more detail in the section titled "Fundamental Concepts". In short; it is the IP address that the HA agent pings to identify if the host is completely isolated from the network or just not receiving any heartbeats. If multiple VMkernel networks on different subnets are used, it is recommended to set an isolation address per network to ensure that each of these will be able to validate isolation of the host.
Basic design principle: Take advantage of some of the basic features vSphere has to offer like NIC teaming. Combining different physical NICs will increase overall resiliency of your solution.
Corner Case Scenario: Split-Brain
A split brain scenario is a scenario where a single virtual machine is powered up multiple times, typically on two different hosts. This is possible in the scenario where the isolation response is set to “Disabled” and network based storage, like NFS / iSCSI and even Virtual SAN, is used. This situation can occur during a full network isolation, which may result in the lock on the virtual machine’s VMDK being lost, enabling HA to actually power up the virtual machine. As the virtual machine was not powered off on its original host (isolation response set to “Disabled”), it will exist in memory on the isolated host and in memory with a disk lock on the host that was requested to restart the virtual machine.
Keep in mind that this truly is a corner case scenario which is very unlikely to occur in most environments. In case it does happen, HA relies on the “lost lock detection” mechanism to mitigate this scenario. In short ESXi detects that the lock on the VMDK has been lost and, when the datastore becomes accessible again and the lock cannot be reacquired, issues a question whether the virtual machine should be powered off; HA automatically answers the question with Yes. However, you will only see this question if you directly connect to the ESXi host during the failure. HA will generate an event for this auto-answered question though.
As stated above the question will be auto-answered and the virtual machine will be powered off to recover from the split brain scenario. The question still remains: in the case of an isolation with iSCSI or NFS, should you power off virtual machines or leave them powered on?
As just explained, HA will automatically power off your original virtual machine when it detects a split-brain scenario. This process however is not instantaneous, and as such it is recommended to use the isolation response of “Power Off” or “Disabled”. We also recommend increasing heartbeat network resiliency to avoid getting into this situation. We will discuss the options you have for enhancing Management Network resiliency in the next chapter.
Link State Tracking
This was already briefly mentioned in the list of recommendations, but this feature is something we would like to emphasize. We have noticed that people often forget about this even though many switches offer this capability, especially in blade server environments.
Link state tracking will mirror the state of an upstream link to a downstream link. Let’s clarify that with a diagram.
The diagram above depicts a scenario where an uplink of a “Core Switch” has failed. Without Link State Tracking, the connection from the “Edge Switch” to vmnic0 will be reported as up. With Link State Tracking enabled, the state of the link on the “Edge Switch” will reflect the state of the link on the “Core Switch” and as such be marked as “down”. You might wonder why this is important, but think about it for a second. Many features that vSphere offers rely on networking, and so do your virtual machines. In the case where the state is not reflected, some functionality might just fail; for instance, network heartbeating could fail if it needs to flow through the core switch. We call this a ‘black hole’ scenario: the host sends traffic down a path that it believes is up, but the traffic never reaches its destination due to the failed upstream link.
Basic design principle: Know your network environment, talk to the network administrators and ensure advanced features like Link State Tracking are used when possible to increase resiliency.
-----------------------------------------------------------------------------Sub-topic---------------------------------------------------------------------
Admission Control
Admission Control is more than likely the most misunderstood concept vSphere holds today and because of this it is often disabled. However, Admission Control is a must when availability needs to be guaranteed and isn’t that the reason for enabling HA in the first place?
What is HA Admission Control about? Why does HA contain this concept called Admission Control? The “Availability Guide” a.k.a HA bible states the following:
vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected.
Please read that quote again and especially the first two words. Indeed it is vCenter that is responsible for Admission Control, contrary to what many believe. Although this might seem like a trivial fact it is important to understand that this implies that Admission Control will not disallow HA initiated restarts. HA initiated restarts are done on a host level and not through vCenter.
As said, Admission Control guarantees that capacity is available for an HA initiated failover by reserving resources within a cluster. It calculates the capacity required for a failover based on available resources. In other words, if a host is placed into maintenance mode or disconnected, it is taken out of the equation. This also implies that if a host has failed or is not responding but has not been removed from the cluster, it is still included in the equation. “Available Resources” indicates that the virtualization overhead has already been subtracted from the total amount.
To give an example: VMkernel memory is subtracted from the total amount of memory to obtain the memory available for virtual machines. There is one gotcha with Admission Control that we want to bring to your attention before drilling into the different policies. When Admission Control is enabled, HA will in no way violate availability constraints. This means that it will always ensure multiple hosts are up and running, and this applies to manual maintenance mode actions and, for instance, to VMware Distributed Power Management. So, if a host is stuck trying to enter Maintenance Mode, remember that it might be HA which is not allowing Maintenance Mode to proceed as it would violate the Admission Control Policy. In this situation, users can manually vMotion virtual machines off the host or temporarily disable admission control to allow the operation to proceed.
But what if you use something like Distributed Power Management (DPM), would that place all hosts in standby mode to reduce power consumption? No, DPM is smart enough to take hosts out of standby mode to ensure enough resources are available to provide for HA initiated failovers. If by any chance the resources are not available, HA will wait for these resources to be made available by DPM and then attempt the restart of the virtual machines. In other words, the retry count (5 retries by default) is not wasted in scenarios like these.
Admission Control Policy
The Admission Control Policy dictates the mechanism that HA uses to guarantee enough resources are available for an HA initiated failover. This section gives a general overview of the available Admission Control Policies. The impact of each policy is described in the following section, including our recommendation. HA has three mechanisms to guarantee enough capacity is available to respect virtual machine resource reservations.
Below we have listed all three options currently available as the Admission Control Policy. Each option has a different mechanism to ensure resources are available for a failover and each option has its caveats.
Admission Control Mechanisms
Each Admission Control Policy has its own Admission Control mechanism. Understanding each of these Admission Control mechanisms is important to appreciate the impact each one has on your cluster design. For instance, setting a reservation on a specific virtual machine can have an impact on the achieved consolidation ratio. This section will take you on a journey through the trenches of Admission Control Policies and their respective mechanisms and algorithms.
Host Failures Cluster Tolerates
The Admission Control Policy that has been around the longest is the “Host Failures Cluster Tolerates” policy. It is also historically the least understood Admission Control Policy due to its complex admission control mechanism.
This admission control policy can be configured in an N-1 fashion. This means that the number of host failures you can specify in a 32 host cluster is 31.
Within the vSphere Web Client it is possible to manually specify the slot size as can be seen in the below screenshot. The vSphere Web Client also allows you to view which virtual machines span multiple slots. This can be very useful in scenarios where the slot size has been explicitly specified, we will explain why in just a second.
The so-called “slots” mechanism is used when the “Host failures cluster tolerates” has been selected as the Admission Control Policy. The details of this mechanism have changed several times in the past and it is one of the most restrictive policies; more than likely, it is also the least understood.
Slots dictate how many virtual machines can be powered on before vCenter starts yelling “Out Of Resources!” Normally, a slot represents one virtual machine. Admission Control does not limit HA in restarting virtual machines, it ensures enough unfragmented resources are available to power on all virtual machines in the cluster by preventing “over-commitment”. Technically speaking “over-commitment” is not the correct terminology as Admission Control ensures virtual machine reservations can be satisfied and that all virtual machines’ initial memory overhead requirements are met. Although we have already touched on this, it doesn’t hurt repeating it as it is one of those myths that keeps coming back; HA initiated failovers are not prone to the Admission Control Policy. Admission Control is done by vCenter. HA initiated restarts, in a normal scenario, are executed directly on the ESXi host without the use of vCenter. The corner-case is where HA requests DRS (DRS is a vCenter task!) to defragment resources but that is beside the point. Even if resources are low and vCenter would complain, it couldn’t stop the restart from happening.
Let’s dig in to this concept we have just introduced, slots.
A slot is defined as a logical representation of the memory and CPU resources that satisfy the reservation requirements for any powered-on virtual machine in the cluster.
In other words a slot is the worst case CPU and memory reservation scenario in a cluster. This directly leads to the first “gotcha.”
HA uses the highest CPU reservation of any given powered-on virtual machine and the highest memory reservation of any given powered-on virtual machine in the cluster. If no reservation of higher than 32 MHz is set, HA will use a default of 32 MHz for CPU. If no memory reservation is set, HA will use a default of 0 MB+memory overhead for memory. (See the VMware vSphere Resource Management Guide for more details on memory overhead per virtual machine configuration.) The following example will clarify what “worst-case” actually means.
Example: If virtual machine “VM1” has 2 GHz of CPU reserved and 1024 MB of memory reserved and virtual machine “VM2” has 1 GHz of CPU reserved and 2048 MB of memory reserved the slot size for memory will be 2048 MB (+ its memory overhead) and the slot size for CPU will be 2 GHz. It is a combination of the highest reservation of both virtual machines that leads to the total slot size. Reservations defined at the Resource Pool level however, will not affect HA slot size calculations.
Basic design principle: Be really careful with reservations, if there’s no need to have them on a per virtual machine basis; don’t configure them, especially when using host failures cluster tolerates. If reservations are needed, resort to resource pool based reservations.
Now that we know the worst-case scenario is always taken into account when it comes to slot size calculations, we will describe what dictates the amount of available slots per cluster as that ultimately dictates how many virtual machines can be powered on in your cluster.
First, we will need to know the slot size for memory and CPU, next we will divide the total available CPU resources of a host by the CPU slot size and the total available memory resources of a host by the memory slot size. This leaves us with a total number of slots for both memory and CPU for a host. The most restrictive number (worst-case scenario) is the number of slots for this host. In other words, when you have 25 CPU slots but only 5 memory slots, the amount of available slots for this host will be 5 as HA always takes the worst case scenario into account to “guarantee” all virtual machines can be powered on in case of a failure or isolation.
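The arithmetic described above is easy to verify for yourself. The sketch below uses the VM1/VM2 example from the text; the memory overhead values and the host capacities are illustrative assumptions, not figures from the book.

```python
# Worst-case slot size as described above: the highest CPU reservation and the
# highest memory reservation (+ overhead) of any powered-on VM in the cluster.
def slot_size(vms, cpu_floor_mhz=32):
    cpu = max([vm["cpu_res_mhz"] for vm in vms] + [cpu_floor_mhz])
    mem = max(vm["mem_res_mb"] + vm["overhead_mb"] for vm in vms)
    return cpu, mem

def slots_per_host(host_cpu_mhz, host_mem_mb, slot_cpu_mhz, slot_mem_mb):
    # The most restrictive resource dictates the number of slots for the host.
    return min(host_cpu_mhz // slot_cpu_mhz, host_mem_mb // slot_mem_mb)

# The VM1/VM2 example from the text; overhead and host capacities are illustrative.
vms = [
    {"cpu_res_mhz": 2000, "mem_res_mb": 1024, "overhead_mb": 100},  # VM1
    {"cpu_res_mhz": 1000, "mem_res_mb": 2048, "overhead_mb": 100},  # VM2
]
slot_cpu, slot_mem = slot_size(vms)
print(slot_cpu, slot_mem)                                 # 2000 (MHz) 2148 (MB)
print(slots_per_host(24000, 32768, slot_cpu, slot_mem))   # min(12, 15) = 12
```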
The question we receive a lot is: how do I know what my slot size is? The details around slot sizes can be monitored in the HA section of the cluster’s Monitor tab by checking the “Advanced Runtime Info” section when the “Host Failures” Admission Control Policy is configured.
Advanced Runtime Info will show the specifics of the slot size and more useful details, such as the number of slots available, as depicted in Figure 30.
As you can imagine, using reservations on a per virtual machine basis can lead to very conservative consolidation ratios. However, this is configurable through the Web Client. If you have just one virtual machine with a really high reservation, you can set an explicit slot size by going to "Edit Cluster Services" and specifying it under the Admission Control Policy section as shown in Figure 29.
If one of these advanced settings is used, HA will ensure that the virtual machine that skewed the numbers can be restarted by "assigning" multiple slots to it. However, when you are low on resources, this could mean that you are not able to power on the virtual machine with this reservation because resources may be fragmented throughout the cluster instead of available on a single host. HA will notify DRS that a power-on attempt was unsuccessful and a request will be made to defragment the resources to accommodate the remaining virtual machines that need to be powered on. In order for this to be successful, DRS will need to be enabled and configured in fully automated mode. When it is not configured in fully automated mode, user action is required to execute DRS recommendations.
The following diagram depicts a scenario where a virtual machine spans multiple slots:
Notice that because the memory slot size has been manually set to 1024 MB, one of the virtual machines (grouped with dotted lines) spans multiple slots due to a 4 GB memory reservation. As you might have noticed, none of the hosts has enough resources available to satisfy the reservation of the virtual machine that needs to failover. Although in total there are enough resources available, they are fragmented and HA will not be able to power-on this particular virtual machine directly but will request DRS to defragment the resources to accommodate this virtual machine’s resource requirements.
Admission Control does not take fragmentation of slots into account when slot sizes are manually defined with advanced settings. It will take the number of slots this virtual machine will consume into account by subtracting them from the total number of available slots, but it will not verify the amount of available slots per host to ensure failover. As stated earlier, though, HA will request DRS to defragment the resources. This is by no means a guarantee of a successful power-on attempt.
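Using the numbers from the scenario above (a manually set 1024 MB memory slot size and a 4 GB reservation), the number of slots such a virtual machine consumes can be sketched as follows (a hypothetical helper, not an HA API):

```python
import math

def slots_consumed(mem_res_mb, mem_slot_mb):
    """Illustrative only: a VM whose reservation exceeds the slot size spans multiple slots."""
    return math.ceil(mem_res_mb / mem_slot_mb)

print(slots_consumed(mem_res_mb=4096, mem_slot_mb=1024))  # 4 slots
```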
Basic design principle: Avoid using advanced settings to decrease the slot size as it could lead to more downtime and adds an extra layer of complexity. If there is a large discrepancy in size and reservations we recommend using the percentage based admission control policy.
Within the vSphere Web Client there is functionality which enables you to identify virtual machines that span multiple slots, as shown in Figure 29. We highly recommend monitoring this section on a regular basis to get a better understanding of your environment and to identify those virtual machines that might be problematic to restart in case of a host failure.
Unbalanced Configurations and Impact on Slot Calculation
It is an industry best practice to create clusters with similar hardware configurations. However, many companies started out with a small VMware cluster when virtualization was first introduced. When the time comes to expand, chances are fairly high that the same hardware configuration is no longer available. The question is: will you add the newly bought hosts to the same cluster or create a new cluster?
From a DRS perspective, large clusters are preferred as it increases the load balancing opportunities. However there is a caveat for DRS as well, which is described in the DRS section of this book. For HA, there is a big caveat. When you think about it and understand the internal workings of HA, more specifically the slot algorithm, you probably already know what is coming up.
Let’s first define the term “unbalanced cluster.”
An unbalanced cluster would, for instance, be a cluster with 3 hosts of which one contains substantially more memory than the other hosts in the cluster.
Let’s try to clarify that with an example.
Example:
What would happen to the total number of slots in a cluster of the following specifications?
- Three host cluster
- Two hosts have 16 GB of available memory
- One host has 32 GB of available memory
The third host is a brand new host that has just been bought and as prices of memory dropped immensely the decision was made to buy 32 GB instead of 16 GB.
The cluster contains a virtual machine that has 1 vCPU and 4 GB of memory. A 1024 MB memory reservation has been defined on this virtual machine. As explained earlier, a reservation will dictate the slot size, which in this case leads to a memory slot size of 1024 MB + memory overhead. For the sake of simplicity, we will calculate with 1024 MB. The following diagram depicts this scenario:
When Admission Control is enabled and the number of host failures has been selected as the Admission Control Policy, the number of slots will be calculated per host and the cluster in total. This will result in:
| Host | Number of slots |
|---|---|
| ESXi-01 | 16 Slots |
| ESXi-02 | 16 Slots |
| ESXi-03 | 32 Slots |
As Admission Control is enabled, a worst-case scenario is taken into account. When a single host failure has been specified, this means that the host with the largest number of slots will be taken out of the equation. In other words, for our cluster, this would result in:
ESXi-01 + ESXi-02 = 32 slots available
Although you have doubled the amount of memory in one of your hosts, you are still stuck with only 32 slots in total. As clearly demonstrated, there is absolutely no point in buying additional memory for a single host when your cluster is designed with Admission Control enabled and the number of host failures has been selected as the Admission Control Policy.
In our example, the memory slot size happened to be the most restrictive; however, the same principle applies when CPU slot size is most restrictive.
Basic design principle: When using admission control, balance your clusters and be conservative with reservations, as unbalanced clusters and large reservations lead to decreased consolidation ratios.
Now, what would happen in the scenario above when the number of allowed host failures is set to 2? In this case ESXi-03 is taken out of the equation and one of the remaining hosts in the cluster is also taken out, resulting in 16 slots. This makes sense, doesn't it?
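Both results can be reproduced with a short sketch that simply removes the host(s) with the largest number of slots from the equation (an illustration of the worst-case logic, not HA's code):

```python
def cluster_slots(slots_per_host, host_failures_to_tolerate):
    """Illustrative only: remove the largest host(s) first, then sum what remains."""
    remaining = sorted(slots_per_host, reverse=True)[host_failures_to_tolerate:]
    return sum(remaining)

slots = [16, 16, 32]            # ESXi-01, ESXi-02, ESXi-03
print(cluster_slots(slots, 1))  # 32 -> ESXi-03 is taken out of the equation
print(cluster_slots(slots, 2))  # 16 -> ESXi-03 plus one 16-slot host are taken out
```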
Can you avoid large HA slot sizes due to reservations without resorting to advanced settings? That’s the question we get almost daily and the answer is the “Percentage of Cluster Resources Reserved” admission control mechanism.
Percentage of Cluster Resources Reserved
The Percentage of Cluster Resources Reserved admission control policy is one of the most used admission control policies. The simple reason for this is that it is the least restrictive and most flexible. It is also very easy to configure as shown in the screenshot below.
The main advantage of the percentage based Admission Control Policy is that it avoids the commonly experienced slot size issue where values are skewed due to a large reservation. But if it doesn’t use the slot algorithm, what does it use?
When you specify a percentage, and let’s assume for now that the percentage for CPU and memory will be configured equally, that percentage of the total amount of available resources will stay reserved for HA purposes. First of all, HA will add up all available resources to see how much it has available (virtualization overhead will be subtracted) in total. Then, HA will calculate how much resources are currently reserved by adding up all reservations for memory and for CPU for all powered on virtual machines.
For those virtual machines that do not have a reservation, a default of 32 MHz will be used for CPU and a default of 0 MB+memory overhead will be used for Memory. (Amount of overhead per configuration type can be found in the “Understanding Memory Overhead” section of the Resource Management guide.)
In other words:
((total amount of available resources - total reserved virtual machine resources) / total amount of available resources) >= (percentage HA should reserve as spare capacity)
Power-on requests are only admitted while this condition continues to hold.
Total reserved virtual machine resources includes the default reservation of 32 MHz and the memory overhead of the virtual machine.
Let’s use a diagram to make it a bit clearer:
Total cluster resources are 24GHz (CPU) and 96GB (MEM). This would lead to the following calculations:
((24 GHz - (2 GHz + 1 GHz + 32 MHz + 4 GHz)) / 24 GHz) ≈ 71% available
((96 GB - (1.1 GB + 114 MB + 626 MB + 3.2 GB)) / 96 GB) ≈ 95% available
As you can see, the amount of memory differs from the diagram. Even if a reservation has been set, the amount of memory overhead is added to the reservation. This example also demonstrates how keeping CPU and memory percentage equal could create an imbalance. Ideally, of course, the hosts are provisioned in such a way that there is no CPU/memory imbalance. Experience over the years has proven, unfortunately, that most environments run out of memory resources first and this might need to be factored in when calculating the correct value for the percentage. However, this trend might be changing as memory is getting cheaper every day.
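The same calculation can be written out as a short sketch; the reservations and overheads are the ones from the example, and the available_pct helper is our own illustration (the exact rounding may differ from the figure):

```python
def available_pct(total, reservations):
    """Illustrative only: fraction of unreserved resources, expressed as a percentage."""
    return (total - sum(reservations)) / total * 100

cpu_avail = available_pct(24_000, [2_000, 1_000, 32, 4_000])              # MHz
mem_avail = available_pct(96 * 1024, [1.1 * 1024, 114, 626, 3.2 * 1024])  # MB

print(round(cpu_avail), round(mem_avail))  # ~71% CPU and ~95% memory available
# A power-on is disallowed when either value would drop below the configured percentage.
```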
In order to ensure virtual machines can always be restarted, Admission Control will constantly monitor if the policy has been violated or not. Please note that this Admission Control process is part of vCenter and not of the ESXi host! When one of the thresholds is reached, memory or CPU, Admission Control will disallow powering on any additional virtual machines as that could potentially impact availability. These thresholds can be monitored on the HA section of the Cluster’s summary tab.
If you have an unbalanced cluster (hosts with different sizes of CPU or memory resources), your percentage should be equal to or, preferably, larger than the percentage of resources provided by the largest host. This way you ensure that all virtual machines residing on this host can be restarted in case of a host failure.
As explained earlier, this Admission Control Policy does not use slots. As such, resources might be fragmented throughout the cluster. Although DRS is notified to rebalance the cluster, if needed, to accommodate these virtual machines' resource requirements, a guarantee cannot be given. We recommend selecting the highest restart priority for such a virtual machine (depending on the SLA, of course) to ensure it will be able to boot.
The following example and diagram (Figure 37) will make it more obvious: You have 3 hosts, each with roughly 80% memory usage, and you have configured HA to reserve 20% of resources for both CPU and memory. A host fails and all virtual machines will need to fail over. One of those virtual machines has a 4 GB memory reservation. As you can imagine, HA will not be able to initiate a power-on attempt, as there are not enough memory resources available to guarantee the reserved capacity. Instead, an event will be generated indicating "not enough resources for failover" for this virtual machine.
Basic design principle: Although HA will utilize DRS to try to accommodate the resource requirements of this virtual machine, a guarantee cannot be given. Do the math; verify that any single host has enough resources to power on your largest virtual machine. Also take restart priority into account for this/these virtual machine(s).
Failover Hosts
The third option one could choose is to select one or multiple designated Failover hosts. This is commonly referred to as a hot standby.
It is "what you see is what you get". When you designate hosts as failover hosts, they will not participate in DRS and you will not be able to run virtual machines on these hosts! Not even in a two-host cluster when placing one of the two hosts in maintenance mode. These hosts are literally reserved for failover situations. HA will attempt to use these hosts first to fail over the virtual machines. If, for whatever reason, this is unsuccessful, it will attempt a failover on any of the other hosts in the cluster. For example, if three hosts were to fail, including the hosts designated as failover hosts, HA would still try to restart the impacted virtual machines on the host that is left. Although this host was not a designated failover host, HA will use it to limit downtime.
Decision Making Time
As with any decision you make, there is an impact on your environment. This impact could be positive, but it could also be unexpected. This especially goes for HA Admission Control. Selecting the right Admission Control Policy can lead to a quicker Return On Investment and a lower Total Cost of Ownership. In the previous section, we described all the algorithms and mechanisms that form Admission Control; in this section we will focus more on the design considerations around selecting the appropriate Admission Control Policy for your or your customer's environment.
The first decision that will need to be made is whether Admission Control will be enabled. We generally recommend enabling Admission Control as it is the only way of guaranteeing your virtual machines will be allowed to restart after a failure. It is important, though, that the policy is carefully selected and fits your or your customer’s requirements.
Basic design principle: Admission control guarantees enough capacity is available for virtual machine failover. As such we recommend enabling it.
Although we already have explained all the mechanisms that are being used by each of the policies in the previous section, we will give a high level overview and list all the pros and cons in this section. On top of that, we will expand on what we feel is the most flexible Admission Control Policy and how it should be configured and calculated.
Host Failures Cluster Tolerates
This option is, historically speaking, the most used Admission Control policy. Most environments are designed with N+1 redundancy, and N+2 is also not uncommon. This Admission Control Policy uses "slots" to ensure enough capacity is reserved for failover, which is a fairly complex mechanism. Slots are based on VM-level reservations; if reservations are not used, a default slot size of 32 MHz is defined for CPU, and for memory the largest memory overhead of any given virtual machine is used.
Pros:
- Fully automated (When a host is added to a cluster, HA re-calculates how many slots are available.)
- Guarantees failover by calculating slot sizes.
Cons:
- Can be very conservative and inflexible when reservations are used as the largest reservation dictates slot sizes.
- Unbalanced clusters lead to wastage of resources.
- Complex for the administrator from a calculation perspective.
Percentage as Cluster Resources Reserved
The percentage based Admission Control is based on per-reservation calculation instead of the slots mechanism. The percentage based Admission Control Policy is less conservative than “Host Failures” and more flexible than “Failover Hosts”.
Pros:
- Accurate as it considers actual reservation per virtual machine to calculate available failover resources.
- Cluster dynamically adjusts when resources are added.
Cons:
- Manual recalculation of the percentage is needed when hosts are added to the cluster and the number of tolerated host failures needs to remain unchanged.
- Unbalanced clusters can be a problem when chosen percentage is too low and resources are fragmented, which means failover of a virtual machine can’t be guaranteed as the reservation of this virtual machine might not be available as a block of resources on a single host.
Please note that, although a failover cannot be guaranteed, there are few scenarios where a virtual machine will not be able to restart due to the integration HA offers with DRS and the fact that most clusters have spare capacity available to account for virtual machine demand variance. Although this is a corner-case scenario, it needs to be considered in environments where absolute guarantees must be provided.
Specify Failover Hosts
With the “Specify Failover Hosts” Admission Control Policy, when one or multiple hosts fail, HA will attempt to restart all virtual machines on the designated failover hosts. The designated failover hosts are essentially “hot standby” hosts. In other words, DRS will not migrate virtual machines to these hosts when resources are scarce or the cluster is imbalanced.
Pros:
- What you see is what you get.
- No fragmented resources.
Cons:
- What you see is what you get.
- Dedicated failover hosts not utilized during normal operations.
Recommendations
We have been asked many times for our recommendation on Admission Control and it is difficult to answer as each policy has its pros and cons. However, we generally recommend a Percentage based Admission Control Policy. It is the most flexible policy as it uses the actual reservation per virtual machine instead of taking a “worst case” scenario approach like the number of host failures does. However, the number of host failures policy guarantees the failover level under all circumstances. Percentage based is less restrictive, but offers lower guarantees that in all scenarios HA will be able to restart all virtual machines. With the added level of integration between HA and DRS we believe a Percentage based Admission Control Policy will fit most environments.
Basic design principle: Do the math, and take customer requirements into account. We recommend using a “percentage” based admission control policy, as it is the most flexible.
Now that we have recommended which Admission Control Policy to use, the next step is to provide guidance around selecting the correct percentage. We cannot tell you what the ideal percentage is as that totally depends on the size of your cluster and, of course, on your resiliency model (N+1 vs. N+2). We can, however, provide guidelines around calculating how much of your resources should be set aside and how to prevent wasting resources.
Selecting the Right Percentage
It is a common strategy to reserve the equivalent of a single host as the percentage of resources reserved for failover. We generally recommend selecting a percentage which is the equivalent of a single host or multiple hosts. Let's explain why and what the impact is of not using the equivalent of a single or multiple hosts.
Let's start with an example: a cluster consists of 8 ESXi hosts, each containing 70 GB of available RAM. This might sound like an awkward memory configuration, but to simplify things we have already subtracted 2 GB as virtualization overhead. Although virtualization overhead is probably less than 2 GB, we have used this number to make calculations easier. This example zooms in on memory, but this concept also applies to CPU, of course.
For this cluster we will set the percentage of resources to reserve for both memory and CPU to 20%. For memory, this leaves 448 GB of the cluster's 560 GB available for powering on virtual machines:
(70 GB + 70 GB + 70 GB + 70 GB + 70 GB + 70 GB + 70 GB + 70 GB) * (1 – 20%)
A total of 112 GB of memory is reserved as failover capacity.
Once a percentage is specified, that percentage of resources will be unavailable for virtual machines; therefore, it makes sense to set the percentage as close as possible to the value that equals the resources a single host (or multiple hosts) represents. We will demonstrate why this is important in subsequent examples.
In the example above, 20% of the resources were reserved in an 8-host cluster. This configuration reserves more resources than a single host contributes to the cluster. HA's main objective is to provide automatic recovery for virtual machines after a physical server failure. For this reason, it is recommended to reserve resources equal to a single host or multiple hosts. When using per-host granularity in an 8-host cluster (homogeneously configured hosts), the resource contribution per host to the cluster is 12.5%. However, the percentage used must be an integer (whole number), so it is recommended to round up to the value that guarantees the full capacity of one host is protected; in this example (Figure 40), the conservative approach would lead to a percentage of 13%.
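The conservative percentage can be derived with a trivial helper (hypothetical; it simply rounds the per-host contribution up to a whole number):

```python
import math

def failover_percentage(num_hosts, host_failures=1):
    """Illustrative only: reserve the equivalent of N hosts, rounded up to an integer percentage."""
    return math.ceil(host_failures / num_hosts * 100)

print(failover_percentage(8))     # 13 -> N+1 in an 8-host cluster
print(failover_percentage(8, 2))  # 25 -> N+2 in an 8-host cluster
print(failover_percentage(12))    # 9  -> N+1 in a 12-host cluster
```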
Aggressive Approach
We have seen many environments where the percentage was set to a value that was less than the contribution of a single host to the cluster. Although this approach reduces the amount of resources reserved for accommodating host failures and results in higher consolidation ratios, it also offers a lower guarantee that HA will be able to restart all virtual machines after a failure. One might argue that this approach will more than likely work, as most environments are not fully utilized; however, it also eliminates the guarantee that after a failure all virtual machines will be recovered. Wasn't that the reason for enabling HA in the first place?
Adding Hosts to Your Cluster
Although the percentage is dynamic and calculates capacity at a cluster level, changes to your selected percentage might be required when expanding the cluster. The reason is that the amount of resources reserved for a failover might no longer correspond with the contribution per host and, as a result, lead to resource wastage. For example, adding 4 hosts to an 8-host cluster and continuing to use the previously configured admission control value of 13% will result in a failover capacity that is equivalent to 1.5 hosts. Figure 41 depicts a scenario where an 8-host cluster is expanded to 12 hosts. Each host holds eight 2 GHz cores and 70 GB of memory. The cluster was originally configured with admission control set to 13%, which equals 109.2 GB and 24.96 GHz. A single host failure in a 12-host cluster only requires 9% (8.33% rounded up), so compared with that value 7.68 GHz and 33.6 GB are "wasted", as clearly demonstrated in the diagram below.
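The "wasted" figures quoted above follow from comparing the old 13% setting with the 9% that a single host failure requires in a 12-host cluster; a quick sketch with the host sizes from the example:

```python
hosts = 12
host_mem_gb = 70
host_cpu_ghz = 8 * 2          # eight 2 GHz cores per host

total_mem_gb = hosts * host_mem_gb    # 840 GB
total_cpu_ghz = hosts * host_cpu_ghz  # 192 GHz

old_pct, needed_pct = 0.13, 0.09      # 13% carried over vs. 9% needed for N+1
print(round((old_pct - needed_pct) * total_cpu_ghz, 2))  # 7.68 GHz "wasted"
print(round((old_pct - needed_pct) * total_mem_gb, 2))   # 33.6 GB "wasted"
```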
How to Define Your Percentage?
As explained earlier it will fully depend on the N+X model that has been chosen. Based on this model, we recommend selecting a percentage that equals the amount of resources a single host represents. So, in the case of an 8 host cluster and N+2 resiliency, the percentage should be set as follows:
2 / 8 * 100 = 25%
Basic design principle: In order to avoid wasting resources we recommend carefully selecting your N+X resiliency architecture. Calculate the required percentage based on this architecture.
⇚==================================================================================================⇛
- VMware Networking Concepts
- Troubleshooting
- Performance Troubleshooting
- Firmware Upgrade
- Configuration
Coming soon....
Please email us at nitinmhalim@virtualconsultancyservices.com if you need more details (design/planning, configuration/implementation, troubleshooting, maintenance) about VMware products in an emergency. We will provide the best solution and point you in the right direction.