Understanding High Availability on the NSX Edge Services Gateway

Hello Fellow NSX Operators!

Before I jump into the HA commands, let me preface with a few words about NSX Edge Services Gateway High Availability (simply HA going forward). You will need to understand the heartbeat path and which infrastructure-impacting health events are common in your environment. You may find yourself troubleshooting HA many times because of a change or degradation in the underlying hosts, storage or network. Be careful with those red herrings. When HA is implemented with a solid understanding of the underlying infrastructure and its variations, you can enjoy the peace of mind of knowing the edge network services are highly available.

This article covers the following HA topics:
– Implementation considerations
– Troubleshooting commands
– Proactively monitoring HA via syslog

http://www.vmware.com/files/pdf/products/nsx/vmw-nsx-network-virtualization-design-guide.pdf

Edge HA topology graphic from the NSX Network Virtualization design guide


A few HA facts/points/considerations/recommendations…

HA Topology
– It uses an Active/Standby topology.
– When HA is enabled, a second VM is deployed. The new VM will only be networked to communicate with the primary.
– When HA is disabled, the second VM is destroyed.
– HA appliances will be deployed based on the user-defined mappings (at this time these settings are not dynamic).
– Edge mappings are most easily managed using /api/4.0/edges/<edgeId>/appliances with the REST API.
– Changing appliance settings will trigger an OVF re-deployment of the edge.

HA IP Configuration
– Optional. If not configured, NSX will assign a valid /30 IP pair using an RFC 3927 (link-local) network.
– If configured manually, valid subnets are system enforced. The pair 10.0.0.0/30 and 10.0.0.1/30 is not valid, because 10.0.0.0 is the network address of that /30. The pair 10.0.0.1/30 and 10.0.0.2/30 is valid.
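To make the /30 rule concrete, here is a minimal sketch using Python's standard ipaddress module. It assumes the rule is simply "both addresses must be distinct usable hosts of the same /30". The helper name valid_ha_pair is mine, and the real NSX validation may check more than this.

```python
import ipaddress

def valid_ha_pair(a: str, b: str) -> bool:
    """Return True if a and b are the two usable host addresses of one /30."""
    ia, ib = ipaddress.ip_interface(a), ipaddress.ip_interface(b)
    # Both addresses must sit in the same network, and it must be a /30.
    if ia.network != ib.network or ia.network.prefixlen != 30:
        return False
    # hosts() excludes the network and broadcast addresses.
    usable = set(ia.network.hosts())
    return ia.ip != ib.ip and {ia.ip, ib.ip} <= usable

print(valid_ha_pair("10.0.0.0/30", "10.0.0.1/30"))  # False
print(valid_ha_pair("10.0.0.1/30", "10.0.0.2/30"))  # True
```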

HA vNic Selection
– Optional. It can be left set to ANY.
– A minimum of one edge interface is required before enabling HA.
– The recommendation for maximum availability is to dedicate a network to the HA heartbeat vNIC.
– Sharing a vNIC will work without problems as long as the network is available and not overloaded.

HA Timeouts and Heartbeating
– The default deadtime is 6 seconds.
– The current recommended deadtime is 15 seconds (uses a 3 second polling frequency). The tradeoff is a longer service failover time in exchange for increased resiliency to lost heartbeats.
– Heartbeats are sent using UDP port 694 (the IANA registered port for heartbeats).

HA Appliance Anti-affinity
– Host anti-affinity is handled by the system. When HA is enabled, a cluster DRS rule is added automatically with the name anti-affinity-rule-edge-#, where edge-# is the edge ID.
– Storage anti-affinity is not handled by default. For maximum availability of the edge pair, configure the edge appliances to deploy to different physical storage resources. This is especially important in infrastructure that uses centralized storage.

 


Troubleshooting ESG HA with CLI-based Edge Commands

show service highavailability example output

 nsxe-0> show service highavailability
 Highavailability Status: running
 Highavailability Unit Name: nsxe-0
 Highavailability Unit State: active
 Highavailability Interface(s): vNic_5
 Unit Poll Policy:
    Frequency: 3 seconds
    Deadtime: 15 seconds
 Stateful Sync-up Time: 10 seconds
 Highavailability Healthcheck Status:
    Peer host [vse-1 ]: good
    This host [vse-0 ]: good
 Highavailability Stateful Logical Status:
 File-Sync running
 Connection-Sync running
 xmit        xerr  rcv       rerr
 51219548828 0     42990848  0

show service highavailability connection-sync example output

nsxe-0> show service highavailability connection-sync
connections local:
current active connections: 12693
connections created:            368613263  failed: 0
connections updated:           21695297    failed: 0
connections destroyed:        368600570  failed: 0

connections peer:
current active connections: 0
connections created:          26571 failed: 0
connections updated:         1024 failed: 0
connections destroyed:        26571 failed: 0

traffic processed:
1248602045934 Bytes 6285222215 Pckts

UDP traffic (active device=vNic_5):
51255382200 Bytes sent 43018912 Bytes recv
590146284 Pckts sent 2518471 Pckts recv
0 Error send 0 Error recv

message tracking:
0 Malformed msgs 5863 Lost msgs

show service highavailability link example output

vse-0> show service highavailability link
Local IP Address: 192.18.0.1/30
Peer IP Address: 192.18.0.2/30

debug packet display / “sniffing” HA heartbeats

Filter on the High Availability vNIC reported by the root command “show service highavailability”.

nsxe-0> debug packet display interface vNic_# port_694
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vNic_5, link-type EN10MB (Ethernet), capture size 65535 bytes
17:22:50.357722 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
17:22:52.709253 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:53.360327 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 190
17:22:55.711667 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 203
17:22:55.711715 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:55.742631 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 203
17:22:56.353520 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
17:22:58.716886 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:59.357186 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
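If you save a capture like the one above to a file, a small script can show how far apart each peer's heartbeats really are. This is only a sketch: the function name heartbeat_gaps is mine, it assumes the tcpdump-style line format shown above, and it does not handle a capture that crosses midnight.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines such as:
# 17:22:50.357722 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d+) IP (\d+\.\d+\.\d+\.\d+)\.\d+ > \S+\.694: UDP"
)

def heartbeat_gaps(lines):
    """Map each sender IP to the gaps (in seconds) between its packets."""
    last, gaps = {}, defaultdict(list)
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # skip tcpdump banner lines and anything else
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        src = m.group(2)
        if src in last:
            gaps[src].append((ts - last[src]).total_seconds())
        last[src] = ts
    return dict(gaps)

SAMPLE = """\
17:22:50.357722 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
17:22:52.709253 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:53.360327 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 190
17:22:55.711667 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 203
17:22:55.711715 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:55.742631 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 203
17:22:56.353520 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
17:22:58.716886 IP 192.18.0.1.32165 > 192.18.0.2.694: UDP, length 189
17:22:59.357186 IP 192.18.0.2.24758 > 192.18.0.1.694: UDP, length 189
"""

for sender, sender_gaps in heartbeat_gaps(SAMPLE.splitlines()).items():
    print(sender, "max gap: %.3f s" % max(sender_gaps))
```

A maximum gap that creeps toward the configured deadtime is worth investigating before it turns into a failover.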

Viewing Historical HA System Events for an Edge in the Web Client

 

  • Open the vCenter Web Client
  • Open Networking & Security
  • In NSX Edges, double-click the Edge
  • Select the Monitor tab
  • Select System Events
  • On the search widget, click the arrow, click Select Columns…
  • Deselect All > Check Module > Type HighAvailability > Click OK

REST API-based Commands

Query HA Configuration Details on an Edge

GET https://nsxm-ip/api/4.0/edges/edge-#/highavailability/config Example Output

<?xml version="1.0" encoding="UTF-8"?>
<highAvailability>
 <version>6</version>
 <enabled>true</enabled>
 <vnic>2</vnic>
 <ipAddresses>
    <ipAddress>198.18.0.1/30</ipAddress>
    <ipAddress>198.18.0.2/30</ipAddress>
 </ipAddresses>
 <declareDeadTime>15</declareDeadTime>
 <logging>
    <enable>true</enable>
    <logLevel>error</logLevel>
 </logging>
 <security>
 <enabled>false</enabled>
 </security>
</highAvailability>
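Once you have that XML in hand, pulling out the fields you care about takes only a few lines. The sketch below parses a response body with Python's standard library. The function name summarize_ha_config and the keys of the returned dict are my own, and it assumes the response has the same shape as the example output above.

```python
import xml.etree.ElementTree as ET

def summarize_ha_config(xml_text: str) -> dict:
    """Extract the main HA settings from a highAvailability config document."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        # Direct child only; the <enabled> under <security> is not matched.
        "enabled": root.findtext("enabled") == "true",
        "vnic": root.findtext("vnic"),
        "ip_addresses": [e.text for e in root.findall("ipAddresses/ipAddress")],
        "dead_time_s": int(root.findtext("declareDeadTime")),
    }

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<highAvailability>
 <version>6</version>
 <enabled>true</enabled>
 <vnic>2</vnic>
 <ipAddresses>
    <ipAddress>198.18.0.1/30</ipAddress>
    <ipAddress>198.18.0.2/30</ipAddress>
 </ipAddresses>
 <declareDeadTime>15</declareDeadTime>
 <security>
 <enabled>false</enabled>
 </security>
</highAvailability>"""

print(summarize_ha_config(SAMPLE))
```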

Delete Edge HA Configuration on an Edge

DELETE https://nsxm-ip/api/4.0/edges/edge-#/highavailability/config

 

Monitoring High Availability Health Proactively

– Open vCenter Log Insight, Splunk or your log aggregation solution of choice.
– Build a view of all edge logging (use regex or glob-based matches to filter according to your naming convention).

Heartbeat Drops
– Examine matches on the text “lost packet”. Build an alerting rule based on your results.
– When the infrastructure is healthy, there should not be any HA packets lost.

Example match

Sep 19 11:34:14 nsxe-0 ha[]: [default]: [1371]: WARN: 1 lost packet(s) for [nsxe-0] [37:39]
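If your log tool accepts regular expressions, a pattern along these lines can pull the node name and loss count out of each match. This is a sketch that assumes the message format of the example above. The function name lost_packets is mine.

```python
import re

# Captures the count and the node name from messages such as
# "WARN: 1 lost packet(s) for [nsxe-0] [37:39]"
LOST = re.compile(r"WARN: (\d+) lost packet\(s\) for \[([^\]]+)\]")

def lost_packets(lines):
    """Yield (node, count) for every 'lost packet' warning found."""
    for line in lines:
        m = LOST.search(line)
        if m:
            yield m.group(2), int(m.group(1))

sample = "Sep 19 11:34:14 nsxe-0 ha[]: [default]: [1371]: WARN: 1 lost packet(s) for [nsxe-0] [37:39]"
print(list(lost_packets([sample])))  # [('nsxe-0', 1)]
```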

Late Heartbeats

– Examine matches on the text “Late heartbeat”. Build an alerting rule based on your results.
– Late heartbeats may indicate infrastructure problems, possibly resource constraints on one or both edges in the HA pair.
– This can also result in a split-brain state.

Example match

Jul  3 09:46:48 nsxe-0 heartbeat: [1454]: WARN: Late heartbeat: Node nsxe-1: interval 24921 ms
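The interval in a late-heartbeat message is worth comparing against your deadtime. Here is a minimal sketch that assumes the message format above and a 15 second deadtime. The function name late_heartbeats and the deadtime_s parameter are mine.

```python
import re

# \s+ after "Node" tolerates the message being wrapped across two lines.
LATE = re.compile(r"Late heartbeat: Node\s+(\S+): interval (\d+) ms")

def late_heartbeats(text, deadtime_s=15):
    """Return (node, interval_ms, exceeds_deadtime) for each late heartbeat."""
    return [
        (node, int(ms), int(ms) > deadtime_s * 1000)
        for node, ms in LATE.findall(text)
    ]

sample = ("Jul  3 09:46:48 nsxe-0 heartbeat: [1454]: WARN: "
          "Late heartbeat: Node nsxe-1: interval 24921 ms")
print(late_heartbeats(sample))  # [('nsxe-1', 24921, True)]
```

In the example, the 24921 ms interval is well past a 15 second deadtime, which is exactly the kind of match that deserves an alert.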

Lost and late heartbeats are the early indicators.  Early indicators are your best friends.  Keep a close eye out for these.

Monitor NSX Manager for Switchover Events

– Filter logging based on the NSX Manager SystemEvent. You can use the text [SystemEvent] as the filter.
– Examine matches for event codes 30202 and 30203 (Edge switching to ACTIVE and STANDBY, respectively).
– Any single event source with more than one or two events should raise a red flag. Any unplanned switchover events should be researched. Build an alerting rule based on your findings.

Example match

Sep 20 20:50:05 nsxm-0 [SystemEvent] Time:'Sat Sep 20 20:49:13.000 GMT 2014', Severity:'High', Event Source:'vm-13950', Code:'30203', Event Message:'vShield Edge HighAvailability switch over happened. VM has moved to STANDBY state.', Module:'vShield Edge HighAvailability'
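To spot an event source that keeps flapping, you can count switchover events per source. The sketch below assumes the [SystemEvent] line format shown above. The function name switchover_counts is mine.

```python
import re
from collections import Counter

EVENT = re.compile(r"\[SystemEvent\].*?Event Source:'([^']+)', Code:'(\d+)'")
# Event codes from the text above: 30202 = moved to ACTIVE, 30203 = moved to STANDBY.
SWITCHOVER = {"30202": "ACTIVE", "30203": "STANDBY"}

def switchover_counts(lines):
    """Count HA switchover events per event source."""
    counts = Counter()
    for line in lines:
        m = EVENT.search(line)
        if m and m.group(2) in SWITCHOVER:
            counts[m.group(1)] += 1
    return counts

sample = ("Sep 20 20:50:05 nsxm-0 [SystemEvent] Time:'Sat Sep 20 20:49:13.000 GMT 2014', "
          "Severity:'High', Event Source:'vm-13950', Code:'30203', "
          "Event Message:'vShield Edge HighAvailability switch over happened. "
          "VM has moved to STANDBY state.', Module:'vShield Edge HighAvailability'")
print(switchover_counts([sample]))  # Counter({'vm-13950': 1})
```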

Split-Brain Indicators 

– Look for the text “returning after partition” and the text “Deadtime value may be too small”.
– Matches on either most likely indicate that HA has entered the split-brain state. Network services will be mostly unavailable until the condition is resolved.
– Hopefully these do not exist in your environment. Build a preventive alerting rule. Matches are immediately actionable.
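If you want a quick check outside your log tool, a plain substring scan over exported edge logs is enough. This is a sketch only: the function name split_brain_hits is mine, and the two input lines are placeholders rather than real log output.

```python
# The two marker strings come from the list above.
SPLIT_BRAIN_MARKERS = (
    "returning after partition",
    "Deadtime value may be too small",
)

def split_brain_hits(lines):
    """Return every line that contains one of the split-brain marker strings."""
    return [
        line for line in lines
        if any(marker in line for marker in SPLIT_BRAIN_MARKERS)
    ]

# Placeholder input, not real log lines.
lines = [
    "<timestamp> <host> ... returning after partition ...",
    "<timestamp> <host> ... some unrelated message ...",
]
print(len(split_brain_hits(lines)))  # 1
```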

That is all folks. Hope this helps.


Snapshot Quiesce

◎ Configuring quiesced snapshots

When creating a snapshot from the VM console, you are presented with the option to “Quiesce” the guest file system.

That’s it for the configuration: a single checkbox.

Easy as pie right?  But..

~

◎ Hold up; What does Quiesce actually mean (I have no clue)?

At this point I have to fess up – being a true ESL individual – I’ve never heard the word “Quiesce” before, so the option means nothing to me.  Had to do a little bit of digging.   Here’s what I found:

thefreedictionary.com/quiesce

Qui`esce´

v. i. 1. To be silent, as a letter; to have no sound.

[imp. & p. p. Quiesced ; p. pr. & vb. n. Quiescing .]

dictionary.reference.com/browse/quiesce   quiesce definition – networking 

 

 To render quiescent, i.e. temporarily inactive or disabled. For example to quiesce a device (such as a digital modem). It is also a system command in MAX TNT software which is used to “Temporarily disable a modem or DS0 channel”. 

OK – so at this point I get where this is headed.  I’m satisfied with my understanding of the word 😀

I checked the vSphere help file.  Lo and behold! I’m greeted by a perfectly clear explanation:

“Select the Quiesce guest file system (Needs VMware Tools installed) check box to pause running processes on the guest operating system so that file system contents are in a known consistent state when the snapshot is taken. This applies only to virtual machines that are powered on.”

But OCD kicked in.  I had to know…

◎ What else is there to know about Quiescing??  (Tell me more)

“Quiesce: If the <quiesce> flag is 1 or true, and the virtual machine is powered on when the snapshot is taken, VMware Tools is used to quiesce the file system in the virtual machine. Quiescing a file system is a process of bringing the on-disk data of a physical or virtual computer into a state suitable for backups. This process might include such operations as flushing dirty buffers from the operating systems in-memory cache to disk, or other higher-level application-specific tasks. 

Note: Quiescing indicates pausing or altering the state of running processes on a computer, particularly those that might modify information stored on disk during a backup, to guarantee a consistent and usable backup.”

Note: Depending on the guest operating system, the quiescing operation can be done by the sync driver, the vmsync module, or Microsoft’s Volume Shadow Copy (VSS) service

( Tell me more )

VMware products require file systems within a guest operating system to be quiesced prior to a snapshot operation for the purposes of backup and data integrity. VMware products which use quiesced snapshots include, but are not limited to, VMware Consolidated Backup and VMware Data Recovery. 

Virtual machines generating heavy I/O workloads may encounter issues when quiescing prior to a snapshot operation. These issues may be related to the component that does the quiescing or the custom quiescing scripts as described in the Virtual Machine Backup Guide.

Services which have been known to generate heavy I/O workload include, but are not limited to, Exchange, Active Directory, LDAP, and MS-SQL.

The quiescing operation is done by an optional VMware Tools component called the SYNC driver.

As of ESX 3.5 Update 2, quiescing is also done by Microsoft’s Volume Shadow Copy Service (VSS). VSS is provided by Microsoft in their operating systems as of Windows Server 2003 and Windows XP.
Operating systems which do not have the Volume Shadow Copy Service make use of the SYNC driver for quiescing operations.

~

◎ Troubleshooting Quiescing with the SYNC driver

A guest operating system may appear to be unresponsive when there is a conflict between the SYNC driver and services generating heavy I/O. If installed, the SYNC driver holds incoming I/O writes while it flushes all dirty data to disk, thus making file systems consistent. Under heavy loads, the delay in I/O can become too long, which affects many time-sensitive applications, including the services which generate the heavy I/O (such as an Exchange Server). If writes issued by these services get delayed for too long, the service may stop and issue error messages.

To avoid this issue, disable the SYNC driver or stop the service generating heavy I/O before taking a snapshot.

Note: The sync driver is only required for legacy versions of Windows such as Windows XP and Windows 2000 which do not include the Microsoft VSS service. Updated versions of VMware Tools will automatically uninstall the SYNC driver.

Disabling the VCB SYNC Driver (LGTO_Sync)

Disabling the SYNC driver allows you to keep the heavy I/O services on-line, but results in snapshots being only crash-consistent.

To disable the VCB SYNC driver:

  1. In Device Manager, click View > Show hidden devices.
  2. Expand Non-Plug and Play Drivers.
  3. Right-click Sync Driver and click Disable.
  4. Click Yes twice to disable the device and restart the computer. 

Stopping services generating heavy I/O

Use the following pre-freeze and post-thaw scripts to take the service generating heavy I/O offline for approximately 60 seconds and then restart it after the snapshot is taken. This approach leaves the service inactive but keeps the SYNC driver enabled while the snapshot is taken, ensuring application consistency. Using this method, you create a quiesced snapshot of the guest operating system.
 
This example shuts down Exchange Services prior to a quiescing operation:

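C:\Windows\pre-freeze-script.bat
@echo off
rem Stop the Exchange System Attendant; /yes answers the prompt about stopping dependent services.
net stop MSExchangeSA /yes


C:\Windows\post-thaw-script.bat
@echo off
rem Restart the Exchange services once the snapshot has been taken.
net start MSExchangeSA
net start MSExchangeIS
net start MSExchangeMTA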

 

And that’s all the time I have right now – I hope this has been informative.

~

 

Gabe@networkdojo.net

REFERENCES:

vSphere Help File
VMware KB 101518 – Understanding VM snapshots
VMware KB 5962168 – VM can freeze under load when you take quiesced snapshots