Understanding High Availability on the NSX Edge Services Gateway

Hello Fellow NSX Operators!

Before I jump into the HA commands, let me briefly preface with a few words about NSX Edge Services Gateway High Availability (simply HA going forward).  You will need to understand the heartbeat path and what type of infrastructure-impacting health events are common to your infrastructure.  You may find yourself troubleshooting High Availability many times because of a change or degradation in the underlying Hosts, Storage or Network.  Be careful with those red herrings.  When HA is implemented with a solid understanding of the underlying infrastructure and its variations, you can enjoy peace of mind in knowing the edge network services are highly available.

This article covers the following topics with regard to HA:
– Implementation considerations
– Troubleshooting commands
– Proactively monitoring HA via syslog


Edge HA topology graphic from the NSX Network Virtualization design guide

A few HA facts/points/considerations/recommendations…

HA Topology
– It uses an Active/Standby topology.
– When HA is enabled, a second VM is deployed.  The new VM will only be networked to communicate with the primary.
– When HA is disabled, the second VM is destroyed.
– HA appliances will be deployed based on the user-defined mappings (these settings are not dynamic).
– Edge mappings are most easily managed using /api/4.0/edges/<edgeId>/appliances with the REST API.
– Changes to appliance settings will trigger an OVF re-deployment of the edge.

HA IP Configuration
– Optional.  If not configured, NSX will assign a valid /30 IP pair using an RFC 3927 (link-local, 169.254.0.0/16) network.
– If configured manually, valid subnets are system enforced.
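
A quick way to sanity-check a manually chosen heartbeat pair is sketched below in Python. It assumes, based on the /30 note above, that the two addresses should be the two usable hosts of one /30 network. The function name and the sample addresses are illustrative, and NSX's own validation may apply further rules.

```python
import ipaddress

def is_valid_ha_pair(local_cidr: str, peer_cidr: str) -> bool:
    """Return True if both addresses are distinct usable hosts of the same /30."""
    local = ipaddress.ip_interface(local_cidr)
    peer = ipaddress.ip_interface(peer_cidr)
    # Both sides must use a /30 mask and land in the same network.
    if local.network.prefixlen != 30 or local.network != peer.network:
        return False
    usable = set(local.network.hosts())  # excludes network and broadcast addresses
    return local.ip != peer.ip and local.ip in usable and peer.ip in usable

print(is_valid_ha_pair("169.254.1.1/30", "169.254.1.2/30"))  # same /30, both usable
print(is_valid_ha_pair("169.254.1.1/30", "169.254.1.5/30"))  # different /30 networks
```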

HA vNic Selection
– Optional, it can be left to ANY.
– A minimum of one edge interface is required before enabling HA.
– The recommendation for maximum availability is to configure a network dedicated to the vNIC heartbeating.
– Sharing a vNIC will work without problems as long as the network is available and not overloaded.

HA Timeouts and Heartbeating
– The default deadtime is 6 seconds.
– The current recommended deadtime is 15 seconds (uses a 3 second polling frequency).  There is a tradeoff of service failover time for increased resiliency to lost heartbeats.
– Heartbeats are sent using UDP-694 (the IANA registered port for heartbeats)
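
To make the tradeoff concrete, here is a small Python sketch of the arithmetic. It only computes how many polling intervals fit inside the deadtime window; how the edge actually counts missed heartbeats is not stated here, so treat the helper (a hypothetical name) as a rough guide rather than a description of the implementation.

```python
def heartbeats_per_deadtime(deadtime_s: int, frequency_s: int) -> int:
    """Number of whole polling intervals that fit inside the deadtime window."""
    if frequency_s <= 0 or deadtime_s < frequency_s:
        raise ValueError("deadtime must be at least one polling interval")
    return deadtime_s // frequency_s

# Recommended settings from the notes above: 15 second deadtime, 3 second polling.
print(heartbeats_per_deadtime(15, 3))
```

With the recommended values, five consecutive polling intervals pass before the deadtime expires, which is the extra tolerance you pay for with a slower failover.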

HA Appliance Anti-affinity
– Host anti-affinity is handled by the system.  When HA is enabled, a cluster DRS rule is added automatically with the name anti-affinity-rule-edge-#, where # is the edge ID.
– Storage anti-affinity is not handled by default.  For maximum availability of the edge pair, configure the edge appliances to deploy to different physical storage resources.  This is especially important in infrastructure that uses centralized storage.


Troubleshooting ESG HA with CLI-based Edge Commands

show service highavailability example output

 nsxe-0> show service highavailability
 Highavailability Status: running
 Highavailability Unit Name: nsxe-0
 Highavailability Unit State: active
 Highavailability Interface(s): vNic_5
 Unit Poll Policy:
    Frequency: 3 seconds
    Deadtime: 15 seconds
 Stateful Sync-up Time: 10 seconds
 Highavailability Healthcheck Status:
    Peer host [vse-1 ]: good
    This host [vse-0 ]: good
 Highavailability Stateful Logical Status:
 File-Sync running
 Connection-Sync running
 xmit        xerr  rcv       rerr
 51219548828 0     42990848  0

show service highavailability connection-sync example output

nsxe-0> show service highavailability connection-sync
connections local:
current active connections: 12693
connections created:            368613263  failed: 0
connections updated:           21695297    failed: 0
connections destroyed:        368600570  failed: 0

connections peer:
current active connections: 0
connections created:          26571 failed: 0
connections updated:         1024 failed: 0
connections destroyed:        26571 failed: 0

traffic processed:
1248602045934 Bytes 6285222215 Pckts

UDP traffic (active device=vNic_5):
51255382200 Bytes sent 43018912 Bytes recv
590146284 Pckts sent 2518471 Pckts recv
0 Error send 0 Error recv

message tracking:
0 Malformed msgs 5863 Lost msgs

show service highavailability link example output

vse-0> show service highavailability link
Local IP Address:
Peer IP Address:

debug packet display / “sniffing” HA heartbeats

Filter using the High Availability vNIC from the root command “show service highavailability”

nsxe-0> debug packet display interface vNic_# port_694
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vNic_5, link-type EN10MB (Ethernet), capture size 65535 bytes
17:22:50.357722 IP > UDP, length 189
17:22:52.709253 IP > UDP, length 189
17:22:53.360327 IP > UDP, length 190
17:22:55.711667 IP > UDP, length 203
17:22:55.711715 IP > UDP, length 189
17:22:55.742631 IP > UDP, length 203
17:22:56.353520 IP > UDP, length 189
17:22:58.716886 IP > UDP, length 189
17:22:59.357186 IP > UDP, length 189

Viewing Historical HA System Events for an Edge in the Web Client


  • Open the vCenter Web Client
  • Open Networking & Security
  • In NSX Edges, double-click the Edge
  • Select the Monitor tab
  • Select System Events
  • On the search widget, click the arrow, click Select Columns…
  • Deselect All > Check Module > Type HighAvailability > Click Ok

REST API-based Commands

Query HA Configuration Details on an Edge

GET https://nsxm-ip/api/4.0/edges/edge-#/highavailability/config Example Output

<?xml version="1.0" encoding="UTF-8"?>

Delete Edge HA Configuration on an Edge

DELETE https://nsxm-ip/api/4.0/edges/edge-#/highavailability/config
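
If you script these calls, a minimal Python sketch using only the standard library could look like the following. It builds the request without sending it. The manager hostname, edge ID and credentials are placeholders, and HTTP Basic authentication is an assumption about how your NSX Manager is set up.

```python
import base64
import urllib.request

def build_ha_config_request(manager: str, edge_id: str, user: str, password: str,
                            method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request for the Edge HA config endpoint."""
    url = f"https://{manager}/api/4.0/edges/{edge_id}/highavailability/config"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Basic {token}")  # assumed auth scheme
    return req

req = build_ha_config_request("nsxm-ip", "edge-1", "admin", "secret")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), with a TLS context suited to your environment.
```

Passing method="DELETE" produces the request that removes the HA configuration, so double-check the edge ID before sending that one.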


Monitoring High Availability Health Proactively

– Open vCenter Log Insight, Splunk or your log aggregation solution of choice.
– Build a view of all edge logging (use regex or glob-based matches to filter according to your naming convention).

Heartbeat Drops
– Examine matches on the text “lost packet”. Build an alerting rule based on your results.
– When the infrastructure is healthy, there should not be any HA packets lost.

Example match

Sep 19 11:34:14 nsxe-0 ha[]: [default]: [1371]: WARN: 1 lost packet(s) for [nsxe-0] [37:39]
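
If your log tool lets you run custom parsing, the match above can be pulled apart with a regular expression. This is a Python sketch built only from the one sample line shown here; other NSX versions may word the message differently, and the function name is made up for illustration.

```python
import re

# Pattern derived from the single sample line above (an assumption about the format).
LOST_RE = re.compile(r"WARN: (\d+) lost packet\(s\) for \[([^\]]+)\]")

def parse_lost_packets(line: str):
    """Return (node, lost_count) for a 'lost packet' line, else None."""
    m = LOST_RE.search(line)
    return (m.group(2), int(m.group(1))) if m else None

sample = ("Sep 19 11:34:14 nsxe-0 ha[]: [default]: [1371]: "
          "WARN: 1 lost packet(s) for [nsxe-0] [37:39]")
print(parse_lost_packets(sample))
```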

Late Heartbeats

– Examine matches on the text “Late heartbeat”. Build an alerting rule based on your results.
– Late heartbeats may indicate infrastructure problems, such as resource constraints on one or both edges in the HA pair.
– This can also result in a split brain state.

Example match

Jul  3 09:46:48 nsxe-0 heartbeat: [1454]: WARN: Late heartbeat: Node nsxe-1: interval 24921 ms
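
The interval in a late heartbeat message is worth extracting, because it tells you how close you came to the deadtime. Below is a Python sketch based on the sample above. Comparing the interval against the deadtime is my own heuristic for ranking alerts, not an NSX rule, and the 15 second default is simply the recommended deadtime from earlier.

```python
import re

# \s+ tolerates the message being wrapped between "Node" and the node name.
LATE_RE = re.compile(r"Late heartbeat: Node\s+(\S+): interval (\d+) ms")

def late_heartbeat(line: str, deadtime_s: int = 15):
    """Return (node, interval_ms, exceeds_deadtime) or None if the line does not match."""
    m = LATE_RE.search(line)
    if not m:
        return None
    interval_ms = int(m.group(2))
    return m.group(1), interval_ms, interval_ms > deadtime_s * 1000

sample = ("Jul  3 09:46:48 nsxe-0 heartbeat: [1454]: WARN: Late heartbeat: Node\n"
          "nsxe-1: interval 24921 ms")
print(late_heartbeat(sample))
```

In the sample, the 24921 ms interval is well beyond a 15 second deadtime, which is the kind of event that can precede a split brain.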

Lost and late heartbeats are the early indicators.  Early indicators are your best friends.  Keep a close eye out for these.

Monitor NSX Manager for Switchover Events

– Filter logging based on the NSX Manager SystemEvent; you can use the text [SystemEvent] to filter.
– Examine matches for Event 30202 and 30203 (Edge switching to ACTIVE & STANDBY, respectively)
– Any single event source with more than one or two events should raise a red flag. Any unplanned switchover events should be researched. Build an alerting rule based on your findings.

Example match

Sep 20 20:50:05 nsxm-0 [SystemEvent] Time:'Sat Sep 20 20:49:13.000 GMT 2014', Severity:'High', Event Source:'vm-13950', Code:'30203', Event Message:'vShield Edge HighAvailability switch over happened. VM has moved to STANDBY state.', Module:'vShield Edge HighAvailability'
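
To spot an event source with repeated switchovers, you can count events per source. The Python sketch below assumes the field layout of the sample line above (Event Source followed directly by Code); the helper name is hypothetical and the pattern may need adjusting for other NSX Manager versions.

```python
import re
from collections import Counter

EVENT_RE = re.compile(r"\[SystemEvent\].*?Event Source:'([^']+)', Code:'(\d+)'")
SWITCHOVER_CODES = {"30202", "30203"}  # ACTIVE and STANDBY, per the notes above

def switchover_counts(lines):
    """Count switchover events per event source (the edge VM id)."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m and m.group(2) in SWITCHOVER_CODES:
            counts[m.group(1)] += 1
    return counts

sample = ("Sep 20 20:50:05 nsxm-0 [SystemEvent] Time:'Sat Sep 20 20:49:13.000 GMT 2014', "
          "Severity:'High', Event Source:'vm-13950', Code:'30203', "
          "Event Message:'vShield Edge HighAvailability switch over happened. "
          "VM has moved to STANDBY state.', Module:'vShield Edge HighAvailability'")
print(switchover_counts([sample]))
```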

Split-Brain Indicators 

– Look for the text “returning after partition”; Look for the text “Deadtime value may be too small”
– Matches on these indicate that HA has most likely entered the split brain state.  Network services will be mostly unavailable until the condition is resolved.
– Hopefully these do not exist in your environment. Build a preventive alerting rule. Matches are immediately actionable.

That is all folks. Hope this helps.

NSX SSL VPN-Plus | Adding Client Configurations in Bulk

Anyone using the NSX SSL VPN-Plus feature for more than one site will quickly find there is no mechanism for importing client configurations.  The native method for accessing additional sites is to browse to the Gateway for each site (then download and run the installer).

That’s pretty tedious as your site count increases.  There is a better, albeit unsupported, way to manage this need.

SSL VPN-Plus naclient on Windows

In Windows, client configuration is stored in the registry.  You can manipulate the Windows registry using .reg files.

Open up a text editor, and prepare a file with all of your sites using the following format.  Replace the GatewayList value with your site’s gateway IP address.

 Windows Registry Editor Version 5.00

 [HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\SSL VPN-Plus Client\Connection #1]

 [HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\SSL VPN-Plus Client\Connection #2]

 [HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\SSL VPN-Plus Client\Connection #20]

Save the file as a .reg file; the name of the file is arbitrary.

Exit the SSL VPN-Plus naclient application

Import the .reg file

Navigate to HKLM\SOFTWARE\VMware, Inc.\SSL VPN-Plus Client and verify the connections were imported.

Update the ConnectionCount to the total number of sites.  This is important; if the number doesn’t match, naclient will not start.

Start the naclient (C:\Program Files\VMware\SSL VPN-Plus Client\SVPclient.exe)

SSL VPN-Plus naclient on Mac OS X

This one is easier: the client settings are stored in /opt/sslvpn-plus/naclient/naclient.conf

Quit the naclient application.  Add the site configurations to naclient.conf

vi /opt/sslvpn-plus/naclient/naclient.conf
 site1 site1-ip:443 256 
 site2 site2-ip:443 256 
 site20 site20-ip:443 256

Start the naclient.
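
With many sites, the lines can be generated instead of typed. Here is a Python sketch that renders entries in the same shape as the example above. The trailing 256 is copied as-is from that example (I have not verified what it controls), and the function name is made up for illustration.

```python
def naclient_conf_lines(sites):
    """Render one 'name gateway:443 256' line per (name, gateway) pair."""
    return [f"{name} {gateway}:443 256" for name, gateway in sites]

sites = [("site1", "site1-ip"), ("site2", "site2-ip")]
print("\n".join(naclient_conf_lines(sites)))
```

Paste the output into naclient.conf while the client is closed, as described above.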

That is all peeps.  Have a nice day.



Here are my notes for the Networking section of the blueprint, after tons of reading and lab time.  Again – I am heavily relying on the VCAP5-DCA Official Cert Guide (OCG), and the vSphere 5.5 Documentation Center.


Objective 2.1 Implement and Manage Virtual Standard Switch (VSS) Networks

 Create and Manage VSS Components – OCG page 48


# Managing the VSS in the GUI – Options mapped out

VC > Host > Configuration > Networking > vSphere Standard Switch

ALL Available Options:

Networking - Refresh _____________________ Refreshes the Networking View
Networking - Add Networking… ____________ Opens the Add Network Wizard, options below
 - Virtual Machine Portgroup ___________ Choose/create vSwitch, label, vlan-id,
 - VMkernel - choose/create vSwitch ____ Label, vlan-id, mark for vmotion/ft/mgmt, 
                                         ip/ipv6/both, IP assignment
Networking - Properties…_____________________ Checkbox, Enables IPv6 Support on the host system
 vSwitch Port/Portgroup bubbles ___________ Displays the properties: General, Security,
                                         Traffic-Shaping, Failover-LB/NIC
 vSwitch - Remove… _______________ Deletes the vswitch
 vSwitch - Properties…
 * Network Adapters tab
 - Add… _________________ Add an unused physical network adapter (vmnic) to the vswitch.
 - Edit… ________________ Set the NIC speed/duplex settings.
 - Remove…_______________ Unassigns the vmnic from the vswitch
 * Ports Tab
 - Add…____________ Opens the Add Network Wizard [minus the vSwitch selection, same options]
 - Remove…_________ Delete the selected port/portgroup
 - Edit vSwitch… Opens the properties for the selected port/portgroup [4 tabs listed below]
    - General
        Number of Ports ___ Drop-down options: 24, 56, 120, 248, 504, 1016, 2040, 4088
        MTU _______________ 1500 - 9000
    - Security
        Promiscuous Mode ____ Accept - VM adapter receives all traffic on the wire. 
                              Reject - default operation
        MAC Addr Changes ____ Reject disables rx-vm traffic on init/effective MAC mismatch.
                              Sw iSCSI initiator requires accept.
        Forged Transmits ____ Reject - Host drops tx traffic on init/effective MAC mismatch.
                              Accept - host says I accept whatevs
    - Traffic Shaping
        Status ______________ Enabled = Applied to each virtual network adapter 
        Avg Bandwidth _______ Bps allowed across a port, averaged over time.
        Peak Bandwidth ______ In Kbits/sec; Allowed range is 1 to 9223372036854775 Kbits 
                              That is ~ 1Million Terabytes
        Burst Bandwidth _____ Burst bonus gained when not all allocated bandwidth is used
    - NIC Teaming
        Load Balancing ___________ Dropdown: Originating Virtual Port ID / IP Hash /
                                   Source MAC hash / explicit failover order
        Network Failover Detect __ Dropdown: Link status only / Beacon probing
        Notify Switches __________ Yes / No
        Failback _________________ Yes / No
        Failover Order ___________ NIC Failover Function: Active/Standby/Unused Adapters

 - Edit Portgroup/VMKnet… Configurations here override the vSwitch-level configurations.
    - General
        Network Label _______________ Network Name
        Vlan ID _____________________ Specify the VLAN
        VMkernel Int-only settings __ Checkboxes for vMotion, Fault Tolerance Logging,
                                      Management/iSCSI Port Binding/MTU
    - Security
        Promiscuous Mode ____ Accept - VM adapter receives all traffic on the wire. 
                              Reject - default operation
        MAC Addr Changes ____ Reject disables rx-vm traffic on init/effective MAC mismatch.
                              Sw iSCSI initiator requires accept.
        Forged Transmits ____ Reject - Host drops tx traffic on init/effective MAC mismatch.
                              Accept - host says I accept whatevs
    - Traffic Shaping
        Status __________ Enabled - Applied to each virtual network adapter / Disabled
        Avg Bandwidth ___ Bps allowed across a port, averaged over time.
        Peak Bandwidth __ In Kbits/sec; Allowed range is 1 to 9223372036854775 Kbits
                          that is, ~ 1Million Terabytes
        Burst Bandwidth _ Burst bonus gained when not all allocated bandwidth is used
    - NIC Teaming
        Load Balancing ___________ Dropdown: Originating Virtual Port ID / IP Hash / 
                                   Source MAC hash / explicit failover order
        Network Failover Detect __ Dropdown: Link status only / Beacon probing
        Notify Switches __________ Yes / No
        Failback _________________ Yes / No
        Failover Order ___________ Active/Standby/Unused Adapters ; Select vmnic, Move Up / Move Down
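
As a side note on the Peak Bandwidth range above, the "~1 Million Terabytes" figure can be checked with a few lines of Python. The observation that the bound matches a signed 64-bit maximum divided by 1000 is just arithmetic on my part, not something the documentation states.

```python
MAX_PEAK_KBITS = 9223372036854775  # upper bound shown in the client

# The bound equals the largest signed 64-bit integer divided by 1000 (rounded down).
assert MAX_PEAK_KBITS == (2**63 - 1) // 1000

bits_per_sec = MAX_PEAK_KBITS * 1000           # Kbits/sec -> bits/sec (decimal kilo)
terabytes_per_sec = bits_per_sec / 8 / 10**12  # bits -> bytes -> terabytes
print(f"{terabytes_per_sec:,.0f} TB per second")
```

That works out to a little over 1.15 million terabytes per second, so the limit is purely theoretical.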

 Create and Manage Vmkernel Ports on Standard Switches

# Configuration/Management in the GUI (details in first section)
VC > Host > Configuration > Networking

# Managing Vmkernel ports in the CLI (commands with sample output)
# Query the tags on a vmknic

 ~# esxcli network ip interface tag get -i vmk4
 Tags: Management, VMotion, faultToleranceLogging

# Query the ipv4 summarized information for all vmkernel interfaces

~ # esxcli network ip interface ipv4 get
Name IPv4 Address IPv4 Netmask IPv4 Broadcast Address Type DHCP DNS
---- ------------ ------------- -------------- ------------ --------
vmk0 STATIC false
vmk1 STATIC false
vmk2 STATIC false

# Add a vmkernel interface to a vswitch’s port group

~ esxcli network ip interface add --portgroup-name

# Set the ipv4 information on an existing vmkernel interface

~ # esxcli network ip interface ipv4 set -i vmk4 -I -N -P false
~ # esxcli network ip interface ipv4 get
Name IPv4 Address IPv4 Netmask IPv4 Broadcast Address Type DHCP DNS
---- ------------ ------------- -------------- ------------ --------
vmk4 STATIC false

# Edit the enabled status & MTU of an existing vmkernel interface; e=enabled , i=interface-name , m=MTU

~ # esxcli network ip interface set -e [true|false] -i vmk# -m 1500

 Configure advanced vSS Settings – OCG Page 66

# Configuration/Management in the GUI (details in first section)
VC > Host > Configuration > Networking

# Managing vSwitches in the CLI (commands with sample output)
# Query all standard vswitch commands

~ # esxcli esxcli command list | grep vswitch.standard
 network.vswitch.standard add
 network.vswitch.standard list
 network.vswitch.standard remove
 network.vswitch.standard set
 network.vswitch.standard.policy.failover get
 network.vswitch.standard.policy.failover set
 network.vswitch.standard.policy.security get
 network.vswitch.standard.policy.security set
 network.vswitch.standard.policy.shaping get
 network.vswitch.standard.policy.shaping set
 network.vswitch.standard.portgroup add
 network.vswitch.standard.portgroup list
 network.vswitch.standard.portgroup remove
 network.vswitch.standard.portgroup set
 network.vswitch.standard.portgroup.policy.failover get
 network.vswitch.standard.portgroup.policy.failover set
 network.vswitch.standard.portgroup.policy.security get
 network.vswitch.standard.portgroup.policy.security set
 network.vswitch.standard.portgroup.policy.shaping get
 network.vswitch.standard.portgroup.policy.shaping set
 network.vswitch.standard.uplink add
 network.vswitch.standard.uplink remove

# Query global settings

~ # esxcli network vswitch standard list
Name: vSwitch0
Class: etherswitch
Num Ports: 1536
Used Ports: 11
Configured Ports: 128
MTU: 1500
CDP Status: listen
Beacon Enabled: false
Beacon Interval: 1
Beacon Threshold: 3
Beacon Required By:
Uplinks: vmnic0
Portgroups: vmk1-iscsi, VM Network, Management Network

# Query vswitch policy details

~ # esxcli network vswitch standard policy failover get -v vSwitch0
Load Balancing: srcport
Network Failure Detection: link
Notify Switches: true
Failback: true
Active Adapters: vmnic0
Standby Adapters:
Unused Adapters:

~ # esxcli network vswitch standard policy security get -v vSwitch0
Allow Promiscuous: false
Allow MAC Address Change: true
Allow Forged Transmits: true

~ # esxcli network vswitch standard policy shaping get -v vSwitch0
Enabled: false
Average Bandwidth: -1 Kbps
Peak Bandwidth: -1 Kbps
Burst Size: -1 Kib

# Query vswitch portgroups

~ # esxcli network vswitch standard portgroup list
Name Virtual Switch Active Clients VLAN ID
------------------ -------------- -------------- -------
Management Network vSwitch0 1 0
My VMK Interface vSwitch3 1 1234
Prod-201 vSwitch3 1 201
VM Network vSwitch0 4 0

# Query switch port group policy details [works with failover/security/shaping policies]

~ # esxcli network vswitch standard portgroup policy security get -p 'VM Network'
Allow Promiscuous: true
Allow MAC Address Change: true
Allow Forged Transmits: true
Override Vswitch Allow Promiscuous: true
Override Vswitch Allow MAC Address Change: false
Override Vswitch Allow Forged Transmits: false

# Add a standard vSwitch named uber-vswitch with 2000 ports (defaults to 128 configured ports, maximum 4096)

~ # esxcli network vswitch standard add -P 2000 -v uber-vswitch

# add two uplinks to uber-vswitch

~ # esxcli network vswitch standard uplink add -u vmnic0 -v uber-vswitch
~ # esxcli network vswitch standard uplink add -u vmnic1 -v uber-vswitch

# Set the MTU on uber-vswitch to 9000

~ # esxcli network vswitch standard set -m 9000 -v uber-vswitch

# Add a portgroup named uber-PG to uber-vswitch, configure the pg to tag with Vlan 100

~ # esxcli network vswitch standard portgroup add -p uber-PG -v uber-vswitch
~ # esxcli network vswitch standard portgroup set -p uber-PG -v 100

# Configure iphash policy with disabled switch notifications, and traffic shaping at ~100 Mbps average on the uber-PG port group

~ # esxcli network vswitch standard portgroup policy failover set -p uber-PG -l iphash -n false
~ # esxcli network vswitch standard portgroup policy shaping set -p uber-PG -e true -b 100000 -k 150000 -t 200000
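
The shaping values are easy to get wrong by a factor of 1000, since the policy output earlier reports bandwidth in Kbps and burst size in Kib. Here is a Python sketch that renders the shaping command from Mbps inputs. It uses the vswitch namespace as listed by esxcli esxcli command list; the helper name and the peak-versus-average check are my own additions.

```python
def shaping_command(portgroup: str, avg_mbps: int, peak_mbps: int, burst_kib: int) -> str:
    """Render the esxcli shaping command; esxcli takes Kbps, so Mbps is scaled by 1000."""
    if peak_mbps < avg_mbps:
        raise ValueError("peak bandwidth should not be below average bandwidth")
    return ("esxcli network vswitch standard portgroup policy shaping set "
            f"-p {portgroup} -e true -b {avg_mbps * 1000} -k {peak_mbps * 1000} -t {burst_kib}")

print(shaping_command("uber-PG", 100, 150, 200000))
```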

# About vSwitch NIC Teaming LB Options

explicit ____ Always use the highest order uplink from the list of active adapters which pass failover criteria.
iphash ______ Route based on hashing the source and destination IP addresses.
mac _________ Route based on the MAC address of the packet source.
portid ______ Route based on the originating virtual port ID.



Objective 2.2 Implement and Manage Virtual Distributed Switch (VDS) Networks

Determine use cases for and applying VMware DirectPath I/O – OCG Page 61


DirectPath I/O “Passthrough”

Use case: Supporting extremely heavy network activity within a VM, when no other methods are sufficient.

 Migrate a vSS Network to a Hybrid or Full vDS Solution – OCG Page 62

#1 Create vDS, don’t migrate hosts or adapters
VC > Networking > Right Click DC > New vSphere Distributed Switch

#2 Prepare destination PortGroups for any existing networks
VC > Networking > vDS > Configuration > New Port Group...

#3 Connect Hosts
VC > Networking > vDS > Add Host…

#4 Select adapters
- Select the physical adapters
- For each VMkernel interface, choose the destination port group prepared.

#5 Migrate VM networking
- Check “Migrate virtual machine networking”
- Select the Destination port group for each vm-network

#6 Click Finish

 Configure vSS and vDS Settings Using Command Line Tools – OCG Page 80

Not a lot regarding this. Here are the available (mostly read-only) CLI commands for the DVS:

~ # esxcli esxcli command list | grep network.vswitch.dvs
network.vswitch.dvs.vmware.lacp.config get
network.vswitch.dvs.vmware.lacp.stats get
network.vswitch.dvs.vmware.lacp.status get
network.vswitch.dvs.vmware.lacp.timeout set
network.vswitch.dvs.vmware list
network.vswitch.dvs.vmware.vxlan.config.stats get
network.vswitch.dvs.vmware.vxlan.config.stats set
network.vswitch.dvs.vmware.vxlan get
network.vswitch.dvs.vmware.vxlan list
network.vswitch.dvs.vmware.vxlan.network.arp list
network.vswitch.dvs.vmware.vxlan.network.arp reset
network.vswitch.dvs.vmware.vxlan.network list
network.vswitch.dvs.vmware.vxlan.network.mac list
network.vswitch.dvs.vmware.vxlan.network.mac reset
network.vswitch.dvs.vmware.vxlan.network.mtep list
network.vswitch.dvs.vmware.vxlan.network.port list
network.vswitch.dvs.vmware.vxlan.network.port.stats list
network.vswitch.dvs.vmware.vxlan.network.port.stats reset
network.vswitch.dvs.vmware.vxlan.network.stats list
network.vswitch.dvs.vmware.vxlan.network.stats reset
network.vswitch.dvs.vmware.vxlan.stats list
network.vswitch.dvs.vmware.vxlan.stats reset
network.vswitch.dvs.vmware.vxlan.vmknic list
network.vswitch.dvs.vmware.vxlan.vmknic.multicastgroup list
network.vswitch.dvs.vmware.vxlan.vmknic.stats list
network.vswitch.dvs.vmware.vxlan.vmknic.stats reset

 Analyze Command Line Output to Identify vSS and vDS Configuration Details

# Config detail from esxcli

~ # esxcli network vswitch dvs vmware list
Name: grosas-lab-dvs0
VDS ID: 01 2f 16 50 eb 4a 7d 3d-d6 5a 7d 55 05 27 76 5b
Class: etherswitch
Num Ports: 1536
Used Ports: 1
Configured Ports: 512
MTU: 1500
CDP Status: listen
Beacon Timeout: -1
VMware Branded: true
DVPortgroup ID: dvportgroup-77
In Use: false
Port ID: 0

# Config detail from net-dvs

~ # net-dvs -l
switch 01 2f 16 50 eb 4a 7d 3d-d6 5a 7d 55 05 27 76 5b (etherswitch)
 max ports: 1536
 global properties:
 com.vmware.common.version = 0x 3. 0. 0. 0
 propType = CONFIG
 idle timeout = 15 seconds
 active timeout = 60 seconds
 sampling rate = 0
 collector =
 internal flows only = false
 propType = CONFIG
 propType = CONFIG
 propType = CONFIG
 com.vmware.common.alias = grosas-lab-dvs0 , propType = CONFIG
 propType = CONFIG
 com.vmware.etherswitch.mtu = 1500 , propType = CONFIG
 com.vmware.etherswitch.cdp = CDP, listen
 propType = CONFIG
 host properties:
 com.vmware.common.host.portset = DvsPortset-0 , propType = CONFIG
 com.vmware.common.host.volatile.status = green , propType = RUNTIME
 com.vmware.common.portset.opaque = false , propType = RUNTIME
 propType = CONFIG
 port 0:
 com.vmware.common.port.alias = dvUplink1 , propType = CONFIG
 com.vmware.common.port.connectid = 0 , propType = CONFIG
 com.vmware.common.port.volatile.status = free
 com.vmware.common.port.volatile.vlan = VLAN 0
 com.vmware.common.port.portgroupid = dvportgroup-77 , propType = CONFIG
 com.vmware.common.port.block = false , propType = CONFIG
 com.vmware.common.port.dvfilter = filters (num = 0):
 propType = CONFIG
 com.vmware.common.port.ptAllowed = 0x 0. 0. 0. 0
 propType = CONFIG
 load balancing = source virtual port id
 link selection = link state up;
 link behavior = notify switch; best effort on failure; shotgun on failure;
 active =
 standby =
 propType = CONFIG
 com.vmware.etherswitch.port.security = deny promiscuous; deny mac change; allow forged frames
 propType = CONFIG
 com.vmware.etherswitch.port.vlan = Guest VLAN tagging
 ranges = 0-4094
 propType = CONFIG
 com.vmware.etherswitch.port.txUplink = normal , propType = CONFIG
 pktsInUnicast = 0
 bytesInUnicast = 0
 pktsInMulticast = 0
 bytesInMulticast = 0
 pktsInBroadcast = 0
 bytesInBroadcast = 0
 pktsOutUnicast = 0
 bytesOutUnicast = 0
 pktsOutMulticast = 0
 bytesOutMulticast = 0
 pktsOutBroadcast = 0
 bytesOutBroadcast = 0
 pktsInDropped = 0
 pktsOutDropped = 0
 pktsInException = 0
 pktsOutException = 0
 propType = RUNTIME
 propType = CONFIG

 Configure Netflow – OCG Page 68


WC > DVS > Right click > All vCenter Actions - Edit Netflow > Provide collector IP/Port > Give DVS Switch IP Address

– Optional: Active flow export timeout
– Optional: Idle flow export timeout
– Sampling Rate

The sampling rate represents the number of packets that NetFlow drops after every collected packet. A sampling rate of x instructs NetFlow to drop packets in a collected packets:dropped packets ratio of 1:x. If the rate is 0, NetFlow samples every packet, that is, it collects one packet and drops none. If the rate is 1, NetFlow samples a packet and drops the next one, and so on.
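
The 1:x ratio is easier to see with packet numbers. This Python sketch lists which packets would be collected for a given rate. It models only the counting rule described above, with a made-up function name, and says nothing about how the vDS implements it.

```python
def sampled_indices(total_packets: int, rate: int):
    """Indices of packets collected when 'rate' packets are dropped after each collected one."""
    if rate < 0:
        raise ValueError("sampling rate cannot be negative")
    return list(range(0, total_packets, rate + 1))

print(sampled_indices(6, 0))  # every packet is collected
print(sampled_indices(6, 1))  # collect one, drop one
print(sampled_indices(6, 2))  # collect one, drop two
```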

Determine Appropriate Discovery Protocol – OCG Page 68


Use CDP for Cisco Switches / LLDP for everything else…

WC > DVS > Manage > Settings > Properties > Edit > Advanced > Type: CDP/LLDP | Operation: Listen/Advertise/Both

 Determine Use Cases for, and Configure PVLANs – OCG Page 69


WC > DVS > Manage > Settings > Private VLAN > Edit

– Define the Primary VLAN ID (VLAN Type Promiscuous)
– Define the Secondary VLANs (VLAN Type Community or Isolated)

Use Case: Private VLANs are used to solve VLAN ID limitations and waste of IP addresses for certain network setups.
A private VLAN is identified by its primary VLAN ID. A primary VLAN ID can have multiple secondary VLAN IDs associated with it. Primary VLANs are Promiscuous, so that ports on a private VLAN can communicate with ports configured as the primary VLAN. Ports on a secondary VLAN can be either Isolated, communicating only with promiscuous ports, or Community, communicating with both promiscuous ports and other ports on the same secondary VLAN.
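
The communication rules above fit in a few lines of Python. This is a study aid that encodes only the three port types as described; the tuple layout and function name are my own.

```python
def can_communicate(a, b):
    """a and b are (port_type, secondary_vlan_id) tuples within one primary VLAN.

    port_type is 'promiscuous', 'community' or 'isolated'.
    """
    type_a, vlan_a = a
    type_b, vlan_b = b
    if "promiscuous" in (type_a, type_b):
        return True   # promiscuous ports talk to every port in the private VLAN
    if type_a == type_b == "community":
        return vlan_a == vlan_b  # community ports talk within their own secondary VLAN
    return False      # isolated ports only reach promiscuous ports

print(can_communicate(("isolated", 102), ("promiscuous", 100)))
print(can_communicate(("isolated", 102), ("isolated", 102)))
```

Note the second call: two isolated ports cannot talk to each other even when they share the same secondary VLAN.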

 Use Command Line Tools to Troubleshoot and Identify VLAN Configurations – OCG Page 73

# Check Vlan IDs for portgroups

~ # esxcli network vswitch standard portgroup list
 Name Virtual Switch Active Clients VLAN ID
 ------------------ -------------- -------------- -------
 Management Network vSwitch0 1 0
 My VMK Interface vSwitch3 1 1234
 Prod-201 vSwitch3 1 300

# Change a Vlan ID on portgroup Prod-201

~ # esxcli network vswitch standard portgroup set -p Prod-201 -v 201
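
When checking many hosts, it helps to turn the list output into something you can compare against the expected VLAN IDs. Here is a Python sketch, assuming the four-column layout shown above. Because port group names can contain spaces, it splits the last three columns off from the right.

```python
def parse_portgroups(output: str):
    """Map port group name -> (vswitch, active_clients, vlan_id) from the list output."""
    result = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header row and the dashed separator row.
        if not line or line.endswith("VLAN ID") or set(line) <= {"-", " "}:
            continue
        # Names may contain spaces, so split the last three columns off from the right.
        name, vswitch, clients, vlan = line.rsplit(None, 3)
        result[name] = (vswitch, int(clients), int(vlan))
    return result

sample = """ Name Virtual Switch Active Clients VLAN ID
 ------------------ -------------- -------------- -------
 Management Network vSwitch0 1 0
 My VMK Interface vSwitch3 1 1234
 Prod-201 vSwitch3 1 300"""
for name, (vswitch, clients, vlan) in parse_portgroups(sample).items():
    print(name, vswitch, vlan)
```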



Objective 2.3 Troubleshoot Virtual Switch Solutions

 Understand the NIC Teaming failover types and related physical network settings – OCG Page 74

Edit Teaming and Failover Policy for a vSphere Standard Switch in the vSphere Web Client
Edit the Teaming and Failover Policy on a Standard Port Group in the vSphere Web Client
Edit the Teaming and Failover Policy on a Distributed Port Group in the vSphere Web Client
Edit Distributed Port Teaming and Failover Policies with the vSphere Web Client

Route based on Originating Virtual Port ID
– This is the default policy.
– The vSwitch assigns the VM’s virtual network adapter to a port number and uses the port number to determine which path will be used to route all network I/O sent from that adapter.
– This implementation does not require any changes on the connected physical switches.
– The vSwitch performs a modulo function, where the Port number is divided by the number of NICs in the team, and the remainder indicates the path to place the outbound I/O.
– If the path fails, the outbound I/O is automatically re-routed to a surviving path.
– This policy does not permit outbound data from a single virtual adapter to be distributed across all active paths on the vSwitch.

The Route based on Originating Virtual Port ID algorithm does not consider load into its calculation for traffic placement.

Route based on Source MAC Hash
– This policy uses the MAC address of the virtual adapter to select the path, rather than the port number.
– The vSwitch performs a modulo function, where the MAC address is divided by the number of NICs in the team, and the remainder indicates the path to place the outbound I/O.

The Route based on Source MAC Hash algorithm does not consider load into its calculation for traffic placement.
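
Both of the policies above come down to a modulo over the team size. The Python sketch below mirrors that description. Treating the whole MAC address as one integer is my simplification; the exact hash inside ESXi is not covered here, so use this to reason about placement and not to predict it precisely.

```python
def uplink_by_port_id(port_id: int, uplinks):
    """Pick the uplink whose index is the remainder of port_id / team size."""
    return uplinks[port_id % len(uplinks)]

def uplink_by_source_mac(mac: str, uplinks):
    """Same idea, treating the MAC address as one 48-bit integer."""
    return uplinks[int(mac.replace(":", ""), 16) % len(uplinks)]

team = ["vmnic0", "vmnic1"]
print(uplink_by_port_id(7, team))
print(uplink_by_source_mac("00:50:56:aa:bb:02", team))
```

Either way, one virtual adapter always lands on a single uplink, which is why neither policy spreads traffic from a single adapter across paths.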

Route based on IP Hash
– This is the only option that permits outbound data from a single virtual adapter to be distributed across all active paths.
– This option requires that the physical switch be configured for IEEE 802.3ad “Link Aggregation”.
– The vSwitch must be configured for IP Hash for inbound load balancing.
– The outbound data from each virtual adapter is distributed across the active paths using the calculated IP hash.
– If a virtual adapter is concurrently sending data to two or more clients, the I/O to one client can be placed on one path and the I/O to another client can be placed on a separate path.
– The outbound traffic from a virtual adapter to a specific external client is based on the most significant bits of the IP address of both the virtual adapter and the client. The combined value is used by the vSwitch to place the associated outbound traffic on a specific path.

The Route based on IP Hash algorithm does not consider load into its calculation for traffic placement. But the inbound traffic is truly load balanced by the physical switch.

Route based on Physical NIC Load (DVS Only)
– Factors the load of the physical NIC when determining traffic placement.
– Does not require special settings on the physical switch
– Initially, outbound traffic is placed on a specific path. Activity is monitored.
– When I/O through a specific vmnic adapter reaches a consistent 75% capacity, then one or more virtual adapters are automatically remapped to other paths.
– This is a good choice when Etherchannel on the physical switch is not feasible.
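
As a rough illustration of the 75% rule, this Python sketch flags the uplinks that would be candidates for remapping. The utilization figures are made-up inputs, and the real policy evaluates load over time rather than from a single reading.

```python
def saturated_uplinks(utilization, threshold=0.75):
    """Return uplinks whose sustained utilization (0.0-1.0) is at or above the threshold."""
    return sorted(nic for nic, load in utilization.items() if load >= threshold)

print(saturated_uplinks({"vmnic0": 0.82, "vmnic1": 0.30}))
```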

 Determine and Apply Failover Settings – OCG Page 77


WC > Manage > Networking > Virtual Switches > Edit Settings > Teaming and Failover
WC > DVS > Manage > Ports > Edit Distributed Port Settings

Network Failover Detection

# Link status Only
Relies only on the link status that the network adapter provides.
– Detects removed cables & physical switch port failures.
– Does not detect a physical switch port that is blocked by spanning tree or is misconfigured.
– Does not detect a pulled cable that connects a physical switch to another device.

# Beacon Probing
Sends out and listens for beacon probes on all NICs in the team and uses this information, in addition to link status, to determine link failure. ESX/ESXi sends beacon packets every second.
– Useful with teams of three or more NICs; allows n-2 failures.
– NICs must be in active/active or active/standby, NICs in unused state do not participate in beacon probing.

Notify Switches Yes/No – If Yes, a notification is sent over the network to update the lookup tables on the physical switches.
Set to No for features like Microsoft NLB in unicast mode.

 Configure Explicit Failover to Conform with VMware Best Practices – OCG Page 77

Override switch failover order to manually specify which NICs are Active / Standby / Unused.


Configure Port Groups to Properly Isolate Network Traffic – OCG Page 79

– VMware recommends that each type of network traffic is separated by VLANs.
– Separate VLANs for Management, vMotion, VMs, iSCSI, NAS, VMware HA Heartbeat, Fault Tolerance logging.
– Trunk the VLANs on the physical switch.


Given a Set of Network Requirements, Identify the Appropriate Distributed Switch Technology to Use – OCG Page 81

# VDS features



Switch/Network Discovery [CDP / LLDP]

Network Rollback and Recovery

Port Mirroring
   Switched Port Analyzer [SPAN]
   Remote Switched Port Analyzer [RSPAN]
   Enhanced Remote Switched Port Analyzer [ERSPAN]
Port Security

TCP Segmentation Offload / Jumbo Frames

Single-Root I/O Virtualization (SR-IOV)

Traffic Filtering [ACL]


Configure and Administer vSphere Network I/O Control – OCG Page 83

Conveniently I have blogged about this one, and deployed it in production… and I’m running out of steam.




Use Command Line Tools to Troubleshoot and Identify Configuration Items From an Existing vDS

Already covered under Analyze Command Line Output to Identify vSS and vDS Configuration Details

VDCA550 Objective 1.1 – 1.3 (Implement and Manage Storage) in One Dense Post

Because sharing is caring.  Here are my notes after tons of reading and lab time.  I’m leaning heavily on the VCAP5-DCA Official Cert Guide (OCG) and the vSphere 5.5 Documentation Center, supplementing with blogs and YouTube anywhere my main sources fall short.

Extra special thanks to Chris Wahl for his Study Sheets.  They are helping me tons with managing my time.  I’m using the VDCA550 version.

Objective 1.1 Implement Complex Storage Solutions


VMware DirectPath I/O – OCG Page 101


“VM access to PCI devices”

# Configuring in GUI – Video Demo

# Pre-reqs
– Intel VT-d or AMD IOMMU enabled in BIOS
– Devices connected and marked as available for passthrough
– VM Hardware version 7 or later

# Enabling in the GUI
VC > Host > Configuration > Hardware > Advanced Settings > Configure Passthrough (add a PCI device)
VC > VM > Edit Settings > Add > Add the PCI device.


N-Port Virtualization (NPIV) – OCG Page 99


“WWN at VM level”

# Pre-reqs
– Only on VMs with RDM disks (VMs with regular virtual disks use the WWNs of the host’s HBAs).
– HBA on host must support NPIV
– Fabric switches must be NPIV-aware

# Capabilities & Limitations
– vMotion supported; vmkernel reverts to the physical HBA if the destination host does not support NPIV.
– Concurrent I/O supported.
– Requires FC switch
– Clones do not retain WWN
– Does not support Storage vMotion
– Disabling and re-enabling the NPIV capability on the FC switch while the VM is running can cause the FC link to fail and I/O to stop.

# Configuring in the GUI
VC > VM > Edit Settings > Options Tab > Advanced – Fibre Channel NPIV
WC > VM > Edit Settings > VM Options > Expand FC NPIV triangle > Deselect “Temporarily Disable NPIV for this VM” > Generate new WWN


Raw Device Mappings (RDM) – OCG Page 98


“An RDM allows a VM to directly utilize a LUN”

# Considerations & Limitations
– RDM is not available for directly attached block devices.
– Snapshots are not supported in physical compatibility mode.

# Configuring in GUI
VC > VM > Edit Settings > Hardware – Add > Hard Disk > Type: Raw Device Mappings > Select LUN > Select datastore


Configure vCenter Server Storage Filters (Storage Profiles) – OCG Page 102


“vCenter Server provides storage filters to help you avoid storage device corruption or performance degradation that can be caused by an unsupported use of storage devices.”

# Configuring in the GUI
VC > Administration > vCenter Server Settings > Advanced Settings
WC > VC Server > Manage > Settings > Advanced Settings > Edit

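(Filters are TRUE by default and are not listed until you add them.)

Add the filter key (for example config.vpxd.filter.vmfsFilter, config.vpxd.filter.rdmFilter, config.vpxd.filter.SameHostAndTransportsFilter, or config.vpxd.filter.hostRescanFilter) – In the Value box, type False > Add > OK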


VMFS re-signaturing – OCG Page 104


“When resignaturing a VMFS copy, ESXi assigns a new UUID and a new label to the copy, and mounts the copy as a datastore distinct from the original.”

# Checking UUID
esxcli storage vmfs extent list
vmkfstools -P -h /vmfs/volumes/[datastoreName]

# Checking UUID in the GUI
VC > Datastores and DS Clusters > Configuration > Datastore Details > Location

# Resignaturing with GUI
VC > Host > Configuration > Storage > Add Storage… > Select Disk/LUN > Select Datastore > Mount options > Assign New Signature
VC > Host > Configuration > Storage > Add Storage… > Select Disk/LUN > Select Datastore > Mount options > Keep Existing Signature.

# Resignaturing with esxcli
esxcli storage vmfs snapshot list
esxcli storage vmfs snapshot mount -l 'datastore-volume-label'
esxcli storage vmfs snapshot resignature -l 'datastore-volume-label'


Understand and apply LUN masking using PSA-related commands – OCG Page 127, Page 191

# Applying LUN Masking


# Changing the Path Selection Plugin for a Storage Array Type Plugin
/vmfs/volumes # esxcli storage nmp satp set -s VMW_SATP_CX -P VMW_PSP_RR
Default PSP for VMW_SATP_CX is now VMW_PSP_RR

# List VMFS extents and their backing devices
esxcli storage vmfs extent list

# List paths
esxcli storage nmp path list

# List all claim rules
esxcli storage core claimrule list

# Claimrule based on Fibre Channel transport
esxcli storage core claimrule add -u -P MASK_PATH -t transport -R fc

# Claimrule #333 masking on adapter, channel, target, lun
esxcli storage core claimrule add -r 333 -P MASK_PATH -t location -A vmhba32 -C 0 -T 0 -L 0

# Load the claim rules into runtime
esxcli storage core claimrule load
esxcli storage core claimrule run

# Reclaim a lun
esxcli storage core claiming reclaim -d [devicename]

# Remove a rule
esxcli storage core claimrule remove -r 333
esxcli storage core claimrule load
esxcli storage core claiming unclaim -t location -A vmhba2 -C 0 -T 0 -L 2
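
To keep the masking steps straight, here is a small sketch that only builds the command strings for one path; it does not run anything. The ordering (add, load, unclaim, run) is my assumption of the usual masking sequence, and the helper name is made up.

```python
def mask_path_commands(rule_id, adapter, channel, target, lun):
    """Build the esxcli command lines that mask one path by location.

    Nothing is executed; the strings reuse the flags shown in the notes.
    """
    location = f"-t location -A {adapter} -C {channel} -T {target} -L {lun}"
    return [
        f"esxcli storage core claimrule add -r {rule_id} -P MASK_PATH {location}",
        "esxcli storage core claimrule load",
        f"esxcli storage core claiming unclaim {location}",
        "esxcli storage core claimrule run",
    ]


if __name__ == "__main__":
    for line in mask_path_commands(333, "vmhba32", 0, 0, 0):
        print(line)
```

Printing the sequence before pasting it into a host shell makes it easier to spot a mismatched adapter or LUN number between the add and unclaim steps.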

# LUN Masking in the GUI
No GUI method exactly matches the commands above.
– Native Multipathing (NMP) paths can be enabled/disabled
– Path Selection Policy (PSP) can be configured (Fixed, Most Recently Used, Round Robin)

VC > Hosts and Clusters > Host > Configuration > Storage > View Devices > Manage Paths
Web Client > Storage > Datastore > Manage > Settings > Connectivity and Multipathing


Configure iSCSI Port Binding – OCG Page 123

# Configuring iSCSI Port Binding Video Demo

# Adding the Software iSCSI adapter
host > configuration > storage adapters > Add > Select “Add Software iSCSI adapter” > OK

# Adding the iSCSI vmkernel interface
VC > host > configuration > networking > vSphere Standard Switch > Add Networking… VMkernel > Select vSwitch

# Configure the Storage Adapter
VC > Hosts and Clusters > host > configuration > storage adapters > select the iSCSI Software Adapter > Properties


vSphere Flash Read Cache (Not covered in printed OCG – Covered in supplemental Appendix C)


“Performance enhancement of read-intensive applications by providing a write-through cache for virtual disks. It uses the Virtual Flash Resource, which can be built on Flash-based, solid-state drives (SSDs) that are installed locally in the ESXi hosts.”

# Configuring vFRC On the host
Web Client > Hosts & Clusters > Host > Manage > Settings > Virtual Flash > Virtual Flash Resource Management > Add Capacity

# Configure a VM with vFRC
Web Client > VM > Edit Settings > Select/Expand Hard Disk > Enter the amount of Virtual Flash Read Cache > OK

# Configure Host Cache in GUI
WC > Host > Manage > Storage > Host Cache Configuration > Select DS > Allocate space for host cache.


Configure Datastore Cluster – OCG Page 120

# Configure DS Cluster in the GUI
VC > Storage > New DS Cluster > Storage DRS Automation level > Select Runtime settings/IO inclusion > Select Clusters/Hosts > Select Datastores
WC > Right click DC Object > New DS Cluster


Upgrade VMware Storage Infrastructure – OCG Page 115

# Upgrade datastores in the GUI
VC > Datastore > Configuration > Upgrade (Option does not appear if running latest)
WC > Datastore > Manage > Settings > General > Properties (Option does not appear if running latest)



Objective 1.2 Manage Complex Storage Solutions


Analyze I/O workloads to determine storage performance requirements – OCG Page 168, 188, 196


# List VM World GID Info
vscsiStats -l

# Collect stats on GID 42155
vscsiStats -s -w 42155 (s to start collection, w to specify the GID)

# Display stats
vscsiStats -p {type} (Type options all, ioLength, seekDistance, outstandingIOs, latency, interarrival)

# Stop all collection
vscsiStats -x

# View host level statistics, examine disk adapter stats
esxtop > d

# View LUN level statistics
esxtop > u

# View VM level disk stats
esxtop > v

* CMDS/s – This is the total amount of commands per second, which includes IOPS and other SCSI commands (e.g. reservations and locks). Generally speaking CMDS/s = IOPS unless there are a lot of other SCSI operations/metadata operations such as reservations.
* DAVG/cmd – This is the average response time in milliseconds per command being sent to the storage device.
* KAVG/cmd – This is the amount of time the command spends in the VMKernel.
* GAVG/cmd – This is the response time as experienced by the Guest OS. This is calculated by adding together the DAVG and the KAVG values.

As a general rule DAVG/cmd, KAVG/cmd and GAVG/cmd should not exceed 10 milliseconds (ms) for sustained lengths of time.
There are also the following throughput metrics to be aware of:

* CMDS/s – As discussed above
* READS/s – Number of read commands issued per second
* WRITES/s – Number of write commands issued per second
* MBREAD/s – Megabytes read per second
* MBWRTN/s – Megabytes written per second

The sum of reads and writes equals IOPS, which is the most common benchmark when monitoring and troubleshooting storage performance. These metrics can be monitored at the HBA or Virtual Machine level.
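
The arithmetic above is simple enough to sketch. This assumes you have already pulled the per-device numbers out of esxtop by hand; the function name and the sample values are made up, and the 10 ms limit is just the rule of thumb quoted above.

```python
LATENCY_LIMIT_MS = 10.0  # rule of thumb from the notes, not a hard limit


def summarize(davg_ms, kavg_ms, reads_per_s, writes_per_s):
    """Derive GAVG and IOPS and list the latency counters over the limit."""
    gavg_ms = davg_ms + kavg_ms          # GAVG = DAVG + KAVG
    iops = reads_per_s + writes_per_s    # IOPS = READS/s + WRITES/s
    counters = (("DAVG", davg_ms), ("KAVG", kavg_ms), ("GAVG", gavg_ms))
    over = [name for name, value in counters if value > LATENCY_LIMIT_MS]
    return {"gavg_ms": gavg_ms, "iops": iops, "over_limit": over}


if __name__ == "__main__":
    print(summarize(davg_ms=8.0, kavg_ms=3.0, reads_per_s=400, writes_per_s=150))
```

In the sample, neither DAVG nor KAVG crosses the limit on its own, but the guest still sees more than 10 ms because the two add up.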


Identify and tag SSD and local devices – OCG Page 133


# Identify SSD in the GUI
VC > Host > Configuration > Hardware – Storage > Datastores > Drive Type
WC > Host > Manage > Storage > Storage Devices > Drive Type

# Identify the device to be tagged and its SATP, command & example output

esxcli storage nmp device list (note the SATP)
 Device Display Name: DGC Fibre Channel Disk (naa.6006016015301d00167ce6e2ddb3de11)
 Storage Array Type: VMW_SATP_CX
 Storage Array Type Device Config: {navireg ipfilter}
 Path Selection Policy: VMW_PSP_MRU
 Path Selection Policy Device Config: Current Path=vmhba4:C0:T0:L25
 Working Paths: vmhba4:C0:T0:L25

# Add a PSA claim rule

## By Device Name
esxcli storage nmp satp rule add -s VMW_SATP_CX -d device_name -o enable_ssd

## Add By Vendor / Model
esxcli storage nmp satp rule add -s VMW_SATP_CX -V vendor_name -M model_name -o enable_ssd

# Reclaim the device
esxcli storage core claiming reclaim -d [devicename]

# Check if device is tagged SSD
esxcli storage core device list -d device_name
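
If you are checking more than a couple of devices, the last step can be scripted. The sketch below looks for the "Is SSD" field in text captured from the command above; the sample text is trimmed and illustrative, not output from a real host, and the function name is made up.

```python
def is_ssd(device_list_output: str) -> bool:
    """Return True if the captured device listing reports 'Is SSD: true'."""
    for line in device_list_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Is SSD":
            return value.strip().lower() == "true"
    raise ValueError("no 'Is SSD' field found in output")


# Illustrative, heavily trimmed sample (assumption, not a real capture).
SAMPLE = """\
naa.6006016015301d00167ce6e2ddb3de11
   Display Name: DGC Fibre Channel Disk (naa.6006016015301d00167ce6e2ddb3de11)
   Is SSD: true
"""

if __name__ == "__main__":
    print(is_ssd(SAMPLE))
```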


Administer hardware acceleration for VAAI – OCG Page 106


VAAI = vSphere Storage APIs Array Integration

“Hardware-acceleration / hardware offload APIs. Storage primitives that allow the host to offload storage operations”

# Full copy (DataMover.HardwareAcceleratedMove) – Array performs copies without having to communicate with the host. Speeds up cloning/svMotion.

# Block zeroing (DataMover.HardwareAcceleratedInit) – Array performs zeroing. Speeds up the block zeroing process when a new virtual disk is created.

# Hardware-assisted locking (VMFS3.HardwareAcceleratedLocking) – Enhanced locking. ATS replaces SCSI-2 reservations. More VMs per datastore. More hosts per LUN.

# Configuring in GUI
VC > Host > Configuration > Software – Advanced Settings ; set the advanced settings named above to 1 to enable, 0 will disable

# Checking for VAAI Support
VC > Host > Configuration > Storage > Hardware > Datastores View


Configure and administer profile-based-storage – OCG Page 109

“VM storage policies can be used during VM provisioning to ensure that the virtual disks are placed on proper storage. VM storage policies can be used to facilitate the management of the VM, such as during migrations, to ensure that the VM remains on compliant storage.”

# Configuration in GUI Video Demo

# 1) Enable the feature on the host/cluster
VC > Home > Management – VM Storage Profiles > Enable VM Storage Profiles > Select the Host/Cluster > Click Enable Storage Profiles > Close

# 2) Define User-defined Capabilities
VC > Home > Management – VM Storage Profiles > Manage Storage Capabilities > Add > Name the capability > OK

# 3) Create VM Storage Profile
VC > Home > Management – VM Storage Profiles > Create > Create new VM storage profile > Name the storage profile > Select a defined capability defined in #2 > Click Next > Click Finish

# 4) Assign User-defined Capabilities
VC > Right-click datastore > Assign User-Defined Storage Capability > Select a Storage Capability from the drop-down > click OK

# Test by creating new vm
VC > VMs & Templates > New VM > In storage section, use the drop-down, the view will filter datastore options into compatible/non-compatible options.


Prepare Storage for Maintenance – OCG Page 114


“Datastore maintenance mode”

# Configuring SDRS MM
VC > Datastore > Right click datastore > Enter SDRS Maintenance mode
WC > Datastore > All vCenter Actions > Enter Storage DRS Maintenance mode


Apply Space Utilization Data to Manage Storage Resources


Provision and Manage Storage Resources According to VM Requirements


# Disk Formats:

■ Lazy-zeroed Thick (default) – Space required for the virtual disk is allocated during creation. Any data remaining on the physical device is not erased during creation, but is zeroed out on demand at a later time on first write from the virtual machine. The virtual machine does not read stale data from disk.
– Fast
– File block zeroed on write
– Fully pre-allocated on datastore

■ Eager-zeroed Thick – Space required for the virtual disk is allocated at creation time. In contrast to zeroedthick format, the data remaining on the physical device is zeroed out during creation. It might take much longer to create disks in this format than to create other types of disks.
– Slow – but faster with VAAI
– File block zeroed when disk is created.
– Fully preallocated on datastore.

■ Thin – Thin-provisioned virtual disk. Unlike with the thick format, space required for the virtual disk is not allocated during creation, but is supplied, zeroed out, on demand at a later time.
– Very Fast
– File block is zeroed on write.
– File block is allocated on write.

■ rdm:device – Virtual compatibility mode raw disk mapping.

■ rdmp:device – Physical compatibility mode (pass-through) raw disk mapping.

■ 2gbsparse – A sparse disk with 2GB maximum extent size. You can use disks in this format with hosted VMware products, such as VMware Fusion, Player, Server, or Workstation. However, you cannot power on sparse disk on an ESXi host unless you first re-import the disk with vmkfstools in a compatible format, such as thick or thin.
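
The three VMFS formats boil down to two questions: is the space allocated up front, and are the blocks zeroed up front? A minimal sketch, using the vmkfstools-style format names and ignoring metadata overhead (both simplifications are mine):

```python
from typing import NamedTuple


class DiskFormat(NamedTuple):
    preallocated: bool        # space reserved on the datastore at creation
    zeroed_at_creation: bool  # blocks zeroed at creation rather than on first write


FORMATS = {
    "zeroedthick": DiskFormat(preallocated=True, zeroed_at_creation=False),
    "eagerzeroedthick": DiskFormat(preallocated=True, zeroed_at_creation=True),
    "thin": DiskFormat(preallocated=False, zeroed_at_creation=False),
}


def space_consumed_at_creation_gb(fmt: str, size_gb: int) -> int:
    """Approximate datastore space taken when the disk is created."""
    return size_gb if FORMATS[fmt].preallocated else 0


if __name__ == "__main__":
    for name in FORMATS:
        print(name, space_consumed_at_creation_gb(name, 40))
```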


Understand Interactions Between Virtual Storage Provisioning and Physical Storage Provisioning

Reference Virtual Disk Format Types – OCG Page 95
Troubleshoot Storage Performance and Connectivity – OCG Page 188


Configure Datastore Alarms – OCG Page 117 (Datastore Alarms) Page 235 (SDRS Alarms)

Create and Analyze Datastore Alarms and Errors to Determine Space Availability – OCG page 169, 188 – 201

# Configuring the alarm in the GUI
VC > Define the scope > Alarms Tab > Definitions > Right click Whitespace > New Alarm > Select type Datastore

# Define a trigger

## Datastore alarms support the following triggers:
– Datastore Disk Provisioned (%) >>> Is above / Is below >>> 50, 100, 150, 200, etc (increments of 50)
– Datastore Disk Usage (%) >>> Is above / Is below >>> Defined percentage
– Datastore State to All Hosts >>> Is equal to / Not equal to >>> None / Connected / Disconnected
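
The first two triggers are easy to confuse, so here is a small sketch of how I read them: usage is space actually consumed, while provisioned can go past 100% once thin disks are overcommitted. The formulas and names are my assumptions for illustration, not vCenter's internal calculation.

```python
def datastore_percentages(capacity_gb, used_gb, provisioned_gb):
    """Return usage and provisioned space as percentages of capacity."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return {
        "usage_pct": 100 * used_gb / capacity_gb,
        "provisioned_pct": 100 * provisioned_gb / capacity_gb,
    }


def is_triggered(value_pct, threshold_pct, comparison="above"):
    """Evaluate an 'Is above' / 'Is below' style trigger."""
    if comparison == "above":
        return value_pct > threshold_pct
    if comparison == "below":
        return value_pct < threshold_pct
    raise ValueError("comparison must be 'above' or 'below'")


if __name__ == "__main__":
    pct = datastore_percentages(capacity_gb=1000, used_gb=600, provisioned_gb=1500)
    print(pct, is_triggered(pct["provisioned_pct"], 100))
```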


Objective 1.3 Troubleshoot Complex Storage Solutions


Perform Command-line Configuration of Multipathing Options – OCG Page 188

# Identify LUNs
esxcli storage core device list

# Identify paths
esxcli storage core path list -d device_name

# Get stats on a path
esxcli storage core path stats get -p path_name

# Disable a path in the CLI
esxcli storage core path set -p path_name --state=[active|off]


Change a Multipath Policy – OCG Page 132


# Changing the multipath policy in the GUI
VC > Host > Configuration > Hardware – Storage > Datastores View > Right click Ds, Properties > Manage Paths

# Default PSPs Explained
Most Recently Used (MRU) – VMW_PSP_MRU
– Selects path most recently used
– On failure, an alternate path will take over
– On recovery, the original path becomes an alternate

Round Robin (VMware) – VMW_PSP_RR
– Automatic path selection algorithm that rotates through all active paths when connecting to active-passive arrays, or through all available paths when connecting to active-active arrays.

Fixed (VMware) – VMW_PSP_FIXED
– Host uses the designated preferred path, if configured. Otherwise it uses the first working path. An explicitly designated preferred path remains preferred even when it is marked dead, so the host fails back to it once it recovers.
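
The difference between the three policies shows up on failure and recovery. Below is a deliberately simplified model of the behaviour described above; the function, the path names, and the one-I/O-per-call rotation are assumptions for illustration and not how the plugins are implemented.

```python
def next_path(policy, paths, current, preferred=None):
    """Pick the path to use next.

    paths maps a path name to True when that path is working
    (insertion order stands in for discovery order).
    """
    working = [name for name, up in paths.items() if up]
    if not working:
        return None
    if policy == "VMW_PSP_FIXED":
        # Fail back to the preferred path whenever it is working.
        if preferred in working:
            return preferred
        return current if current in working else working[0]
    if policy == "VMW_PSP_MRU":
        # Stay on the current path while it works; no failback on recovery.
        return current if current in working else working[0]
    if policy == "VMW_PSP_RR":
        # Rotate to the working path that follows the current one.
        if current not in working:
            return working[0]
        return working[(working.index(current) + 1) % len(working)]
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    recovered = {"path_a": True, "path_b": True}
    print(next_path("VMW_PSP_MRU", recovered, current="path_b"))
    print(next_path("VMW_PSP_FIXED", recovered, current="path_b", preferred="path_a"))
```

After path_a recovers, MRU stays on path_b while Fixed moves back to the preferred path_a.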


Troubleshoot Common Storage Issues – OCG Page 188


# Troubleshooting Storage Adapters

# Troubleshooting SSDs

# Troubleshooting Virtual SAN

# Failure to Mount NFS Datastores

# VMkernel Log Files Contain SCSI Sense Codes

System Messages – NSX Edge Services Gateway

System Events

CRMD – Cluster Resource Management Daemon

Appname:     crmd
Priority:    notice
Message:     run_graph: Transition 6431 (Complete=0, Pending=0, Fired=0, Skipped=0,
             Incomplete=0, Source=/usr/var/lib/pengine/pe-input-6430.bz2): Complete

Appname:     crmd
Priority:    info 
Message:     do_state_transition: Starting PEngine Recheck Timer

Appname:     crmd
Priority:    info
Message:     do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE
             [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]

Appname:     crmd
Priority:    info
Message:     notify_crmd: Transition 6431 status: done - 

Appname:     crmd
Priority:    info
Message:     te_graph_trigger: Transition 6431 is now complete

Appname:     crmd
Priority:    info
Message:     run_graph: ====================================================

Appname:     crmd
Priority:    info
Message:     do_te_invoke: Processing graph 6431 (ref=pe_calc-dc-1406976970-6473) 
             derived from /usr/var/lib/pengine/pe-input-6430.bz2

Appname:     crmd
Priority:    info
Message:     unpack_graph: Unpacked transition 6431: 0 actions in 0 synapses

Appname:     crmd
Priority:    info
Message:     do_state_transition: State transition S_POLICY_ENGINE ->
             S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_Message
             origin=handle_response ]

Appname:     crmd
Priority:    info
Message:     do_pe_invoke_callback: Invoking the PE: query=6517,
             ref=pe_calc-dc-1406976970-6473, seq=8, quorate=1

Appname:     crmd
Priority:    info
Message:     do_pe_invoke: Query 6517: Requesting the current CIB: S_POLICY_ENGINE

Appname:     crmd
Priority:    info
Message:     do_state_transition: All 2 cluster nodes are eligible to run resources.

Appname:     crmd
Priority:    info
Message:     do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER

Appname:     crmd
Priority:    info
Message:     do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ 
             input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]

Appname:     crmd
Priority:    info
Message:     crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped!
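
If you forward these messages to a syslog collector, the state transitions are the ones worth parsing. Here is a minimal sketch that pulls the fields out of a do_state_transition message as it appears in this guide; the regex and function name are mine, and it assumes the message text has already been separated from the syslog header.

```python
import re

TRANSITION_RE = re.compile(
    r"do_state_transition: State transition (?P<old>\S+) -> (?P<new>\S+)\s*"
    r"\[\s*input=(?P<input>\S+)\s+cause=(?P<cause>\S+)\s+origin=(?P<origin>\S+)\s*\]"
)


def parse_transition(message: str):
    """Return the transition fields as a dict, or None if not a transition."""
    # Messages in this guide are wrapped for width, so collapse whitespace first.
    flat = " ".join(message.split())
    match = TRANSITION_RE.search(flat)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ \n"
        "             input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]"
    )
    print(parse_transition(sample))
```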

System Messages – NSX Manager

Module vShield Edge Gateway

Code   Severity       Event Message
30024  Informational  Configuration changed for : [$component] on vShield Edge with id : edge-#

Module vShield Edge Health Check

Code   Severity       Event Message
30033  Major          VShield Edge VM not responding to health check.
30034  Informational  None of the VShield Edge VMs found in serving state. There is a possibility of network disruption.
30042  Informational  vShield Edge VM has recovered and now responding to health check.

Module vShield Edge Appliance

Code   Severity       Event Message
30152  Informational  vShield Edge system time sync up happens

Module vShield Edge HighAvailability

Code   Severity       Event Message
30202  High           vShield Edge HighAvailability switch over happened. VM has moved to ACTIVE state.
30203  High           vShield Edge HighAvailability switch over happened. VM has moved to STANDBY state.

Module vShield Edge IPSec

Code   Severity       Event Message
30401  Informational  IPsec Channel from localIp : <local-endpoint-ip> to peerIp : <peer-endpoint-ip> changed the status to up
30402  Informational  IPsec Channel from localIp : <local-subnet-ip> to peerIp : <peer-subnet-ip> changed the status to down
30403  Informational  IPsec Tunnel from localSubnet : <local-subnet-ip> to peerSubnet : <peer-subnet-ip> changed the status to up
30404  Informational  IPsec Tunnel from localSubnet : <local-subnet-ip> to peerSubnet : <peer-subnet-ip> changed the status to down
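
For proactive monitoring, a lookup keyed on the event code is usually enough to decide what deserves a page. The sketch below copies a few codes and severities from the tables above; the short descriptions are paraphrased, and the choice of which severities to alert on is my assumption.

```python
EDGE_EVENTS = {
    30033: ("Major", "Edge VM not responding to health check"),
    30034: ("Informational", "No Edge VM found in serving state"),
    30042: ("Informational", "Edge VM recovered and responding to health check"),
    30202: ("High", "HA switch over, VM moved to ACTIVE"),
    30203: ("High", "HA switch over, VM moved to STANDBY"),
}

# Assumption: only these severities should raise an alert.
ALERT_SEVERITIES = {"Major", "High"}


def needs_attention(code: int) -> bool:
    """Return True when the event code maps to an alert-worthy severity."""
    severity, _description = EDGE_EVENTS.get(code, ("Unknown", ""))
    return severity in ALERT_SEVERITIES


if __name__ == "__main__":
    for code in (30202, 30042):
        print(code, needs_attention(code))
```

Code 30034 is listed as Informational even though it warns of possible network disruption, so you may want to special-case it rather than rely on severity alone.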

Best (Public) VMware NSX Learning Resources

Let me qualify the title.. I say “best” with the full authority that my opinion carries.  Just trying to give y’all a place to go to get your NSX learn on…

Digital Literature …

VMware Product Walkthroughs – NSX 

The NSX walkthrough perfectly balances the brevity of a presentation slide deck with involved hands-on demonstrations.  Very well put together (check out some of the other walkthroughs).

VMware NSX Design Guide 

The design guide, a PDF of roughly 30 pages, is a gentle introduction to NSX topologies.  A fundamental read if you’re still trying to get a handle on NSX concepts.

VMware Network Virtualization Blog

Subject matter content from the experts.  Posts by Martin Casado, Bruce Davie, Brad Hedlund, and Roger Fortier.

VMware Hands on Labs (HOL) Focus: Networking

Get acquainted with NSX Dynamic Routing, the Distributed Firewall & Load Balancing.

VMware NSX 6 Documentation Center

Nothing fancy about this one… ’tis the manuals.  NSX Install and Upgrade Guide & NSX Administration Guide.  Although in the public domain, this resource is extremely difficult (if not impossible) to find via search.  But they are in the public domain.  Whatever is public is not private…right?  

Martin Casado’s Blog – Network Heresy

Scott Lowe’s Blog – Learning NVP/NSX 

Brad Hedlund’s Blog – NSX

If videos are the way you learn …

NSX Architecture Webinar by Ivan Pepelnjak on ipspace.net

VMworld 2013 – Introducing the World to VMware NSX (By Sachin Thakkar)

VMware Interview – Bruce Davie on NSX

VMware NSX Demo

This should at the very least provide a fair start for anyone looking to mentally ramp up for the NSX NVP.

– Gabe

vShield/vCNS 5.1x CLI Operations using Expect

Practical use of the vCNS (vShield) CLI is limited from a configuration perspective, but you may need to interact with it from time to time.  Troubleshooting, debugging sessions, and log purging come to mind.

The options for getting the job done:

1.  Interact with the vCNS Manager virtual machine console in vCenter (not great for debugging, or reading the long exception output)

2.  SSH (ssh server is enabled from the console: vsm> enable, vsm# ssh start)

Expect works well with the vtysh pseudo-terminal used for the vCNS Manager console.   I tried and failed (due to errors interacting with the terminal).   If you manage multiple vCNS environments, it makes sense to wrap the interactions into these expect scripts.  Here’s a small example expect script to change the CLI password from the default.

#!/usr/bin/expect -f
# Synopsis: SSH to the vCNS appliance console, authenticate, enter privileged mode,
# authenticate again, enter global config, and change the default password.
# ssh <vsm-ip> # enable [enter] # default [enter] # config t [enter]
# cli password <password> [enter] # end [enter] # wr mem
spawn ssh admin@
expect "password: "
send "default\r"
expect ">"
send "en\r"
expect "Password: "
send "default\r"
expect "#"
send "config t\r"
expect "#"
send "cli password mYn3wp@ssw0rd\r"
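expect "#"
# Leave config mode and save, as outlined in the synopsis above.
send "end\r"
expect "#"
send "wr mem\r"
expect "#"
send "exit\r"
expect eof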

If your operational policy is to update your password every few months, you will find yourself revisiting a script like this.  For passing commands to multiple vCNS Managers, you can extend the script to spawn connections based on a list (outside the scope of this post).



vSphere 5.5 esxcli namespace updates

This is your 10,000 ft view of the updates to the vsphere 5.5 esxcli.  I do plan to dive in and explore the new additions in more detail at a later date.  This round I just want to provide a taste of what is new.  

For a second time in a row, and much to my delight😀, we see an emphasis on network! (esxcli 5.0 to 5.1 provided a similar treat).  There are changes aligning with some of the new features introduced in vsphere 5.5.  There’s also a significant namespace addition for VSAN.  Duncan Epping provides a great introduction to VSAN here.  Exciting times indeed.  

Without further delay…

Counts of esxcli 5.5 updates

  new commands:  74
      by namespace
      device (2)  graphics (2)  network (27)  sched (1)  storage (11)  system (8)  vsan (23)

  commands removed: 2  (vxlan network mapping replaced with arp/mac/mtep output)
  commands modified: 3  (LACP command cleanup)
  new namespaces: 3 (device, graphics, vsan)


For those who aren’t satisfied with counts, here’s the full delta of 5.1 and 5.5 below.  Enjoy!



## esxcli.new
device.alias get 
device.alias list 
graphics.device list 
graphics.vm list 
network.ip.neighbor remove 
network.ip.netstack add 
network.ip.netstack get 
network.ip.netstack list 
network.ip.netstack remove 
network.ip.netstack set 
network.nic.coalesce get 
network.nic.coalesce set 
network.nic.cso get 
network.nic.cso set 
network.nic.eeprom change 
network.nic.eeprom dump 
network.nic.negotiate restart 
network.nic.register dump 
network.nic.selftest run 
network.nic.sg get 
network.nic.sg set 
network.nic.tso get 
network.nic.tso set 
network.sriovnic.vf stats 
network.vswitch.dvs.vmware.lacp.timeout set
network.vswitch.dvs.vmware.vxlan get 
network.vswitch.dvs.vmware.vxlan.network.arp list 
network.vswitch.dvs.vmware.vxlan.network.arp reset
network.vswitch.dvs.vmware.vxlan.network.mac list 
network.vswitch.dvs.vmware.vxlan.network.mac reset 
network.vswitch.dvs.vmware.vxlan.network.mtep list 
sched.reliablemem get 
storage.nfs.param get 
storage.nfs.param set 
storage.vflash.cache get 
storage.vflash.cache list 
storage.vflash.cache.stats get 
storage.vflash.cache.stats reset 
storage.vflash.device list 
storage.vflash.module get 
storage.vflash.module list 
storage.vflash.module.stats get 
storage.vmfs unmap 
system.coredump.file add 
system.coredump.file get 
system.coredump.file list 
system.coredump.file remove 
system.coredump.file set 
system.security.certificatestore add 
system.security.certificatestore list 
system.security.certificatestore remove 
vsan.cluster get 
vsan.cluster join 
vsan.cluster leave 
vsan.cluster restore 
vsan.datastore.name get 
vsan.datastore.name set 
vsan.maintenancemode cancel 
vsan.network clear 
vsan.network.ipv4 add 
vsan.network.ipv4 remove 
vsan.network.ipv4 set 
vsan.network list 
vsan.network remove 
vsan.network restore 
vsan.policy cleardefault
vsan.policy getdefault 
vsan.policy setdefault 
vsan.storage add 
vsan.storage.automode get 
vsan.storage.automode set 
vsan.storage list 
vsan.storage remove 
vsan.trace set

## esxcli.removed
network.vswitch.dvs.vmware.vxlan.network.mapping list 
network.vswitch.dvs.vmware.vxlan.network.mapping reset

## esxcli.modified (5.1 form first, then 5.5 form)
network.vswitch.dvs.vmware.lacp.get config (5.1)
network.vswitch.dvs.vmware.lacp.get stats (5.1)
network.vswitch.dvs.vmware.lacp.get status (5.1)

network.vswitch.dvs.vmware.lacp.config get (5.5)
network.vswitch.dvs.vmware.lacp.stats get (5.5)
network.vswitch.dvs.vmware.lacp.status get (5.5)
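
If you want to double-check the per-namespace counts against the list, a few lines of Python will do it. This sketch assumes the list is pasted in the same dotted form used above (one command per line); the helper name and the short sample are made up.

```python
from collections import Counter


def count_by_namespace(commands):
    """Count commands by top-level namespace (the text before the first dot)."""
    return Counter(
        line.split(".", 1)[0].split()[0]
        for line in commands
        if line.strip()
    )


if __name__ == "__main__":
    sample = [
        "device.alias get",
        "device.alias list",
        "network.nic.tso get",
        "vsan.trace set",
    ]
    for namespace, total in sorted(count_by_namespace(sample).items()):
        print(f"{namespace} ({total})")
```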