A VMware ESXi hypervisor with multiple NICs can be configured in a multitude of ways, depending on how many NICs are on board.
My lab hypervisors only have two, but even that presents a choice: split management, vMotion and iSCSI traffic across the two NICs, or team them and run all VMkernel ports, storage adapters and management traffic over a common active-active bonded link.
The lab environment has been running flawlessly for months with a physical split between the management and vMotion/iSCSI networks, so I thought I’d configure the “alternative” scenario and let that run to see how things go.
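Before touching anything it’s worth capturing the current layout from each host. The following is a minimal sketch using the standard esxcfg-* commands available from the ESXi console (Tech Support Mode); the output and names will obviously differ from host to host.

    # List vSwitches, their port groups and which vmnics they use as uplinks
    esxcfg-vswitch -l

    # List the physical NICs and their link state / speed
    esxcfg-nics -l

    # List the VMkernel ports (management, vMotion, iSCSI) and their IP addresses
    esxcfg-vmknic -l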
One thing to look out for when reconfiguring the networking on the ESXi hosts (apart from making sure the names of all VMkernel ports match exactly, as before) is that both physical NICs are active afterwards.
Note that one NIC is in standby.
One of my hosts did this automatically; the other didn’t. That left me in an unforeseen situation where one host would have been running everything over a single NIC, so I wouldn’t have been getting the full bandwidth benefit of both. This is definitely not recommended, although in testing vMotion was still rapid, most likely because very little else was going on.
This would not be the case in a production environment, and I’d certainly recommend migrating all your guests off any host that is being reconfigured and putting it into maintenance mode. I didn’t do either of these things, but then the whole point is to push my lab to breaking point and document the experience – which is exactly what happened. More on that later.
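If the vSphere Client is unavailable mid-change, maintenance mode can also be toggled from the console with vim-cmd. A rough sketch follows; note the host won’t actually enter maintenance mode until its running guests have been migrated or powered off.

    # Put the host into maintenance mode
    vim-cmd hostsvc/maintenance_mode_enter

    # ...do the network reconfiguration, then take it out again
    vim-cmd hostsvc/maintenance_mode_exit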
Click Move up to make the second vmnic active.
With both NICs active, you should see the following…
…both NICs become active.
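If the second vmnic never appears under the vSwitch at all, it can also be linked as an uplink from the console. A sketch, assuming the bonded switch is vSwitch0 and the missing uplink is vmnic1 (names from my lab); the active/standby failover order itself is a teaming-policy setting, which on 4.0 I’d still change in the vSphere Client as above.

    # Link vmnic1 as an additional uplink on vSwitch0
    esxcfg-vswitch -L vmnic1 vSwitch0

    # Confirm both vmnics are now listed as uplinks
    esxcfg-vswitch -l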
Reconfiguring the networking like this may also require you to connect to the local console of each ESXi host and manually restart the management network. This is certainly the case for the earlier ESXi 4.0.0 release.
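The supported route is the “Restart Management Network” option in the host’s local console menu. From Tech Support Mode the management agents can also be bounced with services.sh, which achieves a similar effect for a stale hostd/vpxa; note this briefly drops the vSphere Client connection. A sketch:

    # Restart the ESXi management agents (hostd, vpxa, etc.)
    /sbin/services.sh restart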
Upon removing the second vSwitch, my ESXi host lost its connection to the iSCSI datastore, and with it the vCenter Server VM’s hard disk. Ordinarily this would not be a problem, since in a clustered environment the other ESXi host would restart the guest; however, the network configuration was mid-change and did not match on both hosts in the cluster. Where I’m from this is called “properly breaking it”, but it’s where the real learning happens. Let it happen in your back bedroom, though, and not on the datacentre floor.

To recover from the situation I first attempted to shut down the VM using the unsupported console (covered in an earlier post), which the ESXi host insisted was still powered on. It did not want to power off, power on, or reset. In fact the ESXi host didn’t want to reboot either, so it got a hard reset in the form of me pushing and holding the power button, while wondering whether I’d have to build an out-of-band VM to install the vSphere Client on just so I could complete the network configuration of the ESXi host.
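For reference, the power operations I was attempting from the unsupported console go roughly like this with vim-cmd; the VM ID is whatever getallvms reports for the guest, and when a guest is truly hung even these can fail, as happened here.

    # Find the VM ID of the stuck guest
    vim-cmd vmsvc/getallvms

    # Check what the host thinks the power state is
    vim-cmd vmsvc/power.getstate <vmid>

    # Attempt a power-off (power.on / power.reset are the counterparts)
    vim-cmd vmsvc/power.off <vmid>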
After the reboot I noticed that the cluster had restarted the guests, including the vCenter Server, on the other healthy host, which I thought was pretty impressive since it saved me a bunch of hassle. This let me carry on reconfiguring the new vMotion VMkernel port on the bonded NICs. A quick check over suggested everything was consistent, except that I’d lost visibility of the iSCSI target. A quick rescan and it re-appeared, and a successful vMotion of my DLNA server in mid-flight proved it was all healthy again. I’ll see how well it behaves unattended over the next few weeks and months. I’d still like to know how I might force a rescan of the iSCSI storage adapter for the target where the datastore resides from the unsupported console, though. I wouldn’t be surprised to find out it can’t be done, in which case I’d find myself installing the vSphere Client on another machine and doing it through the GUI.
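If it can be done, I’d expect it to look something like the following from the console: rescan the software iSCSI adapter (vmhba33 is just the adapter name on my host) and then refresh the VMFS volumes so the datastore re-appears. Treat this as an untested sketch rather than something I’ve confirmed is equivalent to the GUI rescan.

    # Rescan the software iSCSI adapter for new targets/LUNs
    esxcfg-rescan vmhba33

    # Refresh the VMFS volumes so the datastore re-appears
    vmkfstools -V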