Teaming Management NICs

A VMware ESXi host with multiple NICs can be configured in a multitude of ways, depending on the number of NICs on board.

My lab hypervisors only have two, but that is enough to present a choice: either split management from vMotion and iSCSI traffic, or team the two NICs and put all VMkernel ports, storage adapters and management traffic over a common active-active bonded link.

The lab environment has been running flawlessly for months with a physical split between the management and vMotion/iSCSI networks, so I thought I’d configure the “alternative” scenario and let that run to see how things go.

One thing to look out for when reconfiguring the networking on the ESXi hosts (apart from making sure all the VMkernel port names match perfectly, as before) is that both physical NICs are active afterwards.

 

Note one nic is in standby.

One of my hosts did this automatically, the other one didn’t.  That left one host running everything over a single NIC, without the bandwidth benefit of both.  This is definitely not recommended, although in testing vMotion was still rapid, most likely because very little else was going on.

This would not be the case in a production environment, and I’d certainly recommend migrating all your guests off any host that is being reconfigured and putting it into maintenance mode.  I didn’t do either of these things, but that said, the whole point is to push my lab to breaking point and document the experience, which is what happened.  More on that later.

 

Click on Move Up to make the second vmnic active.

With both nics active, you should see the following…

…both nics become active.

This change may also require you to connect to the local console of each ESXi host and manually restart the management network.  This was certainly the case on the earlier ESXi 4.0.0.

Upon removing the second vSwitch, my ESXi host lost its connection to the iSCSI datastore and, with it, the VirtualCentre VM’s hard disk.  Ordinarily this would not be a problem, since in a clustered environment the other ESXi host would restart the guest.  However, the network configuration was in mid-change and so did not match on both hosts in the cluster.  This is called “proper breaking it” where I’m from, but it is where the real learning happens.  Let it be in your back bedroom though, and not on the datacentre floor.

To recover from the situation I first attempted to shut down the VM using the unsupported console (covered in an earlier post); the ESXi host said it was still powered on.  It did not want to power off, or power on, or reset.  In fact the ESXi host didn’t want to reboot either, so it got a hard reset in the form of me holding in the power button and wondering if I’d have to build an out-of-band VM to install vSphere Client on so that I could complete the network configuration of the ESXi host.

After the reboot, I noticed that the cluster had restarted the guests, including the VirtualCentre server, on the other healthy host, which I thought was pretty impressive since it saved me a bunch of hassle.  This enabled me to continue reconfiguring the new vMotion VMkernel port on the bonded NICs.  A quick check over suggested everything was consistent, except that I’d lost visibility of the iSCSI target.  A quick rescan and it re-appeared, and a successful vMotion of my DLNA server in mid-flight proved it was all healthy again.  I’ll see how well it behaves unattended over the next few weeks / months.  I’d still like to know how I might force a rescan of the virtual iSCSI storage adapter for the iSCSI target where the datastore resides from the unsupported console.  I wouldn’t be surprised to find out it can’t be done, in which case I’d find myself installing vSphere Client on another machine and doing it using the GUI.
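
On the rescan question, a possible answer I have not tried on this lab: ESX/ESXi 4.x ships an esxcfg-rescan command that takes an adapter name, and esxcfg-scsidevs -a lists the adapters.  The sketch below wraps it in a small function; the adapter name vmhba33 is only an assumption, so substitute whatever your software iSCSI adapter is called.

```shell
# Rescan one storage adapter from the unsupported console.
# Falls back to printing the command when esxcfg-rescan is not on the PATH,
# so the function can be tried safely on a machine that is not an ESXi host.
rescan_adapter() {
    adapter="$1"   # e.g. vmhba33 (hypothetical name, check yours first)
    if command -v esxcfg-rescan >/dev/null 2>&1; then
        esxcfg-rescan "$adapter"
    else
        echo "esxcfg-rescan not found, would run: esxcfg-rescan $adapter"
    fi
}
```

Call it as rescan_adapter vmhba33 once you have confirmed the adapter name.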


Setting a Round-Robin Fibre Channel Path Policy on ESXi

If your datastores are on Fibre Channel storage and you have multiple FC HBAs connected through to the SAN via an FC switch or two, then it is prudent to make full use of the IO potential of all that redundant hardware by changing the FC path policy to “round robin”.

Using vsphere client, connect to the ESXi host / vcentre server

Inventory, Hosts and Clusters, select ESXi host

Configuration tab, Storage, highlight the datastore, click Properties

Click Manage Paths button on the DataStore properties dialog

Change Path Selection to Round Robin (VMWARE) – the default is Most Recently Used (VMWARE)

Wait for the screen to reload, Click Close

Repeat for each DataStore, and then Repeat for each ESXi host.
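
If there are a lot of datastores, the same change can probably be scripted from the host console.  This is a sketch only: it prints one command per device ID for you to review rather than running anything, it assumes the ESXi 5.x style esxcli storage nmp namespace (on 4.x the equivalent is, as far as I recall, esxcli nmp device setpolicy), and the naa ID in the usage line is made up.

```shell
# Read device IDs (one per line) on stdin and print, without running,
# the esxcli command that would set Round Robin on each of them.
rr_commands() {
    while read -r dev; do
        if [ -n "$dev" ]; then
            echo "esxcli storage nmp device set --device $dev --psp VMW_PSP_RR"
        fi
    done
}

# Example with a made-up device ID
printf 'naa.0123456789abcdef\n' | rr_commands
```

Paste the printed lines into the console once you are happy with them.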

 


vi Reference

Anybody can google the answer right?  Correct.  However, not everybody can then apply the solution – especially if it involves editing text files from the command line.  Cue The vi Editor.

Before you attempt to modify a file with vi, take a copy of it so you have something to fall back on when you 1. get it horribly wrong, then 2. subconsciously quit with :wq!, writing your wrongs back to disk.  D’oh!

 

Navigation

Basic editing                                                   

Esc       Switch to Command Mode

a          Append after cursor

i           Insert before cursor

R          Overtype

u          Undo (maintains history)

x           Delete character under cursor

O          Open a new line above the cursor (o opens one below)

 

Display settings

:set ic               turn search case sensitivity off

:set noic            turn search case sensitivity on

:set nu              turn line numbering on

:set nonu           turn off line numbers

 

Cut, Copy and Paste                                      

dw        Cut whole word

dd         Cut whole line

cw        Change word

4dd       Cut four lines

d4w      Cut four words

yy         Yank (Copy) whole line

y$         Yank from cursor to end of line

y3w      Yank three words

3yy       Yank three lines

p          Paste after cursor

cc         Change whole line

c4l        Change next 4 chars

c4w      Change next 4 words

c$         Change from cursor to end of line

c0         Change from cursor to beginning of line

 

Searching and Replacing                                

/word    find “word” (forwards)

?word   find “word” (backwards)

n          goto next match of “word”

N          goto previous match of word

:s/dog/cat/gi                             find and replace all dogs with cats on this line only, ignoring case

:%s/dog/cat/g                           find dog and replace with cat on all lines (globally).

:g/mywrod/s//myword/g               find ‘mywrod’ on any line and replace it with ‘myword’

:g/matt/s/fooobar/foobar/g         find ‘matt’ and replace ‘fooobar’ with ‘foobar’ on those lines.
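
If you want to try these substitutions without opening a file, sed uses the same s/old/new/flags syntax.  A minimal sketch, assuming GNU sed (the I flag for ignoring case is a GNU extension) and some made-up sample text.  Note that sed works on every line, so these correspond to the :%s and :g forms rather than the single-line :s.

```shell
# Made-up sample text to practise on
sample='Dog chases dog
matt has a fooobar
bob has a fooobar'

# Like :%s/dog/cat/gi  (every line, every match, ignoring case)
printf '%s\n' "$sample" | sed 's/dog/cat/gI'

# Like :g/matt/s/fooobar/foobar/g  (substitute only on lines containing "matt")
printf '%s\n' "$sample" | sed '/matt/s/fooobar/foobar/g'
```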

 

Saving, Loading and Quitting

Note: hit Esc to enter Command Mode first…

:w        save with current filename

:wq       save and quit

:q         quit

:q!        quit without saving changes

:wq!      forcibly write and quit

:r <filename>    read the contents of <filename> in below the current line

 

Setting up vi

On UNIX, edit the .exrc file in your home dir and add a line such as…  set smd showmatch ic wrapmargin=0 report=1

If your Linux system uses vim instead of vi, then edit .vimrc, not .exrc, to get the same result, though in vim it’s probably already set up nicely to start with.

Add syn on to .vimrc to turn syntax highlighting on (nice).  Also add set cindent, set autoindent and set nu for indentation and line numbering if you want that too.

 


Automatic startup of ESXi Guests

It’s not immediately obvious where you configure VMs to start up automatically when the ESXi host starts.

In Hosts and Clusters view (in Virtual Centre), click on the ESXi host – that’s HOST, not GUEST, i.e. the machine running vmware esxi, not the virtual machine itself.

On the Configuration tab, in Software settings, select Virtual Machine Startup/Shutdown.

By default, automatic startup of virtual machines is disabled, so you need to enable it before you can move the VMs upwards into Automatic Startup.  Click Properties in the top left hand corner and tick “Allow Virtual Machines to start and stop automatically with the system”.

Select your DC and/or VC vm and move it all the way up so it exists in the Automatic Startup section of the Startup Order Dialog box.

Apply a delay if you want to, but don’t choose a value less than 90 seconds.


Resetting Cisco UCS KVM

From experience, it’s not uncommon to be unable to connect to the KVM of a Cisco UCS blade.  Instead of seeing a remote console screen, you’ll receive a “Connect failed” or “Request Shared Session” message, with no means of getting to the console.


Within the Service Profile, click on the Server Details tab. From there, click on Recover Server. Select “Reset CIMC (Server Controller)”.  Choose Reset KVM Controller.  This will kill existing KVM sessions and allow you to start a new session. Resetting the CIMC does not affect data traffic to/from the server NICs (ethernet and HBAs).

Another thing to check, in the Servers tab, General tab, is the Management IP Address setting.  If it’s configured to take an address from a pool, check the pool in the Admin tab, Management IP Pool, IP Addresses tab to see what IPs exist in the range, and what’s been assigned.

If a reset hasn’t worked, in the Servers tab, General tab, Management IP Address section, change the IP address from Pooled to Static.  Use an IP address from the other end of the range in the pool.  Click Save Changes, and try connecting to the KVM again.


Hardening VMware Guests (VMs)

The guest VM needs to be shut down.

Remove any superfluous hardware such as CD-ROM drives, floppy drives and USB controllers.

In the Inventory panel, right click the virtual machine, Settings, Options, Advanced, General

Click Configuration Parameters button

Add the following lines to the guest’s vmx file…

isolation.tools.diskWiper.disable=true

isolation.tools.diskShrink.disable=true

isolation.device.connectable.disable=true

isolation.device.edit.disable=true

log.rotateSize=1000000

log.keepOld=10

remoteDisplay.maxConnections=1
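
If you would rather edit the vmx file directly from a console while the guest is shut down, something like the helper below could append each setting only when it is not already present.  It is a sketch: the function name and the path in the usage line are made up, and it writes the key = "value" form that vmx files normally use.

```shell
# Append a key = "value" line to a .vmx file unless the key is already there.
# $1 = path to the .vmx file, $2 = key, $3 = value
set_vmx_option() {
    vmx="$1"; key="$2"; value="$3"
    if grep -qF "$key" "$vmx"; then
        echo "already set: $key"
    else
        printf '%s = "%s"\n' "$key" "$value" >> "$vmx"
    fi
}
```

For example: set_vmx_option /vmfs/volumes/datastore1/myvm/myvm.vmx isolation.tools.diskWiper.disable true (the path is hypothetical, and take a copy of the vmx first).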

 


Xubuntu 64 bit vs Crunchbang 64 bit

My recent purchase of a Lenovo IdeaCentre Q180 has proved to be interesting.  Hey, Lenovo, your choice of a Radeon graphics chipset was a poor one.  I think.

64 bit Linux has always interested me.  The real UNIXes like HPUX and AIX are 64 bit and rock solid number crunching beasts, so the prospect of running 64bit UNIX* for free* on my every day machine (without the cost of purchasing a Mac) has always interested me.  The trouble is, 64bit Linux has a checkered past on the desktop with showstopping issues being graphics driver support, flash plugin, and support for scanning and printing, i.e. all the things that a 64bit UNIX number crunching server would never have to worry about.

Despite this, I figured things must have moved on a bit by now, especially with so many inexpensive 64bit CPUs gracing the systemboards of most modern machines, so I’d give it another go.  The first thing to go was the 32bit installation of Xubuntu 11.10 on my 11.6″ Dell Inspiron 11z laptop (a trusty servant and a faultless OS), in favour of trialling 64bit Crunchbang Statler, and not the BPM (backported modules) one on Linux kernel 3, but the more stable, stoical 2.6 Linux kernel.  This is 64 bit desktop OS territory so stability is important, and buggy, bleeding edge software modules are not welcome here.  Not on my machine anyway.

The Lenovo Ideacentre Q180 didn’t come with an OS – an attractive proposition, not paying for an unwanted Microsoft license, and one which helped seal the deal if I’m honest.  I installed 32bit Xubuntu with all my usual post-install customisations, i.e. adding the Medibuntu repository, installing recommended hardware drivers and a full apt-get update && apt-get upgrade and reboot, and finally adding Ad Block Plus and DownThemAll plugins to firefox, and installing Ubuntu One and Dropbox to re-sync all my important stuff stored in the cloud.  After that, for me, apps are just apt-get installed on demand as and when I need them, such as the wonderfully convenient gscan2pdf for scanning receipts and saving them as a pdf in the cloud for safe keeping.  I’m not spending hours trying to think of all the software I need and installing it before I need it.  Life’s too short.  Go do something else instead.  If I want to rip and re-encode a DVD to divx, I’ll just apt-get install dvdrip rar libdvdcss2 as and when I need to.  Not that I’d ever want to do that of course.  I digress.

Once I’d verified that xubuntu 32 ran OK on the hardware, I blew it away in favour of trialling 64bit Xubuntu.  My initial tests with Crunchbang 64 on the laptop were proving to be very very successful indeed.  It’s been on there a couple weeks now, and I have no intention of replacing it anytime soon.  So a win for Crunchbang.  Yay.

I had to download and use unetbootin to create a bootable live USB stick from the downloaded .iso image, since the startup disk creator packaged with the OS just didn’t like booting on the Lenovo.  Not to worry.  unetbootin worked a treat.  Installation went without a hitch and hardware drivers installed etc as per the normal routine detailed above.  I was all set to feel that “new vanilla OS” warm comfortable feeling experienced by a graffiti artist when they spot a white wall, or a surfer when they turn up at an empty break, when all of a sudden the user interface started to play up.  Frown time.

After some research I quickly realised there are issues with Radeon drivers and Linux full stop, let alone on 64 bit Linux, despite ATI’s claim that their driver supports both 32 bit and 64bit Linux.  It also claims to support RedHat and Suse if you look closely enough, which left me wondering about the “automatic” install on Ubuntu I’d just done.

I’ve tried a number of things but the graphics card driver is definitely problematic.  So it’s going to be Crunchbang 64 on the off chance it’s OK, with a post-install of XFCE for a slightly more user friendly experience (seeing as how Gnome has gone right off the rails since 3.0).  Failing that, I’ll be going back to the more tried and true 32 bit distributions in the hope that the graphics driver behaves itself better.  Reliability is king.  Having a 640 horsepower supercar is no good if it breaks down.   You’re better off with a 320 horsepower Evo.


Linux Broadcom Wifi problems

I’ve seen a few issues lately with some of the more modern Linux distros when connecting to wireless networks with a Broadcom wifi adapter (usually one with built-in bluetooth).

I couldn’t fix it on xubuntu 12.04 (slow transfer speeds) and now I think I know why, so I’ll have to go back and check.

On Crunchbang Statler 64bit, this seems to have worked…

Note my kernel version, model of wi-fi card and the two commands at the bottom that actually did the magic – removing the b43 module from the kernel and then reloading it with pio enabled and qos disabled.
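
The screenshot is not reproduced here, so going only by the description above, the two commands would look something like this.  It is a sketch: the function name and the DRY_RUN switch are mine, added so that you can see what would be run before touching the driver.

```shell
# Unload the b43 driver, then load it again with PIO enabled and QoS disabled.
# Set DRY_RUN=1 to print the commands instead of running them.
reload_b43() {
    run="sudo"
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo sudo"
    fi
    $run modprobe -r b43
    $run modprobe b43 pio=1 qos=0
}
```

Try DRY_RUN=1 reload_b43 first, then reload_b43 for real.  Expect the wifi connection to drop while the module is unloaded.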

Now I can connect to the pub’s wifi and, well, blog this.  🙂

If it works for you too, you can make the changes permanent like this…

sudo touch /etc/modprobe.d/b43.conf 

echo "options b43 pio=1 qos=0" | sudo tee -a /etc/modprobe.d/b43.conf

Networking on Red Hat Enterprise Linux

The following post is an attempt at covering Linux Network Configuration end-to-end to a “bit better than reasonable level”.  The brevity of the post is by design since it is the sort of post that is mostly referred to as a reference or quick lookup guide to remind me, and others, of the name of that file, or that command that does…

As much as I love UNIX and Linux, where everything is a command or a file, the downside is that you need a certain amount of knowledge up front (largely alleviated by Google these days), and the command line is not that intuitive, even with the help of man pages.

Sometimes you just need to look something up that you know you’ve done before, but it was a few months ago or a year or two ago and you just need that post to point you back in the right direction.

 

You can configure a NIC on the fly with

ifconfig eth0 ip-address netmask subnet-mask

The permanent configuration, which is read at boot time or whenever /etc/init.d/network restart is run, is held in /etc/sysconfig/network-scripts/ifcfg-eth0 (and ifcfg-eth1 and so on for the other interfaces).

If you need to write a config file from scratch, use this as a template/guide

DEVICE=eth0

BOOTPROTO=static

IPADDR=ip-address

NETMASK=subnet-mask

HWADDR=pre-populated-MAC-address

ONBOOT=yes

USERCTL=no

MTU=1500

TYPE=Ethernet

ETHTOOL_OPTS=""

When you’re done, restart networking

/etc/init.d/network restart

and check the interfaces all come up.  If not, recheck the ifcfg-eth* files in /etc/sysconfig/network-scripts, paying attention to the ONBOOT=yes line.
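
If you find yourself writing these files often, the template can be wrapped in a small function.  A minimal sketch: the function name is made up, the address in the usage line comes from a documentation range, it leaves out HWADDR (which is specific to each machine), and it writes to a directory you choose rather than straight into /etc so that you can inspect the result first.

```shell
# Write a static ifcfg file for one interface.
# $1 = device, $2 = IP address, $3 = netmask, $4 = output directory (default: current dir)
write_ifcfg() {
    dev="$1"; ip="$2"; mask="$3"; dir="${4:-.}"
    cat > "$dir/ifcfg-$dev" <<EOF
DEVICE=$dev
BOOTPROTO=static
IPADDR=$ip
NETMASK=$mask
ONBOOT=yes
USERCTL=no
EOF
}
```

For example: write_ifcfg eth0 192.0.2.10 255.255.255.0 /tmp, then review /tmp/ifcfg-eth0 before copying it into /etc/sysconfig/network-scripts.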

To test which of your physical nics corresponds to the linux os network device, disconnect a cable and use

ethtool eth0

paying attention to the bottom line, which reads “Link detected: yes” or “Link detected: no”

If there is a PCI NIC in the system, RHEL may assign its ports eth0 and eth1, taking priority over the embedded NICs on the system board.  This is generally not what you expect if you’re new to it.

Check all network configurations with

ifconfig -a | less

Check the DNS addresses are populated in /etc/resolv.conf and perform an nslookup to verify network connectivity, as ping packets are often dropped by firewalls.

Setting a default gateway

You can configure a default gateway in /etc/sysconfig/network

e.g. Add the line

GATEWAY=<ip-of-default-router>

Speed and Duplex setting can be viewed using

ethtool eth1

and

dmesg | grep -i duplex

or using mii-tool

Display all active TCP ports along with process ID and name using the port

netstat -atp

Display routing table in numeric form

netstat -rn

Display all netstat statistics

netstat -as

List open files that are network related

lsof -i

MAC Address to Device listing

arp -v

Look for connected interfaces (“Link detected: yes”)

ethtool eth0

Display run levels where networking starts

chkconfig --list network

Display network status

/etc/init.d/network status   or  /sbin/service network status

Display all network device configuration

ifconfig -a

Useful files where networking configuration is stored

/etc/hosts       -will override other forms of name resolution, depending on the order set in /etc/nsswitch.conf

/etc/resolv.conf       -contains the IP addresses of DNS servers used for name resolution in TCP/IP networks.

/etc/nsswitch.conf       -controls the order in which names are resolved to IP addresses, i.e. files, nis, dns

/etc/sysconfig/network-scripts/ifcfg-eth0

Display interfaces and metrics

netstat -i

Create an SSH tunnel from a local port (use 1025 or higher) to port 2381 (hpsmh) on a remote host

ssh -f username@ip_address -L 1025:ip_address:2381 -N

i.e. browsing to https://localhost:1025 is then the same as browsing to https://remotehost:2381 (hpsmh serves HTTPS on port 2381)

Troubleshooting a NIC

Below is an example of a busy backup network interface on a backup server.  Note how it’s dropping packets etc.

eth4      Link encap:Ethernet  HWaddr 10:1F:74:8B:8F:8X

          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1

          RX packets:22053199483 errors:40041 dropped:18775 overruns:46 frame:0

          TX packets:8811133044 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:31314447740529 (28.4 TiB)  TX bytes:6356693939792 (5.7 TiB)

          Memory:fbec0000-fbee0000
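
To put counters like these in proportion, it helps to express them as a percentage of packets received.  The function below is a rough sketch that assumes the older ifconfig layout shown above (newer net-tools releases print the RX counters differently), and the function name is made up.

```shell
# Read old-style ifconfig output on stdin and print one RX counter
# as a percentage of RX packets.
# $1 = counter name: errors, dropped, overruns or frame
rx_pct() {
    awk -v want="$1" '/RX packets:/ {
        for (i = 1; i <= NF; i++) {
            split($i, kv, ":")
            if (kv[1] == "packets") packets = kv[2]
            if (kv[1] == want) count = kv[2]
        }
        if (packets > 0) printf "%.6f\n", count / packets * 100
    }'
}
```

Use it as ifconfig eth4 | rx_pct errors, or with dropped in place of errors.  For the interface above the error rate works out at roughly 0.0002% of received packets, which looks far less alarming than the raw count.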

 

Possible Causes of Ethernet Errors

Collisions: Signifies when the NIC card detects itself and another server on the LAN attempting data transmissions at the same time. Collisions can be expected as a normal part of Ethernet operation and are typically below 0.1% of all frames sent. Higher error rates are likely to be caused by faulty NIC cards or poorly terminated cables.

Single Collisions: The Ethernet frame went through after only one collision

Multiple Collisions: The NIC had to attempt multiple times before successfully sending the frame due to collisions.

CRC Errors: Frames were sent but were corrupted in transit. The presence of CRC errors, but not many collisions usually is an indication of electrical noise. Make sure that you are using the correct type of cable, that the cabling is undamaged and that the connectors are securely fastened.

Frame Errors: An incorrect CRC and a non-integer number of bytes are received. This is usually the result of collisions or a bad Ethernet device.

FIFO and Overrun Errors: The number of times that the NIC was unable to hand data to its memory buffers because the data rate exceeded the capabilities of the hardware. This is usually a sign of excessive traffic.

Length Errors: The received frame length was less than or exceeded the Ethernet standard. This is most frequently due to incompatible duplex settings.

Carrier Errors: Errors are caused by the NIC card losing its link connection to the hub or switch. Check for faulty cabling or faulty interfaces on the NIC and networking equipment.

 


Booting an in-band VirtualCentre Server VM from the ESXi console

If your VirtualCentre server is itself a VM, then it’ll be running on an ESXi host.  In the event that the ESXi host is restarted without vMotioning the VirtualCentre server first (such as when the management network is irrecoverably unresponsive), then depending on your environment, you may not be able to get a remote connection to the VM after the host has restarted.  In this scenario, you’d need to be able to boot the VM from the unsupported console.  This is how to do it.

Connect to the iLO or equivalent management interface of the ESXi host, send an Alt-F1 and type unsupported followed by the root password to obtain a prompt on the unsupported console.

Identify the VMs resident on the host

vim-cmd vmsvc/getallvms

Identify the current power state of the vm running virtual centre

vim-cmd vmsvc/power.getstate ##           where ## is the number of the vm identified above

Power on the vm

vim-cmd vmsvc/power.on ##                       where ## is the number of the vm identified above
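
If the host carries a lot of VMs, picking the number out of the list by eye gets tedious.  The helper below is a sketch that assumes the getallvms listing has a header row followed by one VM per line, with the Vmid in the first column and the Name in the second, and that the VM name contains no spaces.  The function name and the VM name vcenter01 are made up.

```shell
# Read `vim-cmd vmsvc/getallvms` output on stdin and print the Vmid
# of the VM whose Name column equals $1 (names with spaces will not match).
vmid_for_name() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}
```

On the host that becomes id=$(vim-cmd vmsvc/getallvms | vmid_for_name vcenter01), followed by vim-cmd vmsvc/power.on "$id".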
