Troubleshooting Openfiler (missing NFS shares)

I came home on Friday evening to find my DLNA server wasn’t available :(.  It’s not the scenario I needed after an intense few days squeezing 5 days’ worth of work into a 4-day week due to the Easter bank holiday weekend, plus the 3-hour drive home.

Firstly, my DLNA server is simply Serviio running on a Xubuntu VM which mounts an NFS share containing my media files.

The virtual infrastructure in my lab that underpins it is a two node ESXi cluster with a third node running Openfiler to provide the shared storage to ESXi.  This includes a RAID 0 (not recommended, I might add) iSCSI target for maximum IO within a constrained home budget, and a 1TB USB HDD containing an NFS datastore where I store my ISOs and VM backups.  That saves space on the relatively expensive, high performance iSCSI target intended for the VMs’ disk files, which are also thinly provisioned to save further space.  The Openfiler NAS also has a second 1TB USB HDD containing a second NFS share, the media store, mounted by the Serviio/Xubuntu VM already mentioned (as well as any other machine on the network).  The network is an 8 port, 1Gb/s managed switch with two VLANs and two networks: one which joins the rest of the LAN, and one which just carries VMotion and iSCSI traffic.

 

So, like I said, my Serviio DLNA server was unavailable and some troubleshooting was in order.

My first reaction was that something was wrong in VMware land, but this turned out not to be the case.  However, the storage configuration tab revealed that the NFS datastores were not available, and df -h on my workstation confirmed it, so almost immediately my attention switched from VMware to Openfiler.

Now, I won’t go into it too much here, but I’m torn with Openfiler.  The trouble is most folks would only ever interface with the web-based GUI, and they’d quickly come unstuck: whether or not you run conary updateall to install all the latest updates, certain changes don’t seem to get written back.  I had to perform all my LVM configuration manually at the command line as root, not via the web GUI as openfiler.  I’ve yet to investigate this any further as it’s now working OK for me, but my guess would be a permissions issue.

I connected to the Openfiler web interface and could see that the shared folders (shown below) were missing, so the NFS shares were not being shared but more importantly it also implied that the logical volumes containing the filesystems exported via NFS were not mounted.  df -h on Openfiler’s command line interface confirmed this.

In order to check that Openfiler could see the hard drives at all, I issued the command fdisk -l.  The USB HDDs are LVM physical volumes with gpt partition tables on them, not msdos, so fdisk does not support them, but is kind enough to recommend using GNU Parted instead.  Despite the recommendation, I used lshw > /tmp/allhardware and just used vi to go looking for the hard drive information.  The USB HDDs are Western Digital, so I just searched with :/WD to find them amongst the reams of hardware information, and find them I did.  Great, so the OS could see the disks, but they weren’t mounted.  I quickly checked /etc/fstab and sure enough, the devices were in there, but mount -a wasn’t fixing the problem.

Remember I mentioned that the drives had a gpt partition table, and that they were LVM physical volumes?  Well therein lies the problem.  You can’t mount a filesystem on a logical volume if the volume group that it is a part of is not activated.  Had my volume groups deactivated?  Yes, they had.
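One way to answer that question from the command line is lvscan, which prefixes each logical volume with ACTIVE or inactive.  The following is only a sketch that assumes that output layout; the helper name inactive_lvs is made up for this example and is not part of LVM.

```shell
#!/bin/sh
# inactive_lvs: read lvscan output on stdin and print the device path of
# every logical volume reported as inactive.
# Assumes lvscan's usual layout:  inactive '/dev/vg/lv' [size] inherit
# (inactive_lvs is a hypothetical helper name, not an LVM command.)
inactive_lvs() {
    awk '$1 == "inactive" { print $2 }' | tr -d "'"
}

# Typical use, as root on the storage box:
#   lvscan | inactive_lvs
```

If that prints anything, the volume group owning those logical volumes needs activating before mount -a can succeed.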

vgchange -ay /dev/vg_nfs

vgchange -ay /dev/vg_vmware

Now that my volume groups were active, mount -a worked, confirmed by df -h showing that the /dev/mapper/vg_vmware-lv_vmware and /dev/mapper/vg_nfs-lv_nfs block storage devices were now mounted into /mnt/vg_vmware/lv_vmware and /mnt/vg_nfs/lv_nfs respectively.  exportfs -a should reshare the NFS shares provided the details were still in /etc/exports, which they were.  Going back to the Openfiler web-interface, the shares tab now revealed the folders shown in blue (above) and their mount points needed by any NFS clients in order to mount them.  Since the mountpoint details were already in /etc/fstab on my workstation, mount -a re-mounted them into /nfs/nfsds and /nfs/nfsms, and ls -al showed that the files were all there.
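For next time, the server-side part of the fix could be wrapped up roughly as follows.  This is a sketch, not something I ran on the night: the volume group names are the ones from this post, while recover_shares, run and the DRY_RUN switch are hypothetical names added so the sequence can be previewed without touching anything.

```shell
#!/bin/sh
# Sketch: re-activate volume groups, remount everything in /etc/fstab and
# re-export the NFS shares listed in /etc/exports.
# DRY_RUN, run and recover_shares are hypothetical names for this example.
DRY_RUN=${DRY_RUN:-1}      # default to previewing; set DRY_RUN=0 to execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_shares() {
    for vg in "$@"; do
        run vgchange -ay "$vg"     # activate each volume group
    done
    run mount -a                   # mount filesystems listed in /etc/fstab
    run exportfs -a                # re-export everything in /etc/exports
}

recover_shares vg_nfs vg_vmware
```

Run as it stands, it only prints the four commands; as root with DRY_RUN=0 it would execute them in that order.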

rdesktop to my VirtualCenter server, mount -a in the Xubuntu terminal to remount them on the DLNA server, re-run serviio.sh and that’s it.

So that’s how I diagnosed what was wrong and how I fixed it.  Now I just need to investigate the system logs on Openfiler to see why the volume groups deactivated in the first place.  After continuous uptime without issue for 4 months, I must admit that it did come as a surprise.

 


Bulk renaming of files

This command…

for i in * ; do j=`echo "$i" | sed 's#Matt#Cyberfella#g'` ; mv "$i" "$j" ; done

turns a file listing like this…

Matt_001_blah.txt

Matt_002_blah-de-blah.txt

Matt_003_blah.txt

into this…

Cyberfella_001_blah.txt

Cyberfella_002_blah-de-blah.txt

Cyberfella_003_blah.txt

Just change the search and replace strings accordingly to swap a common text string occurring in the filenames to something else (or leave the replace string blank to eradicate it).
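If you do this often, the same idea can be wrapped in a small function that leaves non-matching files alone and refuses to overwrite an existing file.  This is a sketch under a few assumptions: rename_substring is a made-up name, the search string is treated as a sed pattern, and neither string may contain the # delimiter.

```shell
#!/bin/sh
# rename_substring SEARCH REPLACE [DIR]
# Renames every file in DIR (default: current directory) whose name contains
# SEARCH, swapping it for REPLACE. SEARCH is used as a sed pattern, so
# neither string may contain '#'.
# (rename_substring is a hypothetical helper name.)
rename_substring() {
    search=$1
    replace=$2
    dir=${3:-.}
    for path in "$dir"/*; do
        [ -e "$path" ] || continue              # empty directory: nothing to do
        name=${path##*/}
        new=$(printf '%s\n' "$name" | sed "s#$search#$replace#g")
        [ "$name" = "$new" ] && continue        # no match in this name, leave it
        if [ -e "$dir/$new" ]; then
            echo "skipping $name: $new already exists" >&2
            continue
        fi
        mv -- "$path" "$dir/$new"
    done
}

# Example:
#   rename_substring Matt Cyberfella /path/to/files
```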


Monitoring

How long has the system been up?

uptime

What architecture is the system?

arch

Create a list of useful configuration information for this system

sosreport -l       (list plugins)

e.g. bootloader  apache  kernel  hardware  memory  samba etc

Display kernel boot messages

dmesg       (/var/log/dmesg)

Display last logged on users

last -x

Read the system log (syslog)

/var/log/messages

Display all kernel information

uname -a

Display release numbers

/etc/lsb-release.d      /etc/redhat-release

Check SNMP is working if HP Systems Management is installed

snmpwalk -v1 -c community_string localhost enterprises | grep -i dl
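A few of the commands above can be strung together into a quick snapshot.  This is just a sketch: the snapshot function name and the choice of commands are mine, and any command that isn’t installed is silently skipped.

```shell
#!/bin/sh
# snapshot: print the first few lines of output from some of the
# monitoring commands above, skipping any that are not installed.
# (snapshot is a hypothetical helper name.)
snapshot() {
    for cmd in "uptime" "arch" "uname -a" "last -x"; do
        set -- $cmd                          # split into command and arguments
        if command -v "$1" >/dev/null 2>&1; then
            echo "== $cmd =="
            "$@" 2>&1 | head -n 5
        fi
    done
}

snapshot
```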


Logical Volume Management

Logical volume management on Linux is a different way of using available disks (block devices).  It allows for more flexible allocation of space to filesystems stored on one or more disks by grouping disks into volume groups, then creating logical volumes within those volume groups without regard to the boundaries of the underlying disks.  Filesystems are created on the logical volumes, which may use a portion of a disk’s full capacity, or the capacity of more than one disk.  The only limit is the total space in the volume group, which is the sum of the physical space of all the block storage devices in it.

You can add more devices to an existing volume group and even take them away (be careful!).  Volume groups, Logical Volumes in them, and the Filesystems on them can also be extended, resized or removed as necessary.

Scan for block devices that can be used for Logical Volume Management

lvmdiskscan

Display current physical volumes and their LVM Status

pvdisplay

Use block device /dev/hdb for LVM

pvcreate /dev/hdb

This command also takes multiple device names in one go, separated by spaces.

Create a volume group consisting of the physical volumes

vgcreate VolGroup01 /dev/hdb

Extend an existing volume group onto the block device just added

vgextend VolGroup00 /dev/hdb

Read the man page on vgextend for all available options.

Create a logical volume within the volume group

lvcreate -L 25G -n LogVol00 VolGroup01

Extend existing logical volume into the new unused space in the volume group

lvextend -L +25G /dev/VolGroup00/LogVol00

Expand the filesystem into the new free space in the logical volume.

resize2fs /dev/VolGroup00/LogVol00

Verify the new size using df

df -h
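Put together, growing an existing filesystem onto a newly added disk is a fixed sequence of those commands.  The sketch below only prints the sequence so it can be checked before running anything; grow_plan is a made-up helper name, and it assumes an ext2/3/4 filesystem (hence resize2fs).

```shell
#!/bin/sh
# grow_plan DEVICE VG LV SIZE
# Prints, without running them, the commands that add DEVICE to volume
# group VG and grow logical volume LV and its ext2/3/4 filesystem by SIZE.
# (grow_plan is a hypothetical helper; it only echoes the steps.)
grow_plan() {
    device=$1 vg=$2 lv=$3 size=$4
    echo "pvcreate $device"                     # mark the disk for LVM use
    echo "vgextend $vg $device"                 # add it to the volume group
    echo "lvextend -L +$size /dev/$vg/$lv"      # grow the logical volume
    echo "resize2fs /dev/$vg/$lv"               # grow the filesystem to fill it
    echo "df -h"                                # verify the new size
}

grow_plan /dev/hdb VolGroup00 LogVol00 25G
```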

Display volume group information

vgdisplay VolGroup01

Display physical volume information

pvdisplay /dev/hdb

Display logical volume information

lvdisplay /dev/VolGroup01/LogVol00

Un-mount the filesystem

umount /filesystem

Check and Fix errors on the filesystem

e2fsck -f /dev/VolGroup01/LogVol00   

or  /dev/mapper/VolGroup01-LogVol00

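Resize a logical volume

lvresize -L new-size /dev/VolGroup01/LogVol00

where new-size is e.g. 1500M or 44G

Resize a filesystem

resize2fs /dev/mapper/VolGroup01-LogVol00

When shrinking, resize the filesystem before the logical volume; when growing, resize the logical volume first.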
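Shrinking is the risky direction: the filesystem has to be made smaller before the logical volume underneath it, otherwise the end of the filesystem gets cut off.  The sketch below prints one possible order of the commands above for an ext2/3/4 filesystem; shrink_plan is a made-up helper name and nothing is executed.

```shell
#!/bin/sh
# shrink_plan MOUNTPOINT DEVICE SIZE
# Prints, without running them, the steps to shrink an ext2/3/4 filesystem
# and its logical volume to SIZE, filesystem first.
# (shrink_plan is a hypothetical helper; it only echoes the steps.)
shrink_plan() {
    mountpoint=$1 dev=$2 size=$3
    echo "umount $mountpoint"            # the filesystem must be offline
    echo "e2fsck -f $dev"                # resize2fs wants a clean check first
    echo "resize2fs $dev $size"          # shrink the filesystem
    echo "lvresize -L $size $dev"        # then shrink the logical volume
    echo "resize2fs $dev"                # let the filesystem fill the LV exactly
    echo "mount $dev $mountpoint"
}

shrink_plan /filesystem /dev/VolGroup01/LogVol00 44G
```

Take a backup before trying this for real.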


Disks and Partitions

List scsi devices

lsscsi

Explore the directory containing SCSI information

ls -F /proc/scsi

Display block devices on the system

ls /sys/block

Display disk devices

fdisk -l

Rescan SCSI bus without reboot

echo "- - -" > /sys/class/scsi_host/host#/scan

Determine SCSI host value

ls /sys/class/scsi_host
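Rather than working out the host number by hand, the rescan can be looped over every host.  A sketch: rescan_scsi_hosts is a made-up name, and the optional directory argument exists only so the loop can be tried against a dummy directory first.

```shell
#!/bin/sh
# rescan_scsi_hosts [BASEDIR]
# Writes "- - -" (all channels, targets and LUNs) to the scan file of
# every SCSI host under BASEDIR (default: /sys/class/scsi_host).
# (rescan_scsi_hosts is a hypothetical helper name.)
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for host in "$base"/host*; do
        [ -e "$host/scan" ] || continue      # no hosts, or no scan file
        echo "- - -" > "$host/scan"
        echo "rescanned ${host##*/}"
    done
}

# As root:
#   rescan_scsi_hosts
```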


Process Management Commands

Display load average, top CPU processes (use SHIFT M to sort by memory use).

top

Display a tree of all processes

pstree

Display all processes

ps aux

Customise ps output to your requirements

ps -eo pid,tid,class,rtprio,ni,pri,stat,comm

Find zombied processes

ps -eo pid,stat | grep Z

Find processes waiting for disk

ps -eo pid,stat | grep D
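To see at a glance how many processes are in each state, rather than grepping for one letter at a time, the stat column can be tallied.  A sketch: summarise_states is a made-up name, and it counts only the first character of each stat value (S, R, D, Z and so on), ignoring modifier flags such as s or +.

```shell
#!/bin/sh
# summarise_states: read process state strings (one per line) on stdin
# and print a count per leading state letter, sorted by letter.
# (summarise_states is a hypothetical helper name.)
summarise_states() {
    awk '{ count[substr($1, 1, 1)]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Typical use:
#   ps -eo stat= | summarise_states
```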

Trace execution of process

strace -p PID       (attach to a running process)   or   strace command

Display free memory

free

Display virtual memory status

vmstat

Display inter-process communication status

ipcs -l     (limits)   or   ipcs -a     (all)

Display kernel slab caches sorted by cache size

slabtop -s c


Upgrading VMWare ESXi hosts

So your ESXi environment has a few virtual machines running, and their OSs are all kept up to date, but what about bringing the ESXi host itself up to date?  This is the quickest and easiest way I’ve found of getting the job done.

Download the .zip package from VMWare for your ESXi version.  This will need an internet connection.

e.g.  http://www.vmware.com/download/download.do?downloadGroup=ESXI40U3

If you don’t already have it installed, you can download and install vSphere client by typing the name or IP address of your ESXi host into your web browser.  This will not need an internet connection.

You’ll also need the VSphere CLI, which will need to be downloaded from VMWare.  This will need an internet connection.

http://www.vmware.com/download/download.do?downloadGroup=VCLI41

Should you have any installation issues, you may want to download the .NET Redistributable Package from Microsoft and pre-install that before attempting to install the VMWare products.

http://www.microsoft.com/download/en/details.aspx?id=25150

Once you have vSphere Client and vSphere CLI installed and the .zip package ready,

Connect to the VCenter Server / ESXi host and shutdown or VMotion any running virtual machines.

Place the ESXi host into Maintenance Mode.

Open vSphere CLI.

cd C:\Program Files\vmware\vmware vsphere cli\bin\perl

perl vihostupdate.pl --server esx_host_ip --username root --bundle path_to_zipfile.zip --install

Enter the root password when prompted.

It’ll go quiet for a while, but you can see that something is happening in vSphere Client.  The job will be “In Progress” for around 2 or 3 minutes on modern hardware with 1Gb/s network connectivity.  Only upgrade one host at a time: concurrent upgrades can cause IO errors, which halt the upgrade and leave locked files in /var/update/cache/ that only a restart of the host will clear, costing you time.  This is especially true if you are connected at only 100Mb/s over the network.

When you see the words “Installation Complete” in the vSphere CLI terminal, the upgrade part is complete.  Leave the host in Maintenance Mode for now and reboot it from VSphere Client.

When the host is back up, log on again using VSphere Client, and take it out of Maintenance Mode.

That’s it.  Power up the VMs, VMotion them back from the other hosts in the cluster, or just let DRS take care of it, depending on your environment.

Repeat for each host in the cluster.


Disk Recovery and Forensics

Who doesn’t love the word “Forensics”?  It’s a word that brings out the inner geek in all of us, yet the reality is usually pretty grim – like when your only hard drive containing all your important files and photos fails.

The first thing you should do if you suspect your hard drive is failing or has failed is stop writing to it and, if necessary, hard shut the machine down asap by pushing and holding the power button on your PC.  Any further writes could lunch the drive for good, making recovery impossible.  In other words, STOPPP!!

Anyway, here’s some notes from recent tinkerings with Ubuntu Rescue Remix (Google it, Download it).  It’s a bootable Live CD which boots a computer into a command line only Linux environment, and for the remaining 2% who are still reading, provides you with a good handful of tools that stand you the best chance of recovering data from a failing hard disk.

Assuming you’ve just booted it and your hard disk(s) are attached, the first thing to do is identify which disk corresponds to which device name in /dev.  This can be done using lshw or fdisk -l

lshw > /tmp/hardware

cat /tmp/hardware | less

The next step is to clone the dodgy disk to either another disk, or to an image file or both.  You choose.

ddrescue /dev/sda /dev/sdc

or (restartable clone to an image file)

ddrescue --direct --retrim --max-retries=3 /dev/sda imagefile logfile
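A common pattern with GNU ddrescue is to run two passes against the same logfile: a quick first pass that grabs everything easy and skips the slow work on bad areas (-n), then a second pass that goes back for the difficult areas with direct disc access and a few retries (-d -r3).  The sketch below just prints those two commands; rescue_plan is a made-up helper name, and the short options are used because the long option names have changed between ddrescue versions, so check man ddrescue on your copy.

```shell
#!/bin/sh
# rescue_plan SOURCE IMAGE LOGFILE
# Prints, without running them, a two-pass ddrescue sequence that shares
# one logfile so the second pass resumes where the first left off.
# (rescue_plan is a hypothetical helper; it only echoes the commands.)
rescue_plan() {
    src=$1 image=$2 log=$3
    echo "ddrescue -n $src $image $log"         # pass 1: easy areas only
    echo "ddrescue -d -r3 $src $image $log"     # pass 2: direct access, 3 retries
}

rescue_plan /dev/sda imagefile logfile
```

Because the logfile records what has already been read, either pass can be interrupted and restarted.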

If you’ve cloned to another healthy disk, then you should fsck the cloned partition (e.g. fsck /dev/sdc1) to fix any errors, then attempt to mount it with mount /dev/sdc1 /mnt/mydisk and see if you can read any data on it.  You may be as good as done at this point, with no further need to go on to employing other more targeted tools for recovering data off an unmountable drive.  Failing that, try to stay calm (really – it helps), clone the disk to an imagefile the best you can, then read on.  If you can’t stay calm, then run testdisk and benefit from a more intuitive, menu-driven interface of various recovery options.

testdisk

Or if you’re enjoying this new found challenge of getting the photos back before the missus finds out, read on about using foremost and other similar, powerful recovery commands.

sudo foremost -i imagefile -o /recovery/foremost -w       (list recoverable files only)

sudo foremost -i imagefile -o /recovery/foremost -t jpg           (recover jpg files only)

If you suspect that the partitioning information on the drive is gone, then you can replace it using gpart to guess what the previous partitioning scheme was, based upon what’s on the drive.  This is good if you’re an overzealous techy who blanked the drive to install the latest OS without thinking about who else had an account on the computer and what they may have had stored.  Not good.  Don’t do it again.

sudo gpart /dev/sda

Or instead of using foremost, you could try scalpel.  Like foremost, but configurable and well, a bit better.

vi /etc/scalpel/scalpel.conf     (to configure options)

sudo scalpel imagefile -o /recovery/scalpel/

Or maybe try magicrescue on the cloned disk if there’s multiple file types to be recovered (requires the presence of recipes for the filetypes to be recovered).

/usr/share/magicrescue/recipes

Enable DMA on the cloned disk first to speed things up.

hdparm -d 1 -c 1 -u 1 /dev/hdc

sudo magicrescue  -r gzip -r png  -d /recovery/magicrescue /dev/sdc

If it’s specifically photos you’re wanting to recover, then there are two tools to choose from: photorec and recoverjpeg.

sudo photorec imagefile         (imagefile is the disk imagefile, not an image as in picture)

sudo recoverjpeg /dev/sdc1       (recovers any obvious jpeg files on partition /dev/sdc1)

If the files you want to recover were deleted on the original drive, and the drive has come from a Windows computer and was formatted with NTFS, then you can use ntfsundelete to recover the deleted files.

ntfsundelete -s /dev/sdc1     (scans for inodes of deleted files which can be subsequently recovered)

ntfsundelete /dev/sdc1 -u -i 3689 -o work.doc -d /recovered/ntfsundelete

If you want to recover old files previously written to a disk containing a new FAT filesystem, then you’re into using autopsy and dls, fls, icat and sorter from sleuthkit to create a secondary image of unallocated blocks contained in the image and list the inodes of files apparently contained within them, recover those files and optionally sort them by filetype, respectively.

sudo autopsy -d /media/disk/autopsy 192.168.0.1      (use your local ip address)

dls imagefile > imagefile_deletedblocks        (create secondary, smaller imagefile)

fls imagefile_deletedblocks -r -f fat -i raw      (list inode numbers of any deleted files found)

icat -r -f fat -i raw imagefile_deletedblocks inode_number > myfile.doc    (recover a file)

sudo sorter -h -s -i raw -f fat -d out -C /usr/share/sleuthkit/windows.sort /imagefile

This just touches upon ways you can recover lost data, with a few useful examples, but remember each command in its own right has a multitude of options which can be perused using the man command and reading the accompanying manual.  You can also google man sorter, for example, and read the man page in a web browser.  I hope you get some data back!
