Networker Cheatsheet

Here is a handy cheat sheet for troubleshooting failing backups and recoveries using Dell/EMC Networker.  All content here is taken from real-world experience (and is regularly updated).

Backup Architecture and things to check (at a glance)

Is the backup server running?

Check the uptime and that the daemon log is being written to.

nsrwatch -s backupserver    -Gives a console version of the NMC monitor
cp /nsr/logs/daemon.raw ~/copyofdaemon.raw

nsr_render_log -l ~/copyofdaemon.raw > ~/copyofdaemon.log

tail -10 ~/copyofdaemon.log
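A quick way to confirm the daemon log is still being written to is to check its modification time. The following is a minimal sketch, not part of Networker: log_is_fresh is a hypothetical helper name, and it assumes a find that supports -mmin (GNU and BSD find do).

```shell
#!/bin/bash
# Sketch: succeed (exit 0) if a file exists and was modified in the last N minutes.
# log_is_fresh is a hypothetical helper, not a Networker command.
# Usage: log_is_fresh <file> [minutes]   (minutes defaults to 10)
log_is_fresh() (
  file=$1
  minutes=${2:-10}
  # find prints the path only when the mtime is newer than N minutes ago
  [ -f "$file" ] && [ -n "$(find "$file" -mmin "-${minutes}" 2>/dev/null)" ]
)
```

For example: log_is_fresh /nsr/logs/daemon.raw 10 || echo "daemon.raw has not been written to in the last 10 minutes"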

You may find that mminfo and nsradmin commands are unsuccessful.  The media database may be unavailable and/or you may receive a “program not registered” error, which usually implies that the Networker daemons/services are not running on the server/client.  This can also occur during busy times such as when clone groups are running (even though this busyness is not reflected in the load averages on the backup server).

Client config.

Can you ping the client / resolve the hostname or telnet to 7937?

Are the static routes configured (if necessary)?

Can the client resolve the hostnames of the backup interfaces, and does it have connectivity to them?

Does the backup server appear in the nsr/res/servers file?

Can you run this on the client?

save -d3 -s <backup_server_backupnic_name> /etc

From the backup server (CLI)…

nsradmin -p 390113 -s client

Note:  If the name field is incorrect according to nsradmin (this happens when machines are re-commissioned without being rebuilt), stop nsrexecd, rename the /nsr/nsrladb folder to /nsr/nsrladb.old, restart nsrexecd and, most importantly, delete and recreate the client on the Networker backup server before retrying the following command:

savegrp -vc client_name group_name

Also check that all interface names are in the servers file for all interfaces on all backup servers and storage nodes likely to back the client up.
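The servers file check can be scripted. This is only a sketch under assumptions: missing_from_servers_file is a hypothetical helper, the file is assumed to hold one name per line with no trailing whitespace, and the path to the servers file is passed in rather than hard-coded.

```shell
#!/bin/bash
# Sketch: print every name given as an argument that is NOT listed in the servers file.
# missing_from_servers_file is a hypothetical helper, not a Networker command.
# Usage: missing_from_servers_file <servers_file> <name>...
missing_from_servers_file() (
  file=$1
  shift
  for name in "$@"; do
    # -F fixed string, -x whole line, -i ignore case, -q quiet
    grep -Fxiq -- "$name" "$file" || printf '%s\n' "$name"
  done
)
```

Run it on the client with every backup server and storage node interface name as arguments; no output means every name was found.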

Can you probe the client?

savegrp -pvc client groupname

savegrp -D2 -pc client groupname (more verbose)

Bulk import of clients

Instead of adding clients manually one at a time in the NMC, you can perform an initial bulk import.

nsradmin -i bulk-import-file

where the bulk-import-file contains many lines like this

create type: NSR Client;name:w2k8r2;comment:SOME COMMENT;aliases:w2k8r2,w2k8r2-b,w2k8r2.cyberfella.co.uk;browse policy:Six Weeks;retention policy:Six Weeks;group:zzmb-Realign-1;server network interface:backupsvrb1;storage nodes:storagenode1b1;

Use Excel to build a large CSV, then use Notepad++ to remove the commas.  Be aware that the aliases field itself contains commas, so use an alternative character in Excel to represent them, then replace that character with a comma once all the other commas have been removed from the CSV.
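If you would rather skip the Excel and Notepad++ step, the same file can be generated from a plain CSV in the shell. This is a sketch only: csv_to_nsradmin is a hypothetical helper, the column order is an assumption of mine, aliases are separated by "|" inside the CSV, and quoted CSV fields are not handled. The attribute names are copied from the sample line above.

```shell
#!/bin/bash
# Sketch: turn a simple CSV on stdin into nsradmin "create" lines like the sample above.
# Assumed (hypothetical) column order:
#   name,comment,aliases,browse policy,retention policy,group,server network interface,storage nodes
# Aliases use "|" as their separator in the CSV and are converted to commas on output.
csv_to_nsradmin() {
  awk -F',' '
    NF >= 8 {
      aliases = $3
      gsub(/[|]/, ",", aliases)   # restore the commas inside the aliases field
      printf "create type: NSR Client;name:%s;comment:%s;aliases:%s;browse policy:%s;retention policy:%s;group:%s;server network interface:%s;storage nodes:%s;\n", $1, $2, aliases, $4, $5, $6, $7, $8
    }
  '
}
```

Usage would be along the lines of csv_to_nsradmin < clients.csv > bulk-import-file, where clients.csv is a placeholder name. Check the generated file by eye before feeding it to nsradmin -i.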

Add user to admin list on bu server

nsraddadmin -u "user=username, host=*"

where username is the username without the domain name prefix (the prefix is not necessary).

Reset NMC Password (Windows)

The default administrator password is administrator.  If that doesn’t work, check that the GST service is started using a local system account (it is by default), then in Computer Management, Properties, Advanced Properties, create a System Environment Variable:

GST_RESET_PW=1

Stop and start the GST Service and attempt to log on to the NMC using the default username and password pair above.

When done, set

GST_RESET_PW=<null>

Starting a Backup / Group from the command line

On the backup server itself:  

savegrp -D5 -G <group_name>

If you are just testing a group, add -I to ignore the index save sets.

Just backing up the :index savesets in a group:

savegrp -O -G <group_name>

On a client:

save -s <backup_server_backupnic_name> <path>

Reporting with mminfo

List names of all clients backed up over the last 2 weeks (list all clients)

mminfo -q "savetime>2 weeks ago" -r 'client' | sort | uniq

mminfo -q 'client=client-name, level=full' -r 'client,savetime,ssid,name,totalsize'

In a script with a variable, use double quotes so that the variable gets evaluated.  To sort on the American-format date column…

mminfo -q "client=${clientname},level=full" -r 'client,savetime,ssid,level,volume' | sort -k 2.7,2.10n -k 2.1,2.5n -k 2.4,2.5n
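The positional sort keys above rely on the date starting at a fixed character offset within its column. A less position-sensitive alternative is to build a YYYYMMDD key from the date and sort on that. This is a sketch only: sort_by_us_date is a hypothetical helper, and it assumes savetime is printed as MM/DD/YYYY with a four-digit year.

```shell
#!/bin/bash
# Sketch: sort lines on stdin chronologically by the first MM/DD/YYYY field found.
# sort_by_us_date is a hypothetical helper, not a Networker command.
# Lines with no date field get an empty key and therefore sort first.
sort_by_us_date() {
  awk '{
    key = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^[0-9][0-9]\/[0-9][0-9]\/[0-9][0-9][0-9][0-9]$/) {
        split($i, d, "/")       # d[1]=month, d[2]=day, d[3]=year
        key = d[3] d[1] d[2]    # YYYYMMDD sorts correctly as plain text
        break
      }
    }
    print key "\t" $0           # prepend the key, separated by a tab
  }' | sort -t "$(printf '\t')" -k1,1 | cut -f2-
}
```

It would be used by piping the mminfo report into it, for example mminfo -q "client=${clientname},level=full" -r 'client,savetime,ssid,level,volume' | sort_by_us_date.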

mminfo -ot -c client -q "savetime>2 weeks ago"

mminfo -r "ssid,name,totalsize,savetime(16),volume" -q "client=client_name,savetime >10/01/2012,savetime <10/16/2012"

List the last full backup SSIDs for subsequent use with the recover command (UNIX clients)

mminfo -q 'client=server1,level=full' -r 'client,savetime,ssid'

Is the client configured properly in the NMC? (see diagram above  for hints on what to check in what tabs)

How many files were backed up in each saveset (useful for counting files on a NetApp which is slow using the find command at host level)

sudo mminfo -ot -q 'client=mynetappfiler,level=full,savetime<7 days ago' -r 'name,nfiles'
name                    nfiles

/my_big_volume          894084
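To get a single total across all savesets, the nfiles column can be summed. A minimal sketch, assuming the report layout shown above (a header line, then the save set name followed by nfiles as the last column); total_nfiles is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: sum the last column (nfiles) of an mminfo 'name,nfiles' report on stdin.
# total_nfiles is a hypothetical helper, not a Networker command.
# The header and blank lines are skipped because their last field is not numeric.
total_nfiles() {
  awk '$NF ~ /^[0-9]+$/ { total += $NF } END { print total + 0 }'
}
```

For example, append | total_nfiles to the mminfo command above.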

You should probably make use of the ssflags option in the mminfo report too.  It adds an extra column showing the status of the saveset as one or more of the characters CvrENiRPKIFk, whose meanings are listed below.

C Continued, v valid, r purged, E eligible for recycling, N NDMP generated, i incomplete, R raw, P snapshot, K cover, I in progress, F finished, k checkpoint restart enabled.
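When reading a long report it can help to expand the flags column into words. This is a small sketch using only the meanings listed above; decode_ssflags is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: expand an ssflags string into one "flag meaning" line per character.
# decode_ssflags is a hypothetical helper; the meanings are those listed above.
decode_ssflags() (
  rest=$1
  while [ -n "$rest" ]; do
    c=${rest%"${rest#?}"}   # first character of what is left
    rest=${rest#?}          # drop that character
    case $c in
      C) echo "C continued" ;;
      v) echo "v valid" ;;
      r) echo "r purged" ;;
      E) echo "E eligible for recycling" ;;
      N) echo "N NDMP generated" ;;
      i) echo "i incomplete" ;;
      R) echo "R raw" ;;
      P) echo "P snapshot" ;;
      K) echo "K cover" ;;
      I) echo "I in progress" ;;
      F) echo "F finished" ;;
      k) echo "k checkpoint restart enabled" ;;
      *) echo "$c unknown" ;;
    esac
  done
)
```

For example, decode_ssflags vF prints "v valid" and "F finished" on separate lines.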

Check Client Index

nsrck -L7 clientname

Backing up Virtual Machines using Networker, vCenter and VADP

To back up virtual machine disk files on VMFS volumes at the VMware level (as opposed to file-level backups of the individual VMs), Networker can interface with the vCenter servers to discover which VMs reside on the ESXi clusters they manage, and their locations on the shared VMFS LUN.  For this to work, the shared LUNs also need to be presented/visible to the VADP Proxy (a Windows server with the Networker client and/or server running as a storage node) in the FC switch fabric zone config.

The communication occurs as follows:

The backup server starts a backup group containing VADP clients.

The VADP proxy asks vCenter which physical ESXi host has the VM, and where the files reside on the shared storage LUNs.

The VADP proxy / Networker storage node then tells the ESXi host to maintain a snapshot of the VM while the vmdk files are locked for backup.

The vmdk files are written to the storage device (in my example, a Data Domain dedup device).

When the backup is complete, the client index is updated on the backup server, the changes logged by the snapshot are applied to the now unlocked vmdk, and the snapshot is then deleted on the ESXi host.

Configuring Networker for VADP Backups via a VADP Proxy Storage Node

The VADP Proxy is just a storage node with fibre connectivity to the SAN and access to the ESXi DataStore LUNs.

In Networker, right click Virtualisation, Enable Auto Discovery

Complete the fields, but notice there is an Advanced tab.  This is to be completed as follows…  not necessarily like you’d expect…

Note that the Command Host is the name of the VADP Proxy, NOT the name of the Virtual Center Server.

Finally, Run Auto Discovery.  A map of the infrastructure should build in the Networker GUI

Ensure the vCenter, proxy and Networker servers all have network comms and can resolve each other’s names.

You should now be ready to configure a VADP client.

Configuring a VADP client (Checklist)

GENERAL TAB

IDENTITY
        COMMENT: application_name – VADP
VIRTUALIZATION
        VIRTUAL CLIENT: (TICK)
        PHYSICAL HOST: client_name
BACKUP
        DIRECTIVE: VCB DIRECTIVE
        SAVE SET: *FULL*
        SCHEDULE: Daily Full

APPS AND MODULES TAB

BACKUP
        BACKUP COMMAND: nsrvadp_save -D9
        APPLICATION INFORMATION:
                VADP_HYPERVISOR=fqdn_of_vcenter (hostname in caps)
                VADP_VM_NAME=hostname_of_vm (in caps)
                VADP_TRANSPORT_MODE=san
DEDUPLICATION
        Data Domain Backup
PROXY BACKUP
        VMWare: hostname_of_vadp_proxy:hostname_of_vcenter.fqdn(VADP)

GLOBALS 1 OF 2 TAB
ALIASES
        hostname
        hostname.fqdn
        hostname_backup
        hostname_backup.fqdn
        ip_front
        ip_back

GLOBALS 2 OF 2 TAB
REMOTE ACCESS
        user=svc_vvadpb,host=hostname_vadp_proxy
        user=SYSTEM,host=hostname_vadp_proxy
        *@*

OWNER NOTIFICATION
        /bin/mail -s "client completion : hostname_client" nwmonmail

Recovery using recover on the backup client

sudo recover -s backup_server_backup_interface_name

Once in recover, you can cd into any directory irrespective of permissions on the file system.

Redirected Client Recovery using the command line of the backup server.

Initiate the recover program on the backup server…
sudo recover -s busvr_interface -c client_name -iR -R client_name

or use one of…
-iN (No Overwrite / Discard)
-iY (Overwrite)
-iR (Rename ~ )

Using recover> console

Navigate around the index of recoverable files just like a UNIX filesystem

Recover>    ls    pwd    cd

Change Browsetime
Recover>    changetime yesterday
1 Nov 2012 11:30:00 PM GMT

Show versions of a folder or filename backed up
Recover>      versions     (defaults to current folder)
Recover>    versions myfile

Add a file to be recovered to the “list” of files to be recovered
Recover>    add
Recover>     add myfile

List the marked files in the “list” to be recovered
Recover>    list

Show the names of the volumes where the data resides
Recover>    volumes

Relocate recovered data to another folder
Recover>    relocate /nsr/tmp/myrecoveredfiles

Recover>    relocate "E:\\Recovered_Files"     (for Redirected Windows Client Recovery from Linux Svr)

View the folder where the recovered files will be recovered to
Recover>    destination

Start Recovery
Recover>    recover

SQL Server Recovery (database copy) on a SQL Cluster

First, connect by remote desktop to the cluster name (not a cluster node) and run a command prompt as admin:
nsrsqlrc -s <bkp-server-name> -d MSSQL:CopyOfMyDatabase -A <sql cluster name> -C MyDatabase_Data=R:\MSSQL10_50.MSSQLSERVER\MSSQL\Data\CopyOfMyDatabase.mdf,MyDatabase_log=R:\MSSQL10_50.MSSQLSERVER\MSSQL\Data\CopyOfMyDatabase.ldf MSSQL:MyDatabase

Delete the NSR Peer Information

Follow the steps below to delete the NSR peer information on both the NetWorker server and the client.

On the NetWorker server, delete the NSR peer information for the client/storage node:

1. At the NetWorker server command line, go to the location /nsr/res

2. Type the commands below, specifying the name of the client/storage node in place of client_name:

nsradmin -p nsrexec
print type:nsr peer information; name:client_name
delete
y

On the client/storage node, delete the NSR peer information of the NetWorker server:

1. At the client/storage node command line, go to the location /nsr/res

2. Type the commands:

nsradmin -p nsrexec
print type:nsr peer information
delete
y

VADP Recovery using command line

The prerequisites for a successful VADP restore are that the virtual machine is removed from the Inventory in vCenter (right click the VM, Remove from Inventory), and that the folder containing the virtual machine’s files in the VMware datastore is renamed or removed. If the VM still exists in VMware or in the datastore, VADP will not recover it.

Log onto the backup server over ssh and obtain the save set ID for your VADP “FULLVM” backup.

mminfo -avot -q "name=FULLVM,level=full"

Make a note of the SSID for the vm/backup client (or copy it to the cut/paste buffer)

e.g. 1021210946

Log onto the VADP Proxy (which has SAN connectivity over fibre necessary to recover the files back to the datastore using the san VADP recover mode)

recover.exe -S 1021210946 -o VADP:host=VC_Svr;VADP:transmode=san

Note that if you want to recover a VM to a different vCenter, datastore, ESX host and/or resource pool, you can do that from the recover command too, rather than waiting to do it using the vSphere client.  This is useful if your VM still exists in VMware and you don’t want to overwrite it.  You can additionally specify the VADP:host=, VADP:datacenter=, VADP:resourcepool=, VADP:hostsystem= and VADP:datastore= fields in the recover command, separated by semicolons with no spaces.

I’ve found that whilst the minimal command above may work in some environments, others demand a far more detailed recover.exe command with all VADP parameters set before it’ll communicate with the VC.  A working example is shown below (with each VADP parameter on its own line for readability; you’ll need to put it all on a single line and remove any spaces between the parameters).

recover.exe -S 131958294 -o

VADP:host=vc.fqdn;

VADP:transmode=san;

VADP:datacenter=vmware-datacenter-name;

VADP:hostsystem=esxihost.fqdn;

VADP:displayname=VM_DISPLAYNAME;

VADP:datastore="config=VM_DataStore#Hard disk 2=VM_DataStore_LUN_Name#Hard disk 1=VM_DataStore_LUN_Name";

VADP:user=mydomain\vadp_user;

VADP:password=vadp_password
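Since the parameters must end up on one line, separated by semicolons with no spaces, it can be less error-prone to build the -o value with a small helper and paste the result into the recover.exe command on the proxy. A sketch only: join_vadp_opts is a hypothetical helper and the values shown are the placeholders from the example above.

```shell
#!/bin/bash
# Sketch: join every argument with ";" and no surrounding spaces.
# join_vadp_opts is a hypothetical helper, not a Networker command.
join_vadp_opts() (
  out=""
  for opt in "$@"; do
    out="${out:+$out;}$opt"   # add a ";" only between parameters
  done
  printf '%s\n' "$out"
)
```

For example, join_vadp_opts "VADP:host=vc.fqdn" "VADP:transmode=san" "VADP:datacenter=vmware-datacenter-name" prints the three parameters as a single semicolon-separated string ready to follow -o.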

Creating new DataDomain Devices in Networker

In the Networker Administrator App from the NMC Console, click the Devices button at the top.
Right click Devices in the left hand pane, New Device Wizard

Select Data Domain, Next, Next

Use an existing data domain system
Choose a data domain system in the same physical location as your backup server!
Enter the Data Domain OST username and password

Browse and Select
Create a New Folder in sequence, e.g. D25, tick it.

Highlight the automatically generated Device Name, Copy to clipboard (CTRL-C), Next

Untick Configure Media Pools (label device afterwards using Paste from previous step), Next

Select Storage Node to correspond with device locality from “Use an existing storage node”, Next

Agree to the default SNMP info (unless reconfiguration for custom monitoring environment is required), Next

Configure, Finish

Select new device (unlabelled, Volume name blank), right click, Label

Paste Device Name in clipboard buffer (CTRL-V)
Select Pool to add the Device into, OK.

Slow backups of large amounts of data to DataDomain deduplication device

If you have ridiculously slow backups of large amounts of data, check in the Networker NMC for the name of the storage node (Globals2 tab of the client configuration), then connect to the DataDomain and look under the Data Management, DD Boost screen for “Clients”, of which your storage node will be one.  Check how many CPUs and how much memory it has; the slow one will usually stand out.

Then SSH to the storage node and check which processes are consuming the most CPU and memory.

In this example, despite a storage node being dedicated to backing up a single large application’s data, the fact that it only has 4 CPUs and is scanning every file that ddboost is attempting to deduplicate means that a huge bottleneck is introduced.  This is a typical situation where decommissioned equipment has been re-purposed.

Networker Server

ssh to the networker server and issue the nsrwatch command.  It’s a command line equivalent to connecting to the Enterprise app in the NMC and looking at the monitoring screen.  Useful if you can’t connect to the NMC.

Blank / Empty Monitoring Console

If your NMC is displaying a blank monitoring console, try this before restarting the NMC…

Tick or Un-tick and Re-tick Archive Requests.

Tape Jukebox Operations

ps -ef | grep nsrjb     -It may be necessary to kill off any pending nsrjb processes before new ones will work.

nsrjb -C | grep <volume>    -Identify the slot that contains the tape (volume)

nsrjb -w -S <slot>      -Withdraw the tape in slot <slot>

nsrjb -d       -Deposit all tapes in the cap/load port into empty slots in the jukebox/library.

Note:  If you are removing and replacing tapes, take note of which pools the removed tapes belong to and allocate the new blank tapes deposited into the library to the same pools, so that backups don’t run out of tapes.

Exchange Backups

The application options of the backup client (an Exchange server in DAG1) would be as follows

NSR_SNAP_TYPE=vss

NSR_ALT_PATH=C:\temp

NSR_CHECK_JET_ERRORS=none

NSR_EXCH2010_BACKUP=passive

NSR_EXCH_CHECK=no

NSR_EXCH2010_DAG=GB-DAG1

NSR_EXCH_RETAIN_SNAPSHOTS=no

NSR_DEVICE_INTERFACE=DATA_DOMAIN

NSR_DIRECT_ACCESS=no

Adding a NAS filesystem to backup (using NDMP)

Some pre-reqs on the VNX need to be satisfied before NDMP backups will work.  This is explained here

General tab

The exported fs name can be determined by logging onto the VNX as nasadmin and issuing the following command

server_mountpoint server_2 -list

Apps and Modules tab

Application Options that have worked in testing NDMP Backups.

Leave datadomain unticked in Networker 8.x and ensure you’ve selected a device pool other than default, or Networker may just sit waiting for a tape while you’re wondering why NDMP backups aren’t starting!

HIST=y
UPDATE=y
DIRECT=y
DSA=y
SNAPSURE=y
#OPTIONS=NT
#NSR_DIRECT_ACCESS=NO
#NSR_DEVICE_INTERFACE=DATA_DOMAIN

Backup Command

nsrndmp_save -s backup_svr -c nas_name -M -T vbb -P storage_node_bu_interface 

Omit -P if the backup server also acts as the storage node.

To back up an NDMP client to a non-NDMP device, use the -M option.

The value for the NDMP backup type depends on the type of NDMP host. For example, NetApp, EMC, and Procom all support dump, so the value for the Backup Command attribute is:

nsrndmp_save -T dump

Globals 1 tab

Globals2 tab

List full paths of VNX filesystems required for configuring NDMP save client on Networker (run on VNX via SSH)

server_mount server_2

e.g. /root_vdm_2/CYBERFELLA_Test_FS

Important:  If the filesystem being backed up contains more than 5 million files, set the timeout attribute to zero in the backup group’s properties.

Command line equivalent to the NMC’s Monitoring screen

nsrwatch

Command line equivalent to the NMC’s Alerts pane

printf "show pending\nprint type:nsr\n" | /usr/sbin/nsradmin -i-

Resetting Data Domain Devices

Running this in one go if you’ve not done it before is not advised.  Break it up into its individual commands (separated here by pipes), check that the output of each is what you’d expect, then re-join the commands so you’re certain you’re getting the result you want.  This worked in practice though.  It will only reset Read Only (.RO) devices, so it won’t kill backups, but it will potentially kill recoveries or clones if they are in progress.

nsr_render_log -lacedhmpty -S "1 hour ago" /nsr/logs/daemon.raw | grep -i critical | grep RO | awk '{print $10}' | while read -r eachline; do nsrmm | grep -- "$eachline" | cut -d, -f1 | awk '{print $7}'; done | while read -r eachdevice; do nsrmm -HH -v -y -f "${eachdevice}"; done

Identify OS of backup clients via CLI

The NMC will tell you what the client OS is, but it won’t elaborate and tell you what version, e.g. Solaris, not Solaris 11, or Linux, not Linux el6.  Also, as useful as the NMC is, it continually drives me mad that you can’t export the information on the screen to Excel.  (If someone figures this out, leave a comment below).

So, here’s how I got what I wanted using the good ol’ CLI on the backup server.  Luckily for me the backup server is Linux.
Run the following command on the NetWorker server, logging the putty terminal output to a file:

nsradmin
. type: nsr client
show client OS type
show name
show os type
p

This should get you a list of client names and what OS they’re running according to Networker in your putty.log file.  Copy and paste the list into a new file called mylist.  Extract just the Solaris hosts…

grep -i -B1 solaris mylist > mysolarisraw
grep name mysolarisraw | cut -d: -f2 | cut -d\; -f1 > mysolarislist

sed 's/^ *//' mysolarislist | grep -v \\-bkp > solarislist

You’ll now have a nice clean list of solaris networker client hostnames.  You can remove any backup interface names by using

grep -v 'b$'

to remove all lines ending in b.

One liner…

grep -i -B1 solaris mylist | grep name | cut -d: -f2 | cut -d\; -f1 | sed 's/^ *//' | grep -v \\-bkp | grep -v b$ | sort | uniq > solarislist
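The grep -B1 approach depends on the name line sitting directly above the OS line. A slightly more forgiving sketch pairs the two attributes up in awk instead. list_clients_by_os is a hypothetical helper, and the input format ("name: host;" and "client OS type: Solaris;" lines) is an assumption based on the cut commands above, so check it against your own putty.log first.

```shell
#!/bin/bash
# Sketch: print the names of clients whose "client OS type" contains the given text.
# list_clients_by_os is a hypothetical helper, not a Networker command.
# Assumes nsradmin output on stdin with lines like "name: host;" and
# "client OS type: Solaris;", the name line coming before the OS line.
# Usage: list_clients_by_os <os_text> < mylist
list_clients_by_os() {
  awk -v want="$1" '
    function value(line) {            # text after the first ":" minus the trailing ";"
      sub(/^[^:]*:[ \t]*/, "", line)
      sub(/;[ \t]*$/, "", line)
      return line
    }
    /^[ \t]*name:/           { name = value($0) }
    /^[ \t]*client OS type:/ { if (index(tolower(value($0)), tolower(want))) print name }
  ' | sort -u
}
```

For example, list_clients_by_os solaris < mylist > solarislist. Backup interface names would still need filtering out afterwards, as described above.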

This script will now use that list of hostnames to ssh to each one and retrieve more OS detail with the uname -a command.  Note that if SSH keys aren’t set up, you’ll need to enter your password each time a new SSH session is established.  This isn’t as arduous as it sounds.  Use PuTTY’s right click to paste the password each time, reducing the effort to a single mouse click.

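#!/bin/bash

# Run uname -a on every host in solarislist, appending output and errors to solaris_os_ver
while read -r eachhost; do
 echo "Processing ${eachhost}"
 # -n stops ssh from reading (and swallowing) the rest of solarislist on stdin
 ssh -n -l cyberfella -o StrictHostKeyChecking=no "${eachhost}" 'uname -a' >> solaris_os_ver 2>&1
done < solarislist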

This generates a file solaris_os_ver that you can just grep for ^SunOS and end up with a list of all the networker clients and the full details of the OS on them.

grep ^SunOS solaris_os_ver | awk '{print $1, $3, $2}'