There’s a nasty little surprise in store for anyone running a script using nohup in bash (Bourne Again Shell) on Solaris (to prevent hangup signals from killing their script/processes in the event their ssh session gets disconnected).
The Solaris default shell is csh (C Shell) in /bin/sh and even if the script has #!/bin/sh at the top, if it’s run from bash with nohup, it won’t survive in the event the ssh session is disconnected. Be warned.
This is apparently because of bash’s built-in job control. I haven’t delved any deeper than that, but I’m guessing the processes aren’t made immune to SIGHUP, so nohup doesn’t protect them from termination.
Scripts must be executed from within a regular csh shell, i.e.
root@solarisbox ~ $ nohup /full/path/to/myscript.sh &
if they are to survive a disconnected PuTTY session.
In summary, if you’re working remotely and kick off long-running scripts over a remote session on Solaris servers, don’t use bash.
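A minimal sketch of a more defensive launch, in case it helps. It assumes that fully detaching the job from the terminal (stdin from /dev/null, stdout and stderr to a log file) is acceptable for your script; I have not verified that this alone cures the bash behaviour described above. The function name run_detached and the log path are hypothetical, not part of any standard tool.

```shell
# Hypothetical helper: start a command under nohup with no ties to the terminal.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
    log=$1
    shift
    # stdin comes from /dev/null, stdout and stderr both go to the log file,
    # so the job holds no file descriptor pointing at the ssh session's tty.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    # Print the PID of the background job so it can be checked later.
    echo $!
}
```

For example, run_detached /tmp/myscript.log /full/path/to/myscript.sh prints the PID of the background job and leaves its output in /tmp/myscript.log.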
To determine whether your script is still running when you manage to reconnect, use ps -ef | grep myscript.sh
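The check above can be wrapped in a small function so it returns a clean yes/no. This is only a sketch: is_running is a hypothetical name, and the "grep -v grep" stage is there so the grep process itself is not counted as a match (it will also hide any script that has "grep" in its own name).

```shell
# Hypothetical helper: succeeds if some process's command line contains PATTERN.
# Usage: is_running PATTERN
is_running() {
    # ps -ef lists every process; the first grep keeps lines matching PATTERN,
    # the second drops the grep process itself from the results.
    ps -ef | grep -- "$1" | grep -v grep > /dev/null
}
```

For example: is_running myscript.sh && echo "still running"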
Thanks for responding, Matthew. I guess I was just a little disappointed too, as when I was doing support, THIS was exactly the kind of problem that I loved getting my teeth into and then resolving.
A bit of DTrace watching the sshd (autocorrect wanted to replace sshd with sushi) and the forked shells would have proved interesting.
As someone who did Solaris Support for a very long time, I can tell you that csh has not been the Solaris shell in even longer. The prompt you show (a $) is more likely to be sh, ksh, or bash.
When using ssh to kick off things that will persist, you also need to ensure you use ‘-n’ to redirect input.
As to Solaris using other shells to run things even with the #! at the top: sorry, no, it doesn’t.
I’d be interested in knowing whether or not you let anyone in Solaris Support know about it and if a bug was logged; as well as what kind of troubleshooting was done.
Even though I’m no longer in that role, I’m tempted to have a look.
Can you tell me:
– what was the login shell of the account that you were connecting to?
– Can you verify that the script being run was not likely to want to take any stdin?
Please feel free to contact me by my email address. I’m responding to this as in my current role, this post was quoted to defend a position on something.
Thank you for your feedback, Alan. This post, like all my posts, is based on my personal experiences. It was very likely an old Solaris OS that I had this experience on, and I was so miffed by it that I felt the need to post it and offer a solution that worked for me and might work for others who hit the same issue. That’s not to say there isn’t a better way, or that the problem wasn’t me all along. The IT industry is all about the latest and greatest, but in the real world some people are still running 20-year-old software. I don’t remember the exact circumstances, as I’ve had many tasks working with many different technologies since then. I should have included more info, I guess. In any case, stand up for what you believe in and don’t let others quote the internet like it’s a more credible source when creating a counter-argument. Remind them they don’t pay you for your expertise just to override you with free misinformation written by a stranger who is likely not as experienced as you in this area!