All machines have plenty of disk space, 64 GB of RAM and 32 CPU cores
(AMD Opteron Processor 8356).
At the time of the calculation these machines are running fairly
heavy seismic calculations, and the load on them is around 20.
AFAICT memory is not an issue; swap is barely used.
I am at a loss to find out why and how these restarts occur. Any advice
on how to analyse/diagnose this problem would be very much appreciated.
Please note also that the daemon in question is not started via an
/etc/init.d script but manually:
--
ubuntu-server mailing list
ubuntu-server@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
More info: https://wiki.ubuntu.com/ServerTeam