node_hw_test - perform individual node benchmarks, write results to files
diag
node_hw_test [--help] [--debug] [--db <dbname>] [--net] [--cpu] [--longcpu] [ [--mem] | [--longmem] ] [--disk]
Performs individual node benchmarks to help identify faulty hardware.
Currently tests for:
    Ping
    Netperf
    Reported system memory
    Memtest
    Memory ECC errors (Tsunami chipset errors)
    Linpack (single node)
    STREAM
Results are stored in /cluster/tmp/node_hw_tests (written over NFS).
--db <dbname> Filename of cluster database to use. (NOT CURRENTLY USED)
--help Print manpage.
--net Run Ethernet tests (ping, netperf); takes about 1 minute. IMPORTANT: Run the --net option on only one node at a time for best results. Also, ``netserver'' must be running on the node's boothost for netperf to work.
--cpu Run processing performance tests (linpack, floating point checker); takes about 5 seconds.
--longcpu Run additional performance benchmarks. (Livermore Loops, NAS kernels)
--mem Run memory tests (/proc/meminfo, stream, tsunami_machine_check); takes about 1 minute.
--longmem Run memory stress tests (memtest); takes about 10 minutes per 128 MB of RAM. Also runs the ``--mem'' tests above.
--disk Test disk I/O performance. (Not currently implemented.)
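
As a rough planning aid for --longmem, the timing quoted above (about 10 minutes for each 128 MB of RAM) can be turned into an estimate for a given node. The Python sketch below only does that arithmetic. The function name is hypothetical, and linear scaling with RAM size is an assumption, not a measured property of memtest.

```python
# Rough --longmem runtime estimate, assuming the "about 10 minutes per
# 128 MB of RAM" figure scales linearly (an assumption, not a measurement).
MINUTES_PER_CHUNK = 10   # taken from the --longmem description
CHUNK_MB = 128           # taken from the --longmem description


def estimate_longmem_minutes(ram_mb):
    """Return the approximate --longmem runtime in minutes (hypothetical helper)."""
    if ram_mb < 0:
        raise ValueError("ram_mb must be non-negative")
    return ram_mb / CHUNK_MB * MINUTES_PER_CHUNK


if __name__ == "__main__":
    # Print estimates for a few example RAM sizes.
    for ram_mb in (128, 512, 2048):
        print(f"{ram_mb} MB: about {estimate_longmem_minutes(ram_mb):.0f} minutes")
```

The estimate excludes the roughly 1 minute of --mem tests that --longmem also runs.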
MUST BE RUN ON NODE ITSELF!
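
Because the tests must run on the node itself and --net should run on only one node at a time, a driver on another host has to visit the nodes serially. The Python sketch below shows one possible way to do that. It assumes passwordless ssh to each node and node_hw_test on the remote PATH. The host names and helper names are hypothetical, and none of this is part of node_hw_test.

```python
import subprocess


def build_net_command(node):
    """Build the ssh command that runs the --net tests on one node.

    Assumes ssh access and that node_hw_test is on the remote PATH.
    """
    return ["ssh", node, "node_hw_test", "--net"]


def run_net_tests(nodes, runner=subprocess.run):
    """Run the --net tests on each node in turn and return exit codes.

    subprocess.run blocks until the command finishes, so only one node
    is running the network tests at any time.
    """
    exit_codes = {}
    for node in nodes:
        completed = runner(build_net_command(node))
        exit_codes[node] = completed.returncode
    return exit_codes


# Example (hypothetical host names):
#   run_net_tests(["node01", "node02", "node03"])
```

The runner argument exists only so the loop can be exercised without real ssh connections. As noted for --net, ``netserver'' still has to be running on each node's boothost.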
/cluster/tmp/node_hw_tests/*
Paths and other defaults are recorded in CConf.pm.
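
To check which result files have been written, the results directory can simply be listed. The Python sketch below makes no assumption about the names or contents of the files, which are not described here. It reports only file names and sizes, and the helper name is hypothetical.

```python
import os

RESULTS_DIR = "/cluster/tmp/node_hw_tests"  # results directory named above


def list_result_files(results_dir=RESULTS_DIR):
    """Return sorted (name, size_in_bytes) pairs for regular files in results_dir."""
    entries = []
    for name in sorted(os.listdir(results_dir)):
        path = os.path.join(results_dir, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


# Example:
#   for name, size in list_result_files():
#       print(f"{size:10d}  {name}")
```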
node_hw_analyze, show_disks, show_temp