NAME

node_hw_analyze - parse through the results of node_hw_test and check them against known tolerances


MODULE

diag


SYNOPSIS

node_hw_analyze [--help] [--debug] [--db <datasource>] [--net] [ [--cpu] | [--longcpu] ] [ [--mem] | [--longmem] ] [--disk] [--summarize] [--quiet] [--report] <device|collection>...


DESCRIPTION

After node hardware testing, node_hw_analyze searches through the results produced by node_hw_test and checks them against known tolerances.

Currently tests for:

    Ping
    Netperf
    Reported system memory
    Memtest
    Memory ECC errors (Tsunami chipset errors)
    Linpack (single node)
    STREAM

Results are gathered from /cluster/tmp/node_hw_tests.


OPTIONS

--summarize Print short version of results (average, min, max)

--report Create tab-delimited data suitable for import into a spreadsheet. Does not work with a single node. A separate file is created for each node type, named /cluster/tmp/node_hw_tests/node_hw_report.tab.<node type>

--quiet Suppress printing of errors to stdout. Useful with "--summarize" and "--report".

--db <datasource> Database type and connection information. For GDBM, the syntax is "GDBM:" followed by the filename of the cluster database to use. For LDAP, the syntax is "LDAP:host:port:dbname".

--help Print manpage.

--net Analyze results of Ethernet tests (ping, netperf).

--cpu Analyze results of processing performance tests (linpack).

--longcpu Analyze results of extended CPU tests (nasker, lloops). Also checks the "--cpu" tests above.

--mem Analyze results of memory tests (/proc/meminfo, stream, tsunami_machine_check).

--longmem Analyze results of memory stress tests (memtest). Also checks the "--mem" tests above.

--disk (not currently implemented)
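
The two "--db" datasource forms described above can be split apart mechanically. The following Python sketch is illustrative only and is not part of node_hw_analyze; the names DataSource and parse_datasource are hypothetical, and it assumes that the LDAP host, port and dbname fields contain no colons.

```python
from typing import NamedTuple, Optional


class DataSource(NamedTuple):
    """Parsed form of a --db argument (hypothetical structure)."""
    kind: str                 # "GDBM" or "LDAP"
    filename: Optional[str]   # set for GDBM only
    host: Optional[str]       # set for LDAP only
    port: Optional[int]       # set for LDAP only
    dbname: Optional[str]     # set for LDAP only


def parse_datasource(spec: str) -> DataSource:
    """Split a datasource string such as "GDBM:file" or "LDAP:host:port:dbname"."""
    kind, sep, rest = spec.partition(":")
    if not sep or not rest:
        raise ValueError(f"expected TYPE:details, got {spec!r}")
    if kind == "GDBM":
        # Everything after "GDBM:" is taken as the database filename.
        return DataSource("GDBM", rest, None, None, None)
    if kind == "LDAP":
        # Assumption: none of the three fields contains a colon.
        parts = rest.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected LDAP:host:port:dbname, got {spec!r}")
        host, port, dbname = parts
        return DataSource("LDAP", None, host, int(port), dbname)
    raise ValueError(f"unknown database type {kind!r}")
```

For example, parse_datasource("LDAP:ldaphost:389:cluster") yields a DataSource whose port field is the integer 389; the host and database names in that call are made up for illustration.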


NOTES

This script should be run from the admin node.


FILES

/cluster/tmp/node_hw_tests/*

/cluster/machine/data/node_hw_values.*

/cluster/tmp/node_hw_tests/node_hw_report.tab*

Paths and other defaults are recorded in CConf.pm.
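
The per-node-type report files written by "--report" are plain tab-delimited text, so they can be read with standard tools. The Python sketch below is illustrative only: the column layout of the report files is not documented here, so it returns raw rows and makes no assumption about what each column holds. The helper names read_report and summarize are hypothetical; summarize mirrors the average, min and max figures that "--summarize" is described as printing.

```python
import csv
import glob
from statistics import mean

# Default report location taken from the FILES section; the actual path
# may differ if the defaults in CConf.pm have been changed.
REPORT_GLOB = "/cluster/tmp/node_hw_tests/node_hw_report.tab.*"


def read_report(path):
    """Return the non-empty rows of one tab-delimited report file."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


def summarize(values):
    """Return (average, min, max) for a non-empty sequence of numbers."""
    values = list(values)
    return mean(values), min(values), max(values)


if __name__ == "__main__":
    for path in sorted(glob.glob(REPORT_GLOB)):
        # The node type is the suffix after "node_hw_report.tab.".
        node_type = path.rsplit(".tab.", 1)[-1]
        rows = read_report(path)
        print(f"{node_type}: {len(rows)} rows")
```

Once the meaning of a column is known, its values can be converted with float() and passed to summarize.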


SEE ALSO

node_hw_test, show_disks, show_temp