DOE's Massively Parallel Computing Research Laboratory at Sandia National
Laboratories (the MPCRL) has two Intel Paragons: an 1824-processor Paragon
named acoma, which is used for large calculations, and a 54-processor
Paragon named zia, which is used for code development and testing. Most
of the compute nodes on these machines run the SUNMOS operating system,
which is a light-weight, high-performance operating system. New users
should start their use of the MPCRL Paragons by testing their codes on
zia, even if they have successfully run their codes on Paragons at other
sites. This is especially true if the codes have not been run under the
SUNMOS or Puma operating systems before. Instructions on how to start
using SUNMOS are provided later in this document and in separate documents.
Accessing the Paragons from the Internet
If you are logging into the Paragons from anywhere but building 980, the entry point is the machine cs.sandia.gov (132.175.13.2), which is a gateway machine only. Once you are authenticated (see the documentation for the firewall elsewhere in this archive), you can connect to one of the Sun compile servers or directly to one of the Paragons. Their names are:
The Sun compile servers: jemez, pajarito, siesta, and chimayo
The large Paragon: acoma
The small development Paragon: zia
Cross-Compiling on the Suns at the MPCRL
The Paragons are NOT used for compiling, and should only be used for loading, running, and debugging Paragon programs. All compiling must be done on the Suns. Several machines are available as compile servers, including jemez and pajarito (see the list above).
Several environment variables must be defined in order to cross-compile. In the default .cshrc file created when you get your account, there is a single variable you can change to set up this entire environment. The variable is:
set IPARAGON_USER = no # Do you use the Intel/Paragon system
Set this variable to yes instead of no to use the Paragons. This will set up the environment for the Paragon. If you have a custom .cshrc file then look at the way the generic files are configured in /Net/local/sys. In particular, look at Cshrc and Cshrc.common.
This will set up an alias "pman" for Paragon-specific man pages.
Operating Systems and Compilers
The Sandia Paragons run a mixture of two operating systems: OSF and SUNMOS/PUMA (PUMA is the SUNMOS replacement). Most or all of the compute nodes will be running SUNMOS/PUMA, but there may also be OSF compute nodes. There will also likely be OSF IO nodes, OSF service nodes, and possibly SUNMOS/PUMA IO nodes. Once you have logged in to one of the Paragons, you can display a map of the current node configuration with the showparts command.
The C compiler for OSF on the Suns is 'icc'. The FORTRAN compiler is 'if77'. If you set the variable IPARAGON_USER to yes the compilers will automatically be added to your path. Compilers for the SUNMOS operating system (on the compute nodes) have an s in front of the names, e.g., sicc is the C compiler and sif77 is the Fortran 77 compiler.
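The naming rule above can be written down as a small lookup. This is only an illustrative sketch in Bourne shell: the function name paragon_compiler and its keywords (osf, sunmos, c, fortran) are made up for this example and are not part of the Paragon tools.

```shell
# Print the cross-compiler name for a target operating system and language.
# $1: target OS on the node, "osf" or "sunmos" (hypothetical keywords)
# $2: source language, "c" or "fortran" (hypothetical keywords)
paragon_compiler() {
    case $2 in
        c)       base=icc ;;
        fortran) base=if77 ;;
        *)       echo "unknown language: $2" >&2; return 1 ;;
    esac
    case $1 in
        osf)    printf '%s\n' "$base" ;;
        sunmos) printf 's%s\n' "$base" ;;   # SUNMOS compilers add a leading s
        *)      echo "unknown target OS: $1" >&2; return 1 ;;
    esac
}

paragon_compiler osf c            # prints icc
paragon_compiler sunmos fortran   # prints sif77
```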
Also note that if you are both an IGAMMA_USER and an IPARAGON_USER, there
are major conflicts in the cross-development paths. For example, you use
'icc' for C cross-compilation for both the I/Gamma and I/Paragon machines.
We have yet to come up with a good solution to this problem; suggestions
are welcome. The only "solution" we have so far is that users should have
two shell scripts, e.g., dogamma.sh and doparagon.sh, that each contain
the relevant sections of the .cshrc file. If you work on the I/Paragon
machine you run doparagon.sh first which alters the path variable so that
the I/Paragon compilers are found first in the search list. Similarly for
I/Gamma work.
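A minimal sketch of what doparagon.sh might contain follows. It is written for a Bourne-style shell because of the .sh name; under csh the same idea would be expressed with "set path". The directory in PARAGON_BIN is a placeholder invented for this example, not the real location of the compilers (see Cshrc.common in /Net/local/sys for the real settings), and the function names are likewise hypothetical. Note that a script run as a separate process cannot change the path of the shell that started it, so a file like this has to be read into the current shell.

```shell
# doparagon.sh (sketch): read it into the current shell with
#     . ./doparagon.sh
# rather than executing it, so that the new PATH stays in effect.

# Placeholder directory, NOT the real install location.
PARAGON_BIN=${PARAGON_BIN:-/path/to/iparagon/bin}

# Print $PATH with every occurrence of directory $1 removed.
path_without() {
    old_ifs=$IFS
    IFS=:
    result=
    for dir in $PATH; do
        if [ "$dir" != "$1" ]; then
            result=${result:+$result:}$dir
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Move the I/Paragon tool directory to the front of the search list.
use_paragon() {
    PATH=$(path_without "$PARAGON_BIN")
    PATH=$PARAGON_BIN${PATH:+:$PATH}
    export PATH
}
```

After reading the file in, calling use_paragon puts the I/Paragon directory first; a matching dogamma.sh would do the same with the I/Gamma directory. Removing the directory before prepending it keeps the path from growing each time you switch.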
Paragon OSF Environment
Users log into the "service partition" to run their shell, control their jobs, and interact with the machine. The service partition runs a variation of UNIX called OSF/1 AD that supplies a "single system image" for the entire partition, which means that to the user it looks like a single process space. When you start a process (such as your shell, or an ls), it will run on one of the processors of the service partition, but it will look to you as though you are dealing with a single processor.
Other nodes in the system are configured as I/O nodes, but most nodes in the system are designated as compute nodes. The compute nodes may run either OSF or SUNMOS (a lightweight operating system developed jointly by Sandia and UNM). This section will discuss only the OSF compute nodes, and further information on SUNMOS will be given later.
The Paragons at the MPCRL are configured so users cannot make partitions. You must use the application 'pexec' to run all OSF applications on the Paragons. The basic usage is:
pexec "your_app [your_options]" -sz num_nodes
The name of the application must be the second parameter on the command line to pexec. Any options to the application must also be part of this second parameter, hence the application name and all its options must be enclosed in double quotes. The last thing on the command line is the number of nodes on which pexec will run your application. There is a man page for pexec. It is available on the Paragons and the MPCRL Suns.
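The quoting rule is easy to get wrong, so here is a small sketch that assembles a pexec command line in the form described above and prints it instead of running it. The helper name pexec_command and the application my_app with its options are hypothetical; only the "pexec ... -sz num_nodes" form comes from this document.

```shell
# Print the pexec command line for an application string and a node count.
# $1: application name plus all of its options, as ONE string
# $2: number of nodes (positive integer, no leading zeros)
pexec_command() {
    case $2 in
        ''|0*|*[!0-9]*)
            echo "node count must be a positive integer" >&2
            return 1 ;;
    esac
    # The application and its options stay inside one pair of double quotes.
    printf 'pexec "%s" -sz %s\n' "$1" "$2"
}

pexec_command "my_app -n 100 input.dat" 16
# prints: pexec "my_app -n 100 input.dat" -sz 16
```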
There is a debugger for the Paragon, 'ipd'. To use ipd you must start it through another script, pipd, which automatically creates a partition. Type the following:
pipd -sz num_nodes
where num_nodes is the number of nodes you wish to use. There is no man
page for pipd, but there is a man page for ipd.
Paragon Hardware Configuration
acoma is currently configured with approximately
1328 16-megabyte compute nodes
512 32-megabyte compute nodes
64 I/O nodes, each with a 5-gigabyte SCSI RAID device
8 service nodes
3 HIPPI nodes
3 Ethernet nodes
zia is currently configured with approximately
40 SUNMOS compute nodes
8 Puma compute nodes
6 OSF compute nodes
4 OSF service nodes
1 OSF I/O node with attached RAIDs
1 HIPPI node
1 Ethernet node
The exact configuration of zia is subject to change to accommodate special development needs and because it is sometimes cannibalized to get nodes for other machines.
Each node has two i860XP processors. Intel rates the machine as a 140 Gflop peak, but this counts only one of the processors since the second processor was intended for communication only.
zia is a single cabinet. acoma is grouped in 30 cabinets, each capable of holding 64 nodes (an additional cabinet has been separated out as a development machine for the SUNMOS/Puma group). The node slots are connected by a two-dimensional mesh communication network, with links in each direction capable of communicating at up to 200 megabytes/second. acoma is physically a 16x120 mesh, of which the compute partition logically consists of a 16x115 mesh (there are columns of I/O nodes in the middle).
acoma has a total of approximately 37 gigabytes of DRAM. Total disk space on acoma is approximately 330 gigabytes, although considerably less is currently available to users without prior arrangement, and some is dedicated to paging.
Users can write data files to the directories /raid/io_##/tmp, where ## is currently an integer from 1 to 32 on acoma, and 1 on zia. Higher performance is obtained on acoma by writing to the directories /pfs/io_##/tmp.
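As an illustration of the directory naming, the sketch below builds one of these paths from a file system name and a number. It assumes that ## is written as two digits with a leading zero (as in the io_05 example elsewhere in this document) and that the /pfs directories follow the same pattern; the helper name io_tmp_dir is made up for this example.

```shell
# Print the scratch directory for I/O number $2 on file system $1.
# $1: "raid" or "pfs"
# $2: integer from 1 to 32 (the range currently given for acoma)
io_tmp_dir() {
    case $1 in
        raid|pfs) ;;
        *) echo "file system must be raid or pfs" >&2; return 1 ;;
    esac
    case $2 in
        ''|0*|*[!0-9]*)
            echo "I/O number must be an integer from 1 to 32" >&2
            return 1 ;;
    esac
    if [ "$2" -gt 32 ]; then
        echo "I/O number must be an integer from 1 to 32" >&2
        return 1
    fi
    # Assumption: two digits, zero padded, as in /raid/io_05/tmp.
    printf '/%s/io_%02d/tmp\n' "$1" "$2"
}

io_tmp_dir raid 5    # prints /raid/io_05/tmp
io_tmp_dir pfs 12    # prints /pfs/io_12/tmp
```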
These configurations are subject to change in the future. You can
determine what the current configuration is (as well as what jobs are
running) by using the showparts utility. Try the command
showparts -s
on acoma.
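The acoma figures quoted in this section can be cross-checked with a little arithmetic. The sketch below uses only numbers stated above; the variable names are arbitrary.

```shell
# Figures quoted for acoma in this section.
nodes_16mb=1328          # 16-megabyte compute nodes
nodes_32mb=512           # 32-megabyte compute nodes
cabinets=30
slots_per_cabinet=64

compute_nodes=$((nodes_16mb + nodes_32mb))
compute_mem_mb=$((nodes_16mb * 16 + nodes_32mb * 32))
node_slots=$((cabinets * slots_per_cabinet))

echo "compute nodes:     $compute_nodes (logical mesh 16x115 = $((16 * 115)))"
echo "compute-node DRAM: $compute_mem_mb megabytes"
echo "node slots:        $node_slots (physical mesh 16x120 = $((16 * 120)))"
```

The two compute-node counts add up to 1840, which matches the 16x115 logical mesh; their memory adds up to 37632 megabytes, consistent with the quoted total of approximately 37 gigabytes; and 30 cabinets of 64 slots give 1920 slots, which matches the 16x120 physical mesh.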
Paragon SUNMOS/Puma Environment
You are likely to find a majority (if not all) of the compute nodes running SUNMOS. You can find information about SUNMOS in the directory named ~ftp/pub/sunmos/doc/postscript on ftp.cs.sandia.gov, accessible via anonymous ftp from outside building 980, and in the directories /Net/local/sunmos/current/doc on the machines inside building 980. Of particular interest are the files user_guide.ps and nxemulation.ps. Most NX codes written for the Delta, iPSC/860, Paragon, or nCUBE 2 will run without modification under SUNMOS. Among other things, SUNMOS/PUMA
SUNMOS documentation is much briefer than OSF documentation, and consists
of the files mentioned above and a few man pages.
Location of Executables, Input Files and Output Files
Owing to a hardware limitation, NFS-mounted file systems should not be
used for executables, input files, or output files.
These files should be located on acoma's RAID system, not on NFS-mounted file systems. Running executables from, reading data from, or writing data to NFS-mounted file systems greatly increases the instability of acoma, and may result in failure of your job.
Do not use NQS job scripts to move executables and input files from NFS-mounted file systems to the RAID system, or to move data files from the RAID system to NFS-mounted file systems. These activities will also increase the instability of the machine.
Executables and input files should be pre-positioned on either the /raid or /pfs systems on acoma. For example, create a working directory on disk 5
> mkdir /raid/io_05/tmp/your_login_id
for executables and scalar input files.
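Pre-positioning can be done by hand before the job is submitted. The helper below is a minimal sketch of that step, assuming a Bourne-style shell and that mkdir -p and cp behave as on other UNIX systems; the function name stage_files and the file names in the usage note are hypothetical.

```shell
# Copy files into a per-user working directory under a scratch root and
# print the name of that directory.
# $1: scratch root, for example /raid/io_05/tmp
# $2: your login id
# remaining arguments: executables and input files to copy
stage_files() {
    root=$1
    user=$2
    shift 2
    workdir=$root/$user
    mkdir -p "$workdir" || return 1
    for f in "$@"; do
        cp "$f" "$workdir/" || return 1
    done
    printf '%s\n' "$workdir"
}
```

For example, "stage_files /raid/io_05/tmp your_login_id my_app input.dat" would create the working directory shown above and copy the two (hypothetical) files into it. Run it interactively before submitting the job, not from inside an NQS job script.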
NQS Usage
In the interests of brevity, this is described elsewhere (see the
separate document for NQS). Jobs on the large Paragon are run under
control of NQS during off-peak hours (currently 5pm-8am Mountain time
on week nights and all weekend). In addition the machine is configured
with a large number of nodes under NQS control during the day. Useful
commands for NQS are qstat, qsub, and qdel.
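The off-peak window described above can be expressed as a simple test on the day of the week and the hour. This is only a sketch of the stated schedule (5pm-8am on week nights and all weekend); the function name is_off_peak is made up, and the time zone name America/Denver is an assumption used here to read the clock in Mountain time.

```shell
# Succeed (exit status 0) if the given time falls in the off-peak window.
# $1: day of week, 1 = Monday ... 7 = Sunday (as printed by "date +%u")
# $2: hour of day, 0 to 23 (as printed by "date +%H", leading zero allowed)
is_off_peak() {
    hour=${2#0}          # drop one leading zero, e.g. 08 -> 8
    hour=${hour:-0}      # a bare "0" became empty; restore it
    if [ "$1" -ge 6 ]; then
        return 0         # Saturday or Sunday: off-peak all day
    fi
    [ "$hour" -ge 17 ] || [ "$hour" -lt 8 ]
}

# Assumption: America/Denver stands in for "Mountain time".
if is_off_peak "$(TZ=America/Denver date +%u)" "$(TZ=America/Denver date +%H)"; then
    echo "off-peak"
else
    echo "daytime"
fi
```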
OSF Documentation
PostScript copies of current documentation are available in /usr/iparagon/current/Paragon/ps.docs on the Suns. Non-Sandians are not allowed to print these manuals on Sandia printers, but you may pull them back to your remote site and print them. You may also contact Intel SSD to purchase documentation. Try calling Dave Ellis at (503) 629-7726. Failing this, try calling Intel SSD at (503) 629-7600. The documentation is packaged as a "C Documentation Package" or "Fortran Documentation Package", and contains some combination of the following manuals (ask SSD for the details).
The most useful manuals are probably:
ptdg.ps        Technical Documentation Guide (an expanded version of this list)
psug.ps        Paragon System User's Guide
xps.R.rn.ps    Paragon System Software Release 1.1 Release Notes
psrm.cmds.ps   OSF commands reference manual
Other potentially useful documents:
pcug.ps        C compiler user's guide
psrm.c.ps      C system calls reference manual
pflrm.ps       Fortran Language reference manual
pfug.ps        Fortran compiler user's guide
psrm.ftn.ps    Fortran System Calls Reference Manual
pipdman.ps     ipd debugger manual
xpsasm.ps      i860 Assembler reference manual
pnqsman.ps     nqs manual
pbmlpr.ps      Paragon Basic Math Library Performance Report
make.ps        gnu make manual
pglug.ps       graphics libraries user's guide
Other documents:
ptug.ps        Paragon Application Tools User's Guide (SPV, Paragraph, etc)
xpsfddi.ps     FDDI Installation and Configuration Guide
xpshim.ps      Paragon Hardware Installation Manual
psag.ps        Paragon System Administrator's guide
pspg.ps        Paragon Site preparation guide
pspvug.ps      SPV user's guide (SPV == System Performance Visualization)
gdb.ps         gdb user's guide
xpshippi.ps    Paragon HIPPI Interface Manual
xpsraid.ps     Raid Utilities Guide
xpsdiag.ps     Paragon Diagnostic Reference Manual
In addition, the directory /usr/iparagon/current/Paragon/release_notes contains the following files:
xps.R.rn.ps      Paragon System Software Release Notes
ss_install.ps    Paragon system software installation instructions
ss_buglist       ASCII list of system software bugs
ss_buglist.ps    PostScript list of system software bugs
ss_fixed         ASCII list of fixed system software bugs
ss_fixed.ps      PostScript list of fixed system software bugs
pcug.R.rn.ps     Paragon C Compiler Release Notes
icc_buglist.ps   PostScript list of C compiler bugs
pfug.R.rn.ps     Paragon Fortran Compiler Release Notes
if77_buglist.ps  PostScript list of Fortran compiler bugs
xpsdiag.R.rn.ps  Paragon Diagnostic Release Notes
diag_buglist     ASCII list of diagnostic bugs
diag_buglist.ps  PostScript list of diagnostic bugs
pdgl.R.rn.ps     Paragon Distributed Graphics Library Release Notes
osf1errata.ps    OSF/1 Documentation Errata
If you have questions that are not addressed by the documentation, please e-mail them to iparagon@cs.sandia.gov. This is a distribution list so that several knowledgeable people may answer the question. Sandia has four on-site Intel System 2 representatives. You may contact them during normal working hours (8:00am to 5:00pm MT):
Ken Lord kmlord@cs.sandia.gov 505-845-2817 Applications level support
Bob Moore rlmoore@cs.sandia.gov 505-844-7559 Hardware and OS level support
Robert Fugatt rjfugat@cs.sandia.gov 505-845-7998 Hardware and OS level support
Tony Ralph tonyr@cs.sandia.gov 505-845-7998 Hardware and OS level support
In addition, you may contact the following Sandians during normal working hours:
Michael Hannah mjhanna@cs.sandia.gov 505-845-8923 Hardware and system support
David Gardner drgardn@cs.sandia.gov 505-845-7875 Scheduling and conflict resolution
There is also a mailing list to reach Paragon users at iparagon-users@cs.sandia.gov. This should be used only for information that all users will benefit from - NOT JUNK MAIL.