#############################################################################
#
#     This Cplant(TM) source code is the property of Sandia National
#     Laboratories.
#
#     This Cplant(TM) source code is copyrighted by Sandia National
#     Laboratories.
#
#     The redistribution of this Cplant(TM) source code is subject to the
#     terms of the GNU Lesser General Public License
#     (see cit/LGPL or http://www.gnu.org/licenses/lgpl.html)
#
#     Cplant(TM) Copyright 1998, 1999, 2000, 2001, 2002, 2003, 2004
#     Sandia Corporation.
#     Under the terms of Contract DE-AC04-94AL85000, there is a non-exclusive
#     license for use of this work by or on behalf of the US Government.
#     Export of this program may require a license from the United States
#     Government.
#
#############################################################################

HOWTO install gm/mpich for myrinet on a CIT cluster:

PREPARATION:

  1. Get the CIT myrinet module if you don't already have it
     (from the website, cvs, a tarball, whatever...).
     Place it in the $CIT_DIST directory; the subdir name should be
     "myrinet".

  2. Edit the top level Makefile:

     Set GM_LINUX_KERN_SRC to point to the linux kernel source dir.
       - Must be the same version as the compute nodes.
       - For Alpha nodes, must be configured with "Generic" processor type.
       - Must have the kernel headers set up, so run "make".

     Set VM_NAME to the name of the virtual machine you are creating
       (or let it default to "gm-1.4").

     Set GM_ARCH to the cpu-ostype for the compute nodes
       (or let it default to "alpha-linux").
       See the GM documentation for more info.

  3. Make the default modules as outlined in the main INSTALL document:
       # make

  4. Make the myrinet module:
       # make myrinet

  5. Continue with the main INSTALL instructions.

AFTER CUSTOMIZING SYSTEM IMAGE AND BEFORE CLONING:

  1. "make myrinet" should have chosen a mapper node for you.
     To change it:
       Edit /cluster/vms//config/mappernode

  2. # make myrinet_gm

  3. Set up files in /dev needed by GM:
     These steps only need to be done if the /dev/gm* devices were not
     created in the bootable system hierarchy.

     For a diskless image:
       # /cluster/bin/install_gm /cluster/machine/rh-6.2-alpha/image
     For a single node:
       # rsh /cluster/bin/install_gm

  4. Add compute nodes to the new virtual machine with "set_vmname".

Now, build the diskless hierarchy (build_diskless) or clone the image
and boot the diskfull nodes.

Make sure GM is loading properly on the nodes:
  Check "dmesg" and "/sbin/lsmod".
  Run '/cluster/rte/gm/bin/gm_board_info'.

AFTER ENSURING THAT GM IS CONFIGURED CORRECTLY:

  1. BE SURE that a command like:
       # rsh `hostname` uptime
     works _without_ a password.
     (It's a dumb requirement of mpich; you can undo it after installing.)

  2. Configuring MPICH is too tricky to automate right now, but running
     the following will make a guess for you:
       # make myrinet_mpich_config

  3. # cd myrinet/3rdparty/mpich-1.2..5

  4. # ./configure
     Check the output for error messages!

  5. # cd ../../..

  6. # make myrinet_mpich_make
     Verify the compile completed and programs got installed:
       # ls /cluster/vms/gm-1.4/mpich/bin    (should list ~15 files)
       # ls /cluster/vms/gm-1.4/bin/mpptest  (file should exist)

  7. # make myrinet_tests
     WARNING: the Makefile for these is fairly architecture specific
     right now. If you are compiling for anything other than Linux_Alpha,
     it will need tweaking. Yes, fixing this is on the list of things
     to do.. someday.
     Verify the compile completed and programs got installed:
       # ls /cluster/vms/gm-1.4/bin          (should list ~12 files)

  8. Done, now go map the network and start running jobs!

For instructions on mapping the network, testing the myrinet, and
running MPI programs, look in the 'doc' directory.

----------------------------------------------------------------
NOTES:

LAM (Local Area Multicomputer) is an MPI environment that is part of
the RedHat distribution and may conflict with MPICH. You may want to
remove it. For example, list the installed LAM packages, inspect the
one found, then erase it:
  # rpm -qa | grep lam
  # rpm -qi lam-6.3.1-4
  # rpm -e lam-6.3.1-4
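
----------------------------------------------------------------
EXAMPLE (sketch only):

The "verify the compile completed" checks in step 6 of the MPICH
section can be scripted. The following is a minimal sketch, not part of
CIT, GM, or MPICH: the function names count_files and check_install are
hypothetical, and the only thing it assumes about the layout is what the
steps above state (a non-empty mpich/bin directory and a bin/mpptest
file under the virtual machine directory). It does not check the exact
file count, since "~15" is only approximate.

```shell
#!/bin/sh
# Sketch only: count_files and check_install are hypothetical helper
# names, not commands shipped with CIT, GM, or MPICH.

# Print the number of entries in directory $1 (0 if it does not exist).
count_files() {
    if [ -d "$1" ]; then
        ls -1 "$1" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Check the virtual machine directory $1 (e.g. /cluster/vms/gm-1.4).
# Returns 0 if mpich/bin is non-empty and bin/mpptest exists,
# 1 otherwise.
check_install() {
    vm_dir=$1
    status=0

    n=$(count_files "$vm_dir/mpich/bin")
    if [ "$n" -eq 0 ]; then
        echo "FAIL: $vm_dir/mpich/bin is missing or empty"
        status=1
    else
        echo "ok: $vm_dir/mpich/bin has $n files (expect about 15)"
    fi

    if [ -e "$vm_dir/bin/mpptest" ]; then
        echo "ok: $vm_dir/bin/mpptest exists"
    else
        echo "FAIL: $vm_dir/bin/mpptest not found"
        status=1
    fi

    return $status
}
```

With the functions loaded into the shell, a call such as
  # check_install /cluster/vms/gm-1.4
prints one line per check and returns non-zero if either check fails,
so it can be chained in front of "make myrinet_tests".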