2008 Newsnotes
DAKOTA 4.2 Delivers Advanced Probabilistic Methods, Improved Algorithmic Efficiency, and Rapid Integration
Version 4.2 of the DAKOTA software toolkit was released and deployed in November 2008. DAKOTA (http://www.cs.sandia.gov/dakota) is used broadly by academic, government, and corporate institutions for sensitivity analysis, uncertainty quantification, parameter estimation, and design optimization studies. Version 4.2 offers substantial advancements that enable efficient, robust analysis and design of critical systems in the presence of uncertainty. Specific algorithmic improvements include:
• Uncertainty quantification: a new stochastic collocation method based on Lagrange polynomial interpolation, more scalable generalized polynomial chaos methods, extended Latin hypercube sampling distributions, and incremental random sampling (a generic collocation sketch follows this list);
• Optimization: new bi-level, sequential, and multifidelity optimization under uncertainty algorithms based on stochastic collocation and polynomial chaos, new APPSPACK interface to directly handle linear/nonlinear constraints, generalization of efficient global optimization technique;
• Calibration: new capability for surrogate-based model calibration, improved support for model calibration under uncertainty and weighted nonlinear least squares;
• Framework: new radial basis function and moving least squares surrogates, more efficient evaluation cache, and model recursion refinements.
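As a generic, one-dimensional illustration of the stochastic collocation idea in the first bullet, the sketch below builds a Lagrange-polynomial surrogate through Gauss quadrature nodes and estimates response moments from it. This is a hypothetical example using standard NumPy/SciPy tools, not DAKOTA's input syntax or internals; the `response` function is a made-up stand-in for an expensive simulation.

```python
# A minimal, generic sketch of 1D stochastic collocation (not DAKOTA's API).
# Assumption: a single uniform input on [-1, 1] and a hypothetical response.
import numpy as np
from scipy.interpolate import BarycentricInterpolator

def response(x):
    # Hypothetical expensive simulation response, here a cheap stand-in.
    return np.exp(0.5 * x) + 0.1 * x**2

# Collocation points/weights: Gauss-Legendre nodes for a uniform variable.
nodes, weights = np.polynomial.legendre.leggauss(7)
values = response(nodes)                     # one "simulation" per node

# Lagrange-polynomial surrogate through the collocation points.
surrogate = BarycentricInterpolator(nodes, values)

# Moments of the response via the quadrature rule (density of U(-1,1) is 1/2).
mean = 0.5 * np.sum(weights * values)
second = 0.5 * np.sum(weights * values**2)
var = second - mean**2
print(f"mean ~ {mean:.4f}, variance ~ {var:.4f}")
print(f"surrogate(0.3) = {surrogate(0.3):.4f} vs response(0.3) = {response(0.3):.4f}")
```

The surrogate can then be sampled cheaply in place of the simulation when estimating probabilities or tail statistics.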
DAKOTA 4.2 provides significant usability improvements, including a newly designed input parser, additional method tutorials, and examples demonstrating coupling DAKOTA to parallel simulation codes for analysis. These examples will be used in upcoming training classes at several sites. There is also improved platform support for Macintosh and Windows. Finally, Version 4.2 allows more convenient and robust integration into other software libraries, such as Trilinos and Xyce, with special emphasis on efficiency for large-scale applications.
(Contact: Brian Adams)
November 2008
Gemini/SIERRA-Mechanics Coupling
Sandia has recently entered into a collaboration with the Navy to couple the modeling capabilities of Sandia's SIERRA-Mechanics software with the Navy's Gemini software. SIERRA-Mechanics can simulate the elastic/plastic deformation of complex solid mechanics models under dynamic loads using explicit finite element methods. Gemini is a hydrocode that solves the Euler equations in a fluid domain using Godunov finite difference methods and was developed at the Naval Surface Warfare Center, Indian Head Division. A standard staggered solution procedure was chosen for the coupling algorithm, in which information is exchanged between the two codes at every time step of the dynamic simulation. The goal of the project is to demonstrate the ability to use the massively parallel computing resources available in Sandia's capacity platforms with the coupled codes to facilitate the simulation of shock waves on solid structures. An added constraint is security: Sandia is precluded from having source code access to Gemini. To achieve the coupling, a Sierra-Gemini coupler was written that allows SIERRA-Mechanics to communicate with external processes through the standard coupler interface, thus allowing Gemini to be delivered to Sandia as a binary executable and maintaining a heightened level of security. An initial demonstration of combined fluid/structure interaction of a plane shock wave on a cylinder in 2D is shown in the figure. This is known as the Huang cylinder problem and is based on a solution to the wave equation. The computational results compare favorably with the analytic solution. Future work will involve verification and validation of the coupling to ensure the validity of the results obtained, including convergence analysis and comparisons to both small-scale and large-scale experiments.
Figure: The picture shows the effect of a plane shock wave (50:1 pressure ratio) interacting with a linear elastic circular cylinder. Red indicates areas of high pressure.
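The staggered procedure itself is simple to express. The sketch below is a hypothetical in-process illustration with stand-in solver classes; the actual Sierra-Gemini coupler exchanges data with an external binary through the standard coupler interface rather than in-process objects.

```python
# A minimal sketch of a conventional staggered (partitioned) coupling loop,
# using hypothetical fluid/structure solver stubs; not the Sierra-Gemini coupler.
class FluidSolver:
    """Stand-in for the hydrocode (e.g., an Euler solver)."""
    def advance(self, dt, interface_position, interface_velocity):
        # ... advance the fluid one step with the structure as a moving
        # boundary, then return the pressure load on that boundary.
        return {"pressure": 0.0}

class StructureSolver:
    """Stand-in for the explicit structural dynamics code."""
    def advance(self, dt, surface_pressure):
        # ... integrate the structure one explicit step under the fluid load,
        # then return the updated interface position and velocity.
        return {"position": 0.0, "velocity": 0.0}

def staggered_coupling(fluid, structure, dt, n_steps):
    interface = {"position": 0.0, "velocity": 0.0}
    for step in range(n_steps):
        # 1. Fluid step uses the latest structural interface state.
        load = fluid.advance(dt, interface["position"], interface["velocity"])
        # 2. Structure step uses the fluid pressure load just computed.
        interface = structure.advance(dt, load["pressure"])
        # Data is exchanged once per time step (the "standard" staggered scheme).
    return interface

staggered_coupling(FluidSolver(), StructureSolver(), dt=1e-6, n_steps=10)
```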
(Contact: James Overfelt)
November 2008
Impact of ALEGRA on Army Research Laboratory Experimental Program
The ALEGRA shock physics code is now used regularly by the Army Research Laboratory (ARL) to simulate experiments for fundamental research and is helping guide their experimental program. Recently, ARL experimentalists preparing to publish new results on exploding wires asked ARL analysts to perform simulations of the work using ALEGRA (see Figure 1). The simulated results differed enough from the measurements that a detailed investigation was warranted. Additional ALEGRA simulations and experiments helped identify that the presumed experimental impedance mismatch between the oscilloscope and the feed was off by a factor of two. This discrepancy would probably not have been found without the help of the simulations. After the correction, the experiment showed improved consistency with the simulations (see Figure 2), which then allowed ARL to use the simulation results to start understanding the fundamental dynamics of the experiment. Remaining inconsistencies between experiment and simulation for these types of problems (see Figure 2) will be addressed in a four-year program led by ARL that is focused on verification and validation. The goal of the program is to assess and improve the predictive capability of the modeling and simulation tools used in ARL's research. The images are courtesy of Dr. B. Doney, ARL.


(Contact: Erik Strack)
October 2008
Next-generation Data Partitioning for Parallel Computing
Load balancing data among processors is crucial to the scalability of parallel codes. Software tools such as Zoltan provide partitioning algorithms that compute parallel data distributions. For matrix computations, the data partitioning is typically done in a one-dimensional (1D) fashion, i.e., the matrix is partitioned by rows (or columns, but not both). As part of research supported by the CSCAPES (Combinatorial Scientific Computing and Petascale Simulations) SciDAC institute, we have developed new sparse matrix partitioning algorithms that go beyond 1D to two-dimensional (2D) distributions and reduce the communication requirement substantially compared to the 1D approach. We have studied a particularly important kernel in scientific computing, sparse matrix-vector multiplication, which is the crux of many iterative solvers. A new algorithm based on nested dissection (recursive substructuring) has been developed. Experiments show that the method clearly outperforms 1D partitioning and is competitive (in quality) with other proposed 2D methods that have been deemed impractical because they are too expensive to compute. In contrast, our method takes about the same time to compute as traditional graph or hypergraph partitioning. On a test set of sparse matrices from diverse applications like finite element computations, circuit simulation, and text processing (informatics), we observed an average reduction in (predicted) communication volume of 15% for symmetric matrices, with reductions of up to 97% in extreme cases. The largest gains were for applications with highly irregular structure, like electrical circuit models, informatics, and matrices from constrained optimization. We believe our data partitioning will be useful in a variety of algorithms, not just matrix-vector multiplication.
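For reference, the communication volume that these methods reduce can be computed directly for a simple 1D row partition. The sketch below is a hypothetical example using SciPy with a toy matrix and a contiguous-block partitioner; it is not the Zoltan/Isorropia implementation or the new 2D algorithm.

```python
# A minimal sketch of how communication volume arises in a 1D (row-wise)
# partitioning of sparse matrix-vector multiplication y = A*x.
import numpy as np
import scipy.sparse as sp

def comm_volume_1d(A, n_parts):
    """Words of x that must be sent between parts when rows (and the
    matching x entries) are dealt out in contiguous blocks."""
    n = A.shape[0]
    owner = np.arange(n) * n_parts // n              # row index -> owning part
    A = A.tocsr()
    volume = 0
    for p in range(n_parts):
        rows = np.where(owner == p)[0]
        cols_needed = np.unique(A[rows, :].indices)  # x entries this part reads
        volume += np.count_nonzero(owner[cols_needed] != p)  # ...owned elsewhere
    return volume

# Example: a 5-point Laplacian-like matrix on a 100x100 grid.
grid = 100
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(grid, grid))
S = sp.diags([-1, -1], [-1, 1], shape=(grid, grid))
A = sp.kron(sp.eye(grid), T) + sp.kron(S, sp.eye(grid))
print("1D communication volume on 16 parts:", comm_volume_1d(A, 16))
```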
Our new partitioning algorithm is currently being implemented in the Isorropia package (supported by ASC). Isorropia provides partitioning and load-balancing services to Trilinos, mostly through an advanced Zoltan-Epetra interface. See http://trilinos.sandia.gov for more information. The planned 2009 release of Isorropia will additionally support the next-generation matrix partitioning methods, and will be easily accessible to Trilinos/Epetra users. This is joint work with Michael Wolf, U. of Illinois, Urbana-Champaign. We are interested in collaboration with application developers; please contact Erik Boman.
(Contact: Erik Boman)
September 2008
Mesh-Based Simulation of Complex Material Microstructure
Most engineering materials exhibit strong heterogeneity at the micro-scale, such as polycrystalline and/or multi-phase structure, inclusions, voids, and micro-cracks. Much of the complex, nonlinear response observed in these materials originates at this length scale. The drive for achieving predictive simulation capabilities thus creates a strong need for methods to accurately model the effects of these microstructural features. Unfortunately, traditional mechanics simulation methods generally require conformal discretization, and producing a fitted mesh of acceptable quality is extremely challenging and time consuming for the limited configurations where it is possible at all. To overcome this challenge, we have developed the FlexFEM software, a three-dimensional, parallel implementation of the Heaviside-enriched eXtended Finite Element Method (X-FEM) (Belytschko, 1999; Simone, 2006) for both transient dynamic and static applications in a coupled-physics setting. FlexFEM makes extensive use of Trilinos solvers and utilities (see http://trilinos.sandia.gov). The X-FEM eliminates the need for conformal discretization without loss of accuracy, greatly simplifying microstructural analysis (Fig. 1). Further, Heaviside-enriched X-FEM naturally incorporates an interface model for features such as cracks, grain boundaries, or domain walls, which dramatically increases the application space. A demonstration calculation on a random 3D polycrystal shows the strong effects of compliant grain boundaries on mechanical response (Fig. 2). In the calculation, the random microstructure is discretized trivially using a non-conformal, structured Cartesian grid. The FlexFEM software represents a major advance in modeling extremely heterogeneous systems. Development of this capability is continuing. Potential thermo-mechanics and multi-physics applications include: i) predicting material degradation with accumulation of interfacial damage due to thermal, mechanical and/or electrical cycling, ii) studying dynamic strength and spall in polycrystalline metals, and iii) simulating microstructurally engineered materials that are tailored for specific applications. Development of FlexFEM was funded by the CSRF element of the ASC program; the development team consists of Joshua Robbins (1435) and Thomas Voth (1433).
Figure 1: Example polycrystalline configuration. Note that the mesh does not conform to the microstructure.
Figure 2: Uniaxial tension of a model polycrystal. Color indicates displacement. Displacements are magnified by 10x for illustration. Note strong discontinuities at grain boundaries.
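In one dimension the Heaviside enrichment can be written down in a few lines: a single two-node element can represent a displacement jump at an interface located anywhere inside it by adding Heaviside-weighted degrees of freedom. The sketch below is a hypothetical illustration, not FlexFEM code; the interface location and degree-of-freedom values are made up.

```python
# A 1D sketch of Heaviside-enriched X-FEM: one linear element on [0, 1] carries
# a displacement jump at x_c without the mesh conforming to the interface.
import numpy as np

x_c = 0.37          # interface location inside the element (hypothetical)
u = (0.0, 1.0)      # standard nodal displacements
a = (0.5, 0.5)      # enriched (jump) degrees of freedom

def heaviside(x):
    return np.where(x >= x_c, 1.0, 0.0)

def displacement(x):
    N = np.array([1.0 - x, x])                  # standard linear shape functions
    # Shifted Heaviside enrichment: zero at the nodes, discontinuous at x_c.
    H_nodes = heaviside(np.array([0.0, 1.0]))
    enrich = np.array([N[i] * (heaviside(x) - H_nodes[i]) for i in range(2)])
    return N @ np.array(u) + enrich @ np.array(a)

for x in [0.0, 0.25, x_c - 1e-9, x_c + 1e-9, 0.75, 1.0]:
    print(f"u({x:.4f}) = {displacement(x):.4f}")   # jump visible across x_c
```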
(Contact: Joshua Robbins)
August 2008
Peridynamics in LAMMPS
Peridynamics is a generalized continuum theory that employs a nonlocal model of force interaction to describe material properties. In this context, nonlocal means that continuum points separated by a finite distance may exert force upon each other. This is accomplished by replacing the local stress/strain relationship of classical elasticity by a nonlocal integral operator that sums forces over particles separated by a finite distance. This integral operator is not a function of the deformation gradient, allowing for a more general notion of deformation than classical elasticity. Further, the nonlocality of the model makes it suitable as a multiscale material model, as its behavior varies in accordance to the length scale to which it is applied.
A particular meshless discretization of the peridynamic continuum model has the same general computational structure as molecular dynamics. This allowed its implementation within Sandia's molecular dynamics package, LAMMPS, and allows users familiar with molecular dynamics to effectively simulate continuum materials.
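As a rough illustration of that structure, the sketch below sums pairwise bond forces over all particles within a finite horizon. This is a hypothetical bond-based example, not the LAMMPS peridynamics implementation, which includes damage, volume corrections, and neighbor lists not shown here.

```python
# A minimal sketch of a bond-based peridynamic force sum (hypothetical).
import numpy as np

def peridynamic_forces(x_ref, u, horizon, c):
    """Each particle sums bond forces over all others within the horizon."""
    n = len(x_ref)
    f = np.zeros_like(u)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            xi = x_ref[j] - x_ref[i]                # reference bond vector
            if np.linalg.norm(xi) > horizon:
                continue                            # nonlocal, but finite range
            eta = u[j] - u[i]                       # relative displacement
            r = np.linalg.norm(xi + eta)
            s = (r - np.linalg.norm(xi)) / np.linalg.norm(xi)  # bond stretch
            f[i] += c * s * (xi + eta) / r          # force along deformed bond
    return f

# Tiny example: a line of particles under a uniform 1% stretch.
x_ref = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
u = 0.01 * x_ref
print(peridynamic_forces(x_ref, u, horizon=0.35, c=1.0))
```

The double loop over particle pairs within a cutoff is exactly the pattern molecular dynamics codes are built to execute efficiently.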
The peridynamics extensions made to the LAMMPS package are available for download from the LAMMPS web site, http://lammps.sandia.gov. This code represents the only publicly available peridynamic code. (Contacts: Michael Parks, Rich Lehoucq, Steve Plimpton, and Stewart Silling)
Figure 1: (a) Cut view of target during impact by projectile. (b) Top monolayer of brittle target showing fragmentation. This is a replication of an experiment described in (Silling, 2005) and also presented in (Parks, 2008). It simulates the impact of a rigid sphere on a homogeneous block of brittle material. The sphere has diameter 0.01 m and velocity of 100 m/s directed normal to the surface of the target. The target material has density 2200 kg/m3 and bulk modulus 14.9 GPa. The target is a cylinder of diameter 7.4 cm and thickness 0.25 cm. It was discretized as a 3D cubic lattice of particles with lattice constant 0.5 mm, and contains 103,110 particles.
(Contact: Michael Parks)
August 2008
Sandia Demonstrates Quad-Core Catamount on Oak Ridge LCF's Jaguar System
Sandia has completed the Catamount XT4 Risk Mitigation project by successfully running a quad-core version of the Catamount lightweight kernel operating system on ASCR's largest Cray XT4 system, Jaguar, located at the Oak Ridge Leadership Computing Facility. Four applications (GTC, VH1, POP, and AORSA) were tested at various job sizes for 24 hours. On average, Catamount performance was 3.8% better than Compute Node Linux; the percent improvement of Catamount over Compute Node Linux varied from –14% to 44%. In all cases, Catamount outperformed Compute Node Linux for the tests involving the higher core counts.
(Contact: Suzanne Kelly)
June 2008
Networks Grand Challenge Project Holds Inaugural External Advisory Board Meeting
The Networks Grand Challenge is a large Laboratory Directed Research and Development project focused on the analysis of large data sets with complex linkages. The project researchers are inventing new, scalable analysis methods which will be deployed to analysts at Sandia and elsewhere. The project is broad in scope, drawing on the talents of staff from six vice presidencies. The goal is to discover technology that can transform the analysis process, and thereby significantly impact national security.
The project team recently held a successful first meeting of its External Advisory Board (EAB). The members of the EAB are a mixture of leading academic and industrial researchers, along with application experts from the intelligence community. In their just released report, the members of the EAB laud the project for its vision and ambition and in their own words provide “an overwhelmingly positive evaluation”. They highlight a number of distinguishing features of the work including Sandia’s expertise in discrete math, graph algorithms and high performance computing, and also Sandia’s deep ties to the intelligence community. The Board particularly praised the project’s proactive focus on human factors considerations as pivotal to eventual impact.
In creating strong ties between technology and mission organizations, the Networks Grand Challenge hints at a future in which we collaborate to find new ways of providing exceptional service in the national interest.
(Contact: Bruce Hendrickson)
May 2008
UQ Methods in DAKOTA Employed by NuGET Team to Achieve Critical QMU Milestone
Uncertainty quantification (UQ) algorithms are a critical capability for performing Quantification of Margins and Uncertainty (QMU). The UQ needs of our users, coupled with the high cost of computational simulations, have led researchers in 1411 to develop more robust and efficient UQ algorithms, delivered in the DAKOTA framework. Second-order probability (nested sampling) is a differentiating capability in DAKOTA. Second-order probability allows one to propagate both aleatory (inherent variation) and epistemic (lack of knowledge) uncertainty. A common situation is one where some uncertain inputs can be characterized by probability distributions, but other uncertain inputs can only be characterized with intervals (e.g., any value between an upper and lower bound is possible). In this case, the analysis is done with a nested sampling approach, where the outer loop samples over the epistemic variables and the inner loop samples over the aleatory variables. The result of second-order probability is a family or ensemble of cumulative distribution functions (CDFs); see Figure 1. Each CDF represents an inner-loop sample conditioned on a possible value of the epistemic variables. The bounds on the entire family at a particular response threshold represent the epistemic uncertainty in where the true CDF value may fall. Second-order probability is being used in many Advanced Simulation and Computing (ASC) milestones, for example, to assess epistemic ranges on margins at particular threshold levels.
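The nested-loop structure can be sketched in a few lines. The example below is hypothetical, with a made-up response function and input characterization; it is not DAKOTA's second-order probability implementation.

```python
# A minimal sketch of second-order probability (nested sampling).
import numpy as np

rng = np.random.default_rng(0)

def response(aleatory, epistemic):
    # Hypothetical model: one normally distributed input, one interval input.
    return aleatory + epistemic * aleatory**2

n_outer, n_inner = 20, 1000
epistemic_interval = (0.5, 1.5)          # only bounds are known (epistemic)

cdf_family = []
for _ in range(n_outer):                 # outer loop: sample the interval
    e = rng.uniform(*epistemic_interval)
    a = rng.normal(loc=1.0, scale=0.2, size=n_inner)   # inner loop: aleatory
    cdf_family.append(np.sort(response(a, e)))          # one empirical CDF

# Epistemic bounds on the CDF at a response threshold of interest.
threshold = 2.0
probs = [np.mean(samples <= threshold) for samples in cdf_family]
print(f"P(response <= {threshold}) lies in [{min(probs):.3f}, {max(probs):.3f}]")
```

Each pass through the outer loop produces one member of the CDF ensemble; the spread of the ensemble at a threshold is the epistemic range reported for margins.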
The UQ algorithms in DAKOTA have played a crucial role in assessing uncertainties in stockpile materials, components, systems, and environments, and their effect on weapon performance, safety, and reliability. The Neutron Gamma Energy Transport (NuGET) team employed second-order probability methods for the ASC Level II Milestone titled “NuGET QMU Methodology.” The goal of this milestone was to assess the influence of both aleatory and epistemic uncertainties in hostile and fratricide scenario predictions. The second-order probability method played an important role in downselecting experiments and demonstrating compliance with Stockpile-to-Target Sequence (STS) requirements. The ensembles of CDFs enabled the calculation of interval bounds on margins and failure probabilities, and demonstrated how this methodology can identify components that might have possible problems or need additional analysis to reduce epistemic uncertainty.
To help educate users about the UQ methods in DAKOTA, classes on DAKOTA 4.1 were held at SNL/NM and SNL/CA in April, training 35 users. Additional classes are planned to meet the demands of a growing user base.
(Contact: James Stewart)
May 2008
Petascale-ready version of CCSM's Atmospheric Model
The multi-lab SciDAC project, “Modeling the Earth System”, sponsored by the DOE's Office of Biological and Environmental Research, is focused on creating a first-generation Earth system model based on the Community Climate System Model (CCSM). The envisioned Earth system model will require petascale computing resources, so the project is also working to ensure the CCSM is ready to fully utilize DOE's upcoming petascale platforms. The main bottleneck to petascale performance in Earth system models is the scalability of the atmospheric dynamical core. Team members at Sandia, NCAR, and ORNL, led by Mark Taylor (1433), have thus been focusing on the integration and evaluation of new, more scalable atmospheric dynamical cores (based on cubed-sphere grids) into the CCSM. They have recently developed a new formulation of the highly scalable spectral element dynamical core that locally conserves both mass and energy and has positivity-preserving advection. They have successfully integrated this dynamical core into the CCSM. This work allows the atmospheric component to use true two-dimensional domain decomposition for the first time, leading to unprecedented scalability, demonstrated out to 96,000 processors with an average grid spacing of 25 km. Even better scalability will be possible when computing at a global resolution of 10 km, DOE's long-term goal. The team has completed extensive verification work using standardized atmospheric tests with prescribed surface temperatures and without the CCSM land, ice, or ocean models. As part of this work, they have performed detailed mesh convergence studies, including a record-setting simulation using 64,000 processors of BG/L. The team is currently focused on coupling with the other CCSM component models.
Illustration 1: The cubed-sphere grid used on each hybrid-pressure surface by the spectral element atmospheric model component of the CCSM.
(Contact: Mark Taylor)
May 2008
DAKOTA 4.1 Extends Capabilities for Risk-Informed Decision Making
Version 4.1 of the DAKOTA software toolkit was released and deployed this fall. DAKOTA is used broadly within the Tri-Lab Advanced Simulation and Computing (ASC) community for sensitivity, uncertainty, and design studies involving ASC simulations of NW components and systems. Version 4.1 deploys major new capabilities in uncertainty quantification (UQ), optimization, and optimization under uncertainty (OUU), which emphasize emerging needs in verification and validation (V&V) and risk-informed decision making. In particular,
- New UQ methods include generalized polynomial chaos expansions, efficient global reliability analysis, incremental sampling, and adaptive importance sampling; and extended UQ methods include second-order reliability, variance-based decomposition, and evidence theory. The new methods bridge a critical gap in accuracy and efficiency that has existed with current production methods and emphasize smart, adaptive approaches with verifiable accuracy.
- New optimization methods include multipoint hybrid methods, global surrogate-based optimization, efficient global optimization, and dynamic optimizer plug-ins; and extended optimization capabilities include trust-region surrogate-based optimization, DIRECT, OPT++, and AMPL. These new methods emphasize the efficient identification of globally-optimal designs.
- The optimization and UQ developments enable new OUU methods, including polynomial chaos-based OUU, global reliability-based OUU, epistemic OUU, and model calibration under uncertainty. These new methods enable risk-informed analysis and design.
These new DAKOTA algorithms are being applied to the probabilistic design of microsystems in order to define shapes that are both robust and reliable with respect to manufacturing uncertainties; see Figure 1.
Figure 1. Micrograph of bi-stable MEMS device (left) with optimized force-displacement profile (right). Prescribed reliability level for actuation force was achieved while reducing sensitivity to manufacturing uncertainties by an order of magnitude.
DAKOTA is used with Sandia's high performance simulation codes such as Alegra, Xyce, and SIERRA, and is impacting Sandia mission areas in Defense Programs, Qualification Alternatives to the Sandia Pulsed Reactor (QASPR), Microsystems and Engineering Science Applications (MESA), High Energy Density Physics (HEDP), National Infrastructure Simulation and Analysis Center (NISAC), and others. DAKOTA is open-source and has approximately 4000 registered installations from sites all over the world. DAKOTA is led by Department 1411 and has contributors from across Centers 1400, 1500, 6600, and 8900. See http://www.cs.sandia.gov/DAKOTA/software.html for more information.
(Contact: Jim Stewart and Mike Eldred)
April 2008
Bundle-Exchange-Compute (BEC): A New Parallel Programming Environment
BEC (beta version) was publicly released on April 9, 2008. A BEC tutorial will be taught by Mike Heroux (1414) and Zhaofang Wen (PI, 1423) on April 10th to a Sandia audience. BEC is a new parallel programming model for high-performance scientific application development, jointly developed by Sandia and Syracuse University (funded by Sandia/CSRF). Using BEC, Sandia parallel programmers can potentially increase their programming productivity by a factor of 3X or more. Sandia's large-scale scientific applications need to run on high-end parallel computers, each consisting of thousands of workstations interconnected together. A parallel application program needs to make all these workstations work together to solve a single (scientific) problem as fast as possible. To exploit the computing power of a parallel computer, computational work should be divided as evenly as possible among the workstations; but computation cannot be done without data, so it is also important to partition the data among these workstations and to move the data between the workstations ("communication") as needed by the computation at the right moment ("synchronization"). For more than a decade, programmers at Sandia (and everywhere else) have been using a parallel programming environment called MPI. MPI application developers, often domain scientists, are forced to handle the difficult tasks of data partitioning, communication, and synchronization, which are low-level machine details totally irrelevant to their domain scientific expertise.
BEC frees programmers from these low-level machine details and allows them to focus on their applications and algorithms, making parallel programs much easier to write.
Comparisons of the same applications written in BEC and MPI show that the BEC and MPI programs have similar performance, but the BEC programs are much simpler and easier to write. The table and chart below give an example: a sparse linear solver using the Conjugate Gradient (CG) method with data from a diffusion problem on a 3D chimney domain. Table 1 is a code size comparison of BEC vs. MPI, excluding empty, comment, debugging, and '#' lines. Figure 1 shows a comparison of the parallel execution time of the BEC and MPI programs.
Task (in CG application)  | Lines of code (BEC) | Lines of code (MPI)
Computation related       | 60                  | 87
Communication related     | 11                  | 277
Whole program             | 233                 | 733

Table 1: Code size comparison: BEC vs. MPI
Figure 1: Comparison of parallel execution time of the BEC and MPI programs
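To make concrete which solver steps the "communication related" lines in Table 1 refer to, here is a plain serial CG sketch (a hypothetical stand-in, not the BEC or MPI code measured above); the comments mark where a distributed-memory implementation must communicate.

```python
# Serial conjugate gradient sketch; comments flag where an MPI version needs
# communication, which is the bookkeeping BEC aims to hide from the programmer.
import numpy as np

def cg(A, b, tol=1e-8, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x          # distributed: A @ x needs a halo/boundary exchange
    p = r.copy()
    rs_old = r @ r         # distributed: dot product needs an all-reduce
    for _ in range(max_iter):
        Ap = A @ p         # distributed: halo exchange again
        alpha = rs_old / (p @ Ap)   # distributed: all-reduce
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r     # distributed: all-reduce
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Small SPD test problem (1D Laplacian).
n = 50
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = cg(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))
```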
(Contact: Zhaofang Wen)
April 2008
3D Fluids-DFT Calculations of Peptide Assemblies in Bilayers
Calculation of the structure of peptides and their assemblies is challenging when they are found in a small-molecule (water) based electrolyte. The problem becomes much more difficult when the peptides are embedded in lipid bilayer membranes. In this case the dense fluid medium of the bilayer membrane makes statistical sampling of embedded peptide assemblies especially challenging. We have recently succeeded in first-of-a-kind calculations of the three-dimensional structure of a fluid bilayer in the vicinity of an assembly of anti-microbial peptides (AMPs). In these calculations, we were able to predict the existence of membrane-spanning pores at the intersection of an assembly of 6 AMPs. This is significant because the primary mode of action of many AMPs is to increase the porosity of a cell membrane, resulting in cell death. In order to solve these kinds of problems, we have developed the Tramonto software (see http://software.sandia.gov/tramonto). Tramonto computes density profiles both for fluid systems where surfaces cause the fluid to be inhomogeneous (e.g., fluids in zeolites) and for fluids where intra-molecular interactions result in self-assembled structures (e.g., fluid lipid bilayers found in cell membranes). The underlying theories solved in Tramonto include several types of modern nonlocal density functional theories that retain the length scale of a monomer. Thus a variety of coarse-grained models can be solved with Tramonto depending on the definition of the monomer. The solution of these complex theories is facilitated by specialized parallel solver methods that allow the code to run very efficiently on massively parallel computers. These calculations demonstrate an unprecedented level of complexity in modeling biological membranes such as cell boundaries, enabling new scientific insight into the details of drug-membrane interactions with implications for new antibiotics and drug-delivery mechanisms.
While the discussion here focuses on a biological application, the larger story is the successful development of 3D Fluids-Density Functional Theory (DFT) capabilities. While the Quantum-DFT community is quite large and several well-established software packages exist, Tramonto is one of only a few codes capable of 3D Fluids-DFT calculations for predicting fluid structure. It is the only code that combines nonlocal theories that capture the monomer length scale with highly tuned parallel computing algorithms that include engineering analysis algorithms (such as the Trilinos-LOCA stability and bifurcation analysis tools, see http://trilinos.sandia.gov) for materials design. This software opens many opportunities for new kinds of investigations across many disciplines in materials modeling, nanoscience, and biology. The Tramonto team includes Laura Frink (Colder Insights, SNL contractor), Amalie Frischknecht (1814), Mike Heroux (1416), and Andrew Salinger (1414).
Figure 1: 3D density profiles from Tramonto calculation of AMPs in lipid bilayers. In both cases, the red-to-white contours show the solvent, the yellow-to-black contours show the lipid tail group, and the blue lines show contours for the head group species. On the left, the lipid bilayer exhibits nanoscale structure due to the presence of the AMPs, as indicated by the horizontal yellow-to-black striping; however, the bilayer remains intact. On the right, a nanoscale pore forms between the AMPs that form the assembly. Nanoscale solvent structure is observed (see the red-to-white bead pattern that runs through the bilayer). In addition, head group densities are now nonzero at the interface of the lipid tails and the solvent in the nanopore. This toroidal structure arises naturally in the Fluids-DFT calculation.
(Contact: Scott Collis)
April 2008
Material Failure Improvements in ALEGRA
The ALEGRA shock physics code is being used by the Army Research Laboratory (ARL) to simulate important experiments for advanced armor development. These simulations involve the impact of a metal rod into targets of ceramic materials and metals, all subject to high strain rates and material failure. In early January, long-standing algorithmic issues with material failure modeling were causing early termination of these simulations at less than 20 microseconds. To address this situation, Erik Strack, Mike Wong, Ed Love, and Bill Rider (all 1431) collaborated in an intense effort to develop, implement, and test improvements to these algorithms. The major improvements were implementation of a new void insertion model to handle excessive tensile stresses and extension of the isentropic multimaterial algorithm to accommodate void inserted during tension relief. The new void insertion model is intended to replace the existing pressure-dependent fracture model, which used a Newton iteration scheme to converge to a relaxation pressure. Under a number of circumstances observed in these simulations, however, the derivative of the pressure-density function from the equation of state was sufficiently inaccurate that the iteration would fail to converge, resulting in eventual failure of the calculation. The new model provides a less efficient but more robust backup iteration scheme and logic to detect the conditions necessary to switch schemes. In addition, error checking, convergence tests, and diagnostics were substantially improved. The void volume evolution is now treated in a manner compatible with both a modern multi-material treatment and the improved fracture algorithm. This resulted in additional enhancement to the robustness of the simulations and provided physically realistic and meaningful results.
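The general pattern of pairing a fast Newton iteration with a slower but guaranteed fallback can be sketched generically. The example below uses a made-up scalar function; it is not ALEGRA's void insertion model or its equation-of-state calls.

```python
# Generic Newton iteration with a bisection fallback (hypothetical example).
def solve_with_fallback(f, dfdx, x0, lo, hi, tol=1e-10, max_newton=25):
    """Try Newton first; if it stalls, diverges, or leaves [lo, hi]
    (symptoms of an inaccurate derivative), fall back to bisection."""
    x = x0
    for _ in range(max_newton):
        fx, dfx = f(x), dfdx(x)
        if abs(fx) < tol:
            return x, "newton"
        if dfx == 0.0:
            break                      # derivative unusable: switch schemes
        x_new = x - fx / dfx
        if not (lo <= x_new <= hi):
            break                      # Newton wandered out of bounds
        x = x_new
    # Robust (but slower) bisection fallback on a bracketing interval.
    a, b = lo, hi
    fa = f(a)
    for _ in range(200):
        m = 0.5 * (a + b)
        fm = f(m)
        if abs(fm) < tol:
            return m, "bisection"
        if fa * fm < 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b), "bisection"

# Example: the derivative vanishes at the starting guess, so the fallback engages.
f = lambda p: p**3 - 2.5
df = lambda p: 3 * p**2
print(solve_with_fallback(f, df, x0=0.0, lo=0.0, hi=5.0))
```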
These improvements were implemented in ALEGRA and resulted in successful ARL armor simulations running to completion at up to 180 microseconds, representing a major milestone that ARL has been striving to achieve since 2001. The accompanying figure shows ceramic failure patterns in agreement with those observed experimentally.
(Contact: Erik Strack, Mike Wong, Ed Love, and Bill Rider)
April 2008
Red Storm's 284 TeraFlop Upgrade: The Inside Story
On February 5, 2008, a news release was issued by Cray Corporation and publicized on HPCwire about the agreement between Sandia and Cray to upgrade our NNSA/ASC Red Storm system to 284 TeraFlops. This agreement was also described in a Feb. 15, 2008 Sandia Lab News article. Our upgrade is scheduled for the summer of 2008 and will be the second major upgrade to Red Storm. The first upgrade occurred in the fall of 2006 and brought the system from an initial performance level of 41 TeraFlops to the current theoretical peak performance of 124 TeraFlops. A critical concern for all massively parallel supercomputers is scalability. The attention to interconnection network performance and scalable system software gives Red Storm very good application scalability, that is, the ability to have application performance scale up to the entire system. The inside story implied in these recent news reports is a different dimension of supercomputer scalability. This second upgrade to Red Storm exploits the ability to grow the system to take advantage of improvements in processor technology. The initial system used 10,368 AMD Opteron processors at 2.0 GHz; the current system uses 12,960 AMD dual-core Opteron processors at 2.4 GHz. This upgrade will replace about 48% of these dual-core processors with the latest generation of quad-core AMD Opterons at 2.2 GHz, which deliver four floating point operations per clock versus the two floating point operations per clock of the current dual-core and original single-core Opterons. If the other 52% of the system were also upgraded, the theoretical peak performance of Red Storm would be 456 TeraFlops. Finally, these successive phases of processor upgrades are enabled by Sandia's collaboration with Cray to upgrade the Catamount system software to support dual-core and quad-core Opteron processors.
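The quoted peaks are consistent with simple flops arithmetic, as the sketch below shows; the exact socket counts are assumptions inferred from the 48%/52% split quoted above.

```python
# Rough theoretical-peak arithmetic for Red Storm's upgrade phases (a sketch;
# socket counts are inferred from the 48%/52% split quoted in the article).
sockets = 12_960

# Current system: dual-core Opterons, 2.4 GHz, 2 flops/clock per core.
current = sockets * 2 * 2.4e9 * 2                        # ~124 TeraFlops

# Planned upgrade: ~48% of sockets become quad-core, 2.2 GHz, 4 flops/clock.
quad = round(0.48 * sockets)
dual = sockets - quad
upgraded = quad * 4 * 2.2e9 * 4 + dual * 2 * 2.4e9 * 2   # ~284 TeraFlops

# Hypothetical full upgrade of all sockets to quad-core parts.
full = sockets * 4 * 2.2e9 * 4                           # ~456 TeraFlops

for label, flops in [("current", current), ("after upgrade", upgraded),
                     ("full quad-core", full)]:
    print(f"{label}: {flops / 1e12:.0f} TeraFlops")
```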
(Contact: James Ang)
March 2008
Nanoparticle Simulations with LAMMPS
Sandia's molecular dynamics package LAMMPS has been modified to enable more efficient simulations of large-scale nanoparticle models that contain large variations in particle size, e.g., for modeling a nanoparticle suspension with a background solvent. This required new algorithms for neighbor finding and inter-processor communication which are able to search for a minimum number of nearby particles without needless distance computations. For systems with a 20:1 size ratio between nanoparticles and solvent particles, the new version of the code is over 100x faster, in either serial or parallel. This is enabling large simulations of suspensions to measure rheological properties such as diffusion coefficients and viscosity. These are of interest in manufacturing processes such as extrusion and coating, where suspensions are used to disperse nanoparticles and assemble nanostructured materials. This work has been funded by NINE (National Institute for Nano Engineering) and a funds-in CRADA with 5 companies interested in manufacturing issues for nano and colloidal suspensions.
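The key idea, searching each pair of size classes with a cutoff and bin size matched to that pair rather than to the largest particles, can be sketched as follows. This is a hypothetical illustration, not the LAMMPS algorithm; the cutoffs and particle counts are made up.

```python
# A sketch of size-aware neighbor finding: bin one set of particles with a cell
# size equal to the relevant pair cutoff, so small-small searches never sweep
# the large-particle cutoff volume.
import numpy as np
from collections import defaultdict
from itertools import product

def neighbors(points_a, points_b, cutoff):
    """For each point in points_a, list indices of points_b within cutoff
    (a point finds itself when the two sets are the same; real codes skip i==j)."""
    cells = defaultdict(list)
    for j, p in enumerate(points_b):
        cells[tuple((p // cutoff).astype(int))].append(j)
    result = []
    for p in points_a:
        c = (p // cutoff).astype(int)
        found = []
        for offset in product((-1, 0, 1), repeat=3):      # 27 neighboring cells
            for j in cells.get(tuple(c + np.array(offset)), []):
                if np.linalg.norm(points_b[j] - p) <= cutoff:
                    found.append(j)
        result.append(found)
    return result

rng = np.random.default_rng(1)
solvent = rng.uniform(0, 40.0, size=(5000, 3))     # many small particles
nano = rng.uniform(0, 40.0, size=(20, 3))          # a few large particles

# Different cutoffs (and hence bin sizes) per pair of size classes; values
# here are illustrative only.
ss = neighbors(solvent, solvent, cutoff=1.0)       # small-small: small bins
sl = neighbors(solvent, nano, cutoff=5.5)          # small-large: larger bins
ll = neighbors(nano, nano, cutoff=10.0)            # large-large: tiny lists
print(len(ss[0]), len(sl[0]), len(ll[0]))
```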
The LAMMPS parallel molecular dynamics package is an open-source code distributed world-wide. Contributors to LAMMPS within Sandia are in centers 1100, 1400, 1500, and 1800. See lammps.sandia.gov for more details.
Figure: A snapshot of nanoparticles in explicit solvent.
(Contacts: Steve Plimpton or Scott Collis)
March 2008
New Solver Research Enables Simulations of Large Electrical Circuits
Recent solver research, focusing on matrix ordering algorithms and block-structured preconditioning, has significantly improved the capability to numerically simulate the response of large-scale electrical circuits for stockpile systems. Xyce, a massively-parallel circuit simulation code, is used to predict the electrical response of large, integrated circuits, particularly for hostile radiation environments. Xyce models individual circuit elements as nonlinear differential equations, which are assembled into a large system of nonlinear differential equations to form the full circuit problem. This set of equations must be implicitly integrated forward in time in order to determine the circuit response. The preconditioner research, led by David Day (1414) and Heidi Thornquist (1437), identified and implemented a permutation operation that re-orders the Jacobian matrix of the governing equations, leading to a block-structured and diagonally-dominant matrix that is amenable to preconditioning and efficient computation of iterative solutions.
Circuit models that include integrated circuit interconnect parasitics can result in poorly conditioned linear systems, which are traditionally challenging for iterative solvers. This issue typically becomes more significant as integrated circuit feature sizes shrink, because parasitic capacitive and inductive effects are more likely to dominate overall circuit behavior.
For newer integrated circuit technologies traditional preconditioners often perform poorly or fail, so code developers and analysts have had to rely on direct solvers, which are memory-intensive and not scalable to very large numbers of processors. Reliance on direct solvers has therefore limited the size and complexity of circuit problems that Xyce could solve. The new preconditioner, implemented in the Trilinos framework, has enabled the use of scalable and efficient iterative solvers, thereby allowing for high fidelity simulations of parasitic effects in integrated circuits.
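The reordering idea, grouping unknowns by strongly connected components of the Jacobian's connectivity graph so the matrix becomes block structured (as in Figure 1), can be sketched with standard sparse-graph tools. The example below uses a hypothetical toy matrix, not Xyce's Jacobian or the Trilinos preconditioner itself.

```python
# Sketch: permute a Jacobian-like sparse matrix by strongly connected components
# of its connectivity graph to expose block structure (hypothetical toy matrix).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

# A small unsymmetric "circuit-like" matrix with hidden block structure.
rng = np.random.default_rng(3)
P = rng.permutation(9)
blocks = sp.block_diag([sp.random(3, 3, density=0.9, random_state=k) + sp.eye(3)
                        for k in range(3)]).tolil()
blocks[1, 5] = 1.0                      # one-way coupling between blocks
A = blocks.tocsr()[P][:, P]             # scramble rows/columns

# Find strongly connected components of the directed connectivity graph.
n_comp, labels = connected_components(A, directed=True, connection='strong')

# Permute unknowns so each component's rows/columns are contiguous.
order = np.argsort(labels, kind='stable')
A_blocked = A[order][:, order]
print(f"{n_comp} strongly connected blocks")
print((np.abs(A_blocked.toarray()) > 0).astype(int))   # blocked sparsity pattern
```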
Figure 1. The reduced block connectivity matrix is shown for a Jacobian matrix from the transient simulation of a circuit with 109,345 unknowns and strong parasitics. Graph algorithms are used to find strongly connected blocks.
Figure 2. The reduced block connectivity graph from Figure 1 is shown here partitioned for four processors. The Isorropia package and Zoltan parallel load balancing utility were used to determine the reordering.
(Contacts: David Day or Heidi Thornquist)
February 2008