
Zoltan Quality Assurance

This document describes the Software Quality Assurance (SQA) policies and procedures used in the Zoltan project. Zoltan developers at Sandia or under contract to Sandia are required to follow these software development policies.
Quality Policy
Quality Definition
Classification of Defects
Release Policy
Software Quality Tools
Software Quality Processes
Zoltan's implementation of the ASC Software Quality Engineering Practices

Quality Policy

Sandia's ASC Quality Management Council (AQMC) developed and manages the Quality Assurance Program (QAP) for Sandia's ASC program. The AQMC chartered the development of the Sandia National Laboratories Advanced Simulation and Computing (ASC) Software Quality Plan, Part 1: ASC Software Quality Engineering Practices, Version 2.0 document (SAND 2006-5998) as the practical SQA guidance for projects like Zoltan. A companion document, Sandia National Laboratories Advanced Simulation and Computing (ASC) Software Quality Plan, Part 2: Mappings for the ASC Software Quality Practices (SAND 2006-5997), shows how these practices satisfy corporate policies including CPR001.3.6, Corporate Software Engineering Excellence, and DOE/NNSA orders 414.1C and QC-1 rev 10.

The Zoltan project is committed to a program of quality improvement in compliance with the ASC Software Quality Engineering Practices document. The Zoltan Team Leader is the owner of the Zoltan quality system. Zoltan developers at Sandia or under contract to Sandia are required to follow these software development practices. The Zoltan team shall participate in all reporting processes, audits, and assessments as directed by the AQMC.

Quality Definition

QC-1 rev 10 defines quality as "the degree to which customer requirements are met."

The Zoltan project accepts the following definition of quality: "the totality of characteristics of a product or service that bear on its ability to satisfy stated or implied needs." This is Juran's "fitness for use" definition of quality (ANSI/ASQC A8402-1994). This superior definition of quality fully satisfies the QC-1 rev 10 definition. It is also more useful in a research environment, where the requirements are derived from a research proposal rather than directly from customers and end users.

Classification of Defects

The Zoltan project accepts the following system of classification of defects:
Critical: A defect that could lead to loss of life, significant environmental damage, or substantial financial loss.
Major: A non-critical defect that significantly impacts Zoltan's fitness for use.
Minor: A (non-critical, non-major) defect that reasonably impacts Zoltan's fitness for use.
Incidental: Any other defect that does not reasonably reduce Zoltan's fitness for use.

Release Policy

Only the Zoltan team leader may authorize (certify) a release. The Zoltan team leader shall not release software with any known critical or major defects. User registration shall allow the Zoltan team to notify all Sandia and ASC users and to recall their defective software if a critical or major defect is discovered after release. The Zoltan team leader may determine that it is acceptable to release software with known minor or incidental defects.
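
The defect classes and the release rule above can be read as a simple gate. The following is only an illustrative sketch in Python, not part of the Zoltan quality system: the names Severity, blocking_defects, and may_release are hypothetical, and the sketch assumes that known defects are available as (identifier, severity) pairs.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Defect classes from the classification above, least to most severe."""
    INCIDENTAL = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


def blocking_defects(known_defects):
    """Return the defects that forbid a release: any major or critical one."""
    return [d for d in known_defects if d[1] >= Severity.MAJOR]


def may_release(known_defects, leader_accepts_lesser_defects=False):
    """Hypothetical release gate for a list of (identifier, Severity) pairs.

    Known major or critical defects always block the release. Known minor
    or incidental defects block it unless the team leader has decided that
    releasing with them is acceptable.
    """
    if blocking_defects(known_defects):
        return False
    if known_defects and not leader_accepts_lesser_defects:
        return False
    return True
```

In this sketch the team leader's judgment is reduced to a single flag; in practice the decision is made per release by the Zoltan team leader.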

Software Quality Tools

Because of the small scale of the Zoltan Project, only a few simple tools are required for use by Zoltan developers:

CVS: maintains code, documentation, meeting notes, emails, and QA program artifacts;
Purify, PureCoverage, Quantify (Rational), Valgrind, gdb: for dynamic code testing, coverage measurements, and performance analysis;
Bugzilla: tracks bugs, requests for changes, and enhancements;
Mailman: creates email lists to automatically notify users by area(s) of interest;
Makefiles: ensure proper compilation and linking for all supported platforms; and
Zoltan Test Script: runs integration, regression, release and acceptance testing.

Software Quality Processes

Bug Reporting, Issue Tracking, Enhancement Requests: All of these items are now entered directly into Bugzilla by developers and users. This "process" is built into the tool. Detailed instructions for using Bugzilla are found on the Zoltan web page. Bugzilla also provides query and report features for tracking the status of entered items.

A process is defined as a sequence of steps performed for a given purpose (IEEE Std. 610.12). Zoltan's other processes are defined as checklists because checklists are one of the seven fundamental quality tools. These checklists are also the primary artifact created when following a process. Currently the following processes are defined:

Development: (not currently used) defines the software development process including requirements, design, implementation, testing, reviews, and approvals;
Release: defines the release process including testing requirements and creation of the release product;
Request: defines the process of capturing user requests for new features;
  Note: this process is now obsolete. Request processes in progress may continue until complete but new requests should use Bugzilla;
Requirement: defines the process of capturing user comments that may become requirements after review and approval;
  Note: this process is now obsolete. Requirement processes in progress may continue until complete but new requirements should use Bugzilla;
Review: defines the materials reviewed prior to acceptance for Zoltan release;
  Note: Developers are encouraged to use Bugzilla to enter the specific review process rather than use the Review checklist. At this time this is a trial effort and either method may be used.
Third Party Software: defines the steps required to obtain, manage, use, and test software created outside of Zoltan and the ASC program; and
Training: defines the material a new developer must read, the skills that must be demonstrated, and the computer accounts that must be obtained.

Zoltan's software quality process checklists define how work may be performed, including process ownership, authorization to perform, activities and their sequence (when sequencing is required), process instructions, metrics, and identification of who performed each activity.

The only allowed source for process checklists is Zoltan's CVS repository in the SQA_templates directory (under Zoltan_Internal). A Zoltan developer initiates a process by obtaining the current CVS version of the process, renaming it, and committing the renamed process checklist back into CVS in an appropriate directory on the same day. The process may continue under this committed version even if its original process is later superseded, unless the Team Leader specifically requests otherwise. After one or more activities are completed, the process checklist is updated to reflect the results and committed back to CVS (with appropriate comments). A process is completed when all required activities, including reviews and approvals (as necessary), are completed and committed to CVS. The final CVS comment should indicate that the process is complete.
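
The checklist lifecycle described above (record who performed each activity, commit after each update, finish when every required activity is done) can be pictured with a small model. This is a sketch under stated assumptions, not the format of the actual checklists: the classes Activity and ProcessChecklist are hypothetical, and the commit_log list merely stands in for CVS commit comments.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One checklist line; performed_by stays empty until the work is done."""
    name: str
    required: bool = True
    performed_by: str = ""

    @property
    def done(self):
        return bool(self.performed_by)


@dataclass
class ProcessChecklist:
    """Hypothetical in-memory stand-in for a checklist kept under CVS."""
    name: str
    activities: list = field(default_factory=list)
    commit_log: list = field(default_factory=list)  # stands in for CVS comments

    def record(self, activity_name, performed_by):
        """Mark an activity as performed and note the update in the log."""
        for activity in self.activities:
            if activity.name == activity_name:
                activity.performed_by = performed_by
                self.commit_log.append(
                    f"{activity_name} completed by {performed_by}")
                return
        raise KeyError(activity_name)

    def is_complete(self):
        """A process is complete when every required activity is done."""
        return all(a.done for a in self.activities if a.required)
```

For example, a checklist with a required test run and a required approval reports is_complete() as False after only the test run is recorded, and True once the approval is recorded as well.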

Zoltan's implementation of the ASC Software Quality Engineering Practices

The following is a brief description for Zoltan developers of the Zoltan project's implementation of the ASC Software Quality Engineering Practices (SAND 2006-5998):

PR1. Document and maintain a strategic plan.
The Zoltan web page has a direct hyperlink to the Zoltan Project Description defining its mission and philosophy. The Zoltan project has a strong association with the Trilinos project, through which the two projects develop common software engineering practices and share appropriate tools and experience.

PR2. Perform a risk-based assessment, determine level of formality and applicable practices, and obtain approvals.
The Zoltan project has an approved level of formality (medium) for its deliverable software. Its biggest technical risk results from providing parallel solutions to NP-hard partitioning problems. Technical risks are mitigated by collaborations within Sandia and internationally. The most significant non-technical risk is the conflicting priorities of Zoltan developers working on many other projects simultaneously.

PR3. Document lifecycle processes and their interdependencies, and obtain approvals.
The Zoltan project follows the Trilinos Software Lifecycle Model (SAND 2006-6929). It also follows the ANSI/ASQ Z1.13-1999 standard, Quality Guidelines for Research, which is compatible with the research phase in the Trilinos Lifecycle Model.

PR4. Define, collect, and monitor appropriate process metrics.
The Zoltan project is committed to complying fully with the new and evolving AQMC requirements for collecting and reporting "defect" metrics. Other metrics identified by Zoltan's continual process improvement (PR5) will be implemented.

PR5. Periodically evaluate quality problems and implement process improvements.
The Zoltan project has built the Deming/Shewhart process improvement cycle PDCA (Plan, Do, Check, Act) into all of its process checklists. This is the most effective process improvement technique known. It is recommended by ISO 9001:2000.

PR6. Identify stakeholders and other requirements sources.
The Zoltan project's primary stakeholders are the ASC applications using Zoltan including SIERRA, ACME, ALEGRA/NEVADA, XYCE, and Trilinos.

PR7. Gather and manage stakeholders' expectations and requirements.
The Zoltan project's primary input on ASC applications' expectations and requirements comes from their communication of Zoltan's role in meeting their ASC milestones. Since Zoltan is an "enabling technology," these requirements are broadly stated performance improvement needs. The Zoltan team actively anticipates the future needs of the Sandia research community and develops load balancing software for them before they become formal requirements.

PR8. Derive, negotiate, manage, and trace requirements.
Zoltan project requirements normally derive from its funded research proposals which state research goals. This is a normal procedure in a research environment (see ANSI/ASQ Z1.13-1999). Periodic and final reports document the success in meeting these research goals.

PR9. Identify and analyze risk events.
All Zoltan developers should report any new or changed risks via the zoltan-dev email target for evaluation by the Team Leader.

PR10. Define, monitor, and implement the risk response.
The Zoltan team will create a corrective action plan whenever any condition threatens to adversely impact the Zoltan project resources or schedule.

PR11. Create and manage the project plan.
ANSI/ASQ Z1.13-1999 states that the research proposal is equivalent to a project plan in a research environment. The Team Leader assigns responsibilities, deliverables, resources, and schedules in order to manage the project.

PR12. Track project performance versus project plan and implement needed (corrective) actions.
The Team Leader periodically tracks responsibilities, deliverables, resources, and schedules in order to manage the project.

PR13. Communicate and review design.
The Zoltan architecture is fully documented in the Zoltan Developer's Guide. New features are first documented and reviewed in team discussions on the zoltan-dev email target. Prior to release, the design documentation is finalized in both the Zoltan Developer's Guide and the Zoltan User's Guide.

PR14. Create required software and product documentation.
Developers will follow the Zoltan Development Process Checklist.

PR15. Identify and track third party software products and follow applicable agreements.
Developers will follow the Zoltan Third Party Software Process Checklist.

PR16. Identify, accept ownership, and manage the assimilation of other software products.
Not applicable since Zoltan does not "assimilate" third party software.

PR17. Perform version control of identified software product artifacts.
All software and process artifacts are maintained under CVS as early as reasonable after their creation.

PR18. Record and track issues associated with the software product.
Developers will use Bugzilla to record and track issues.

PR19. Ensure backup and disaster recovery of software product artifacts.
Nightly backups, periodic offsite backups, and disaster recovery are services provided by the CSRI computer support staff. Disaster recovery has been performed successfully in response to real problems.

PR20. Plan and generate the release package.
Developers will follow the Zoltan Release Process Checklist.

PR21. Certify that the software product (code and its related artifacts) is ready for release and distribution.
The Zoltan Team Leader will certify any version of Zoltan for release via an email to the zoltan-dev target.

PR22. Distribute release to customers.
Zoltan files are released via a download from the Zoltan web site. The Zoltan Team Leader will make the download available after certification. (Research versions of the Zoltan software are directly available to collaborators for development.)

PR23. Define and implement a customer support plan.
(See PR6 for a list of ASC stakeholders.) The Zoltan team provides one-on-one training whenever requested and quickly responds to any user complaint.

PR24. Implement the training identified in the customer support plan.
See PR23 above. If additional training is ever requested, the Zoltan project will piggyback on the annual Trilinos Users Group meeting with a training session on using Zoltan.

PR25. Evaluate customer feedback to determine customer satisfaction.

PR26. Develop and maintain a software verification plan.
Developers are expected to create new tests for the Zoltan test suite when new features are added to Zoltan.

Currently, a new test framework based on FAST/EXACT is being implemented. Documentation about this test framework is under preparation. A process checklist will be developed around the steps required to add new tests to the suite and to run the suite.

PR27. Conduct tests to demonstrate that acceptance criteria are met and to ensure that previously tested capabilities continue to perform as expected.
This practice is a subset of the Zoltan Release Process Checklist.

PR28. Conduct independent technical reviews to evaluate adequacy with respect to requirements.
Developers will follow the Zoltan Review Process Checklist. ANSI/ASQ Z1.13-1999 states that peer-reviewed publications and conference presentations are a normal form of technical review in the research environment.

PR29. Determine project team training needed to fulfill assigned roles and responsibilities.
New developers will follow the Zoltan Training Process for new team members.

PR30. Track training undertaken by project teams.
Zoltan developers are encouraged to participate in the annual Trilinos Users Group (TUG) meeting, which provides SQA/SQE training sessions for developers. Attendance records are kept for this event and for any Zoltan team meetings that provide training. Sandia provides many other opportunities for training, including formal courses and periodic internal software developers conferences. External conferences (e.g., IPDPS and SIAM) are counted as technical training.
