SeqQuest Programming Style Manual

By: Peter A. Schultz < paschul@sandia.gov >
Sandia National Laboratories
Albuquerque, NM 87185


Mission statement

The purpose of this Style Manual is to enforce clarity and consistency in coding style for SeqQuest development. Clear code is easier to develop further - rebugging (to add features), and debugging (to find and fix the bugs that inevitably arise in an actively developed and used code) - and consistency of style is a vital aspect of code clarity. SeqQuest may have many conventions that are idiosyncratic, code that is not optimally efficient, and coding practices that may not reflect the latest thought in computer science. It is, however, very consistent in its application of its idiosyncratic conventions, it is highly portable, it scales well, and, key for the future viability of the code, it is easily read, understood, and modified. SeqQuest has its own distinctive style, which will be different from that of any other code, just as all other codes differ from each other. Any changes to SeqQuest should respect the coding conventions that already exist, so that a future developer will continue to have a code as consistent, accessible, and understandable as the current version. Mimic the patterns you see in the code, to the extent possible. This manual codifies many of these conventions and describes the basic rules to be followed in development of the current version of SeqQuest.

Overview

SeqQuest is a code to do electronic structure calculations within the density functional approximation, using pseudopotentials and contracted Gaussian basis sets. SeqQuest is written in a very vanilla, portable Fortran 77. SeqQuest has a very "flat" structure, with a single main program that does very little computation itself. The main program manages memory and program flow, and calls a sequence of (shallow) subroutines to do the computational work. Data is communicated via passed argument lists, not common blocks. Those familiar with the BLAS/LAPACK libraries will recognize the style. There is one large central workspace in the main program, wk(maxwkd), from which all significant storage is taken; maxwkd is set in the parameter file. There is one main input file, one main user output file, and a multitude of larger temporary binary files that the program uses to do its work.

The Laws

  1. Consistency and clarity of code are paramount. Only on very rare occasions can clarity be subordinated to other considerations. You will not see those occasions.
  2. Conform to the existing style in the code, not your own, not the latest fad in computer science. Consistency is the key to the long-term viability of any code. Mixing styles is fatal.
  3. Clean up any new code. Changes are NOT done when they work, they are done when they work AND they are clean.
  4. The language is FORTRAN 77, and a very vanilla f77.
  5. Variable declarations will be consistent with IMPLICIT DOUBLE PRECISION (a-h,o-z). Apologies to all the strong typing types, but that's the rules. All integer variable names will begin with i-n, and no variables beginning with i-n will be reals. All real variable names will begin with a-h,o-z, and no variables beginning with a-h,o-z will be integers. Use of IMPLICIT NONE is permitted/encouraged in new routines, but all variable types will conform to the above naming conventions.
  6. NO include statements. SeqQuest has only one include statement, in the main program to bring in dimension parameters, and there will be no other include statements anywhere in the code.
  7. NO creation of O(N**2) arrays anywhere in the program, where the O(N) parameters are: natmd, norbd, nkd, nlatd. Space for all such large arrays will be taken from the central workspace in the main program, an array called wk().
  8. Original declaration of O(N) arrays occurs in the main program only. None of the dimensioning parameters in the parameter file may be used to create arrays in any subroutine, i.e., all O(N) arrays used in subroutines must be passed in from the main program.
  9. NO new common blocks. Data is to be communicated to subroutines via passed arguments.
  10. NO file unit numbers done by explicit integers: all file access shall be via integer variables. E.g. "write(6,fmt)" must be replaced by "write(IWR,fmt)", where IWR is an integer variable that has been set to refer to this file.
  11. NO direct use of Fortran OPEN or CLOSE. All file manipulation will be done using FLxxxx routines (that manage unit numbers).
  12. NO passed real constants. E.g., "call FOO(1.0)" should be written as "call FOO( one )" where "one" is a real variable set earlier in the code. Passing integers is ok.
  13. NO statement functions, beyond the few that already exist.
  14. Comments only on lines beginning with "c", not "!" or other funky characters. White space should begin with "c", i.e. no completely blank lines within a routine.
  15. NO tabs; indenting is done using spaces.
  16. Test thoroughly.
  17. Only those who write the rules can change the rules.
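
Several of these laws are easiest to absorb from a small example. The following is a minimal sketch, not code from SeqQuest: the program LAWDEM, the routine VECSET, and the array zatm are hypothetical names invented for illustration, and unit 6 as standard output is an assumption. It shows implicit typing by first letter (Law 5), an O(N) array declared only in the main program and passed in (Law 8), a file unit held in an integer variable (Law 10), a real constant passed by name (Law 12), and "c" comment lines in place of blank lines (Law 14).

```fortran
      program lawdem
c
c Hypothetical demo of Laws 5, 8, 10, 12, and 14; not SeqQuest code.
c
      implicit double precision (a-h,o-z)
c
c Dimensioning parameter: declares arrays in the main program only
      parameter  ( natmd = 4 )
      dimension  zatm(natmd)
c
c File unit lives in an integer variable, never a literal. Unit 6
c as standard output is an assumption that holds on most compilers.
      IWR = 6
c
c Real constants are set once and then passed by name
      one = 1.d0
      natm = 3
c
      call VECSET( natm, one, zatm, zsum )
c
      write(IWR,9000) natm, zsum
 9000 format(1x,'natm=',i4,'  zsum=',f10.4)
c
      stop
      end
c
      subroutine VECSET( n, val, vec, vsum )
c
c Purpose: set first n elements of vec to val, return sum in vsum
c
      implicit double precision (a-h,o-z)
      dimension  vec(n)
c
      vsum = 0.d0
      do 100 i=1,n
        vec(i) = val
        vsum = vsum + vec(i)
  100 continue
c
      return
      end
```

Note that VECSET never sees natmd: the array arrives through the argument list with its working length, which is the BLAS/LAPACK-like pattern the Overview describes.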

The Basics

Many of the conventions described in this section will be apparent from inspecting the code itself. Hence, the best approach is, to the extent possible, to use existing code as a template/model for new development. This section will highlight some of the less obvious conventions, and reiterate the more important ones. Details will be expanded on in the next section.

Conventions

Units

The program does its internal work in:

Main program

The main program concentrates on memory management and flow control. It calls a sequence of subroutines, controlled by various flags, and manages the memory required by the large arrays used by SeqQuest. The main routine is special, and looks very different from any subroutine.

Memory

The principal limiting factor in SeqQuest is usually the amount of memory needed to run a problem, rather than the amount of time it takes to run a problem. I.e., SeqQuest is more memory-bound than cpu-bound. Hence, the use of memory is very tightly controlled in SeqQuest.

All large-scale memory is taken from a single large workspace in wk(maxwkd), where maxwkd is a parameter set in the main parameter file, using pointer-like integers. The routine WKMEM is the most important routine in the code: it partitions memory within wk(), and checks for memory sufficiency FOR THE ENTIRE CODE. Hence, WKMEM should be consulted before attempting any use of space within wk().

The wk() array is sectioned into pieces at the beginning of the code by pointer-like integer variables i01-i12, with spacing dictated by the size of big arrays. The first four, i01-i04, reserve enough space for either orbital matrices (nmat=nk*norb**2) or grid fields (nptr=n1r*n2r*n3r). (NB: this assumption is subject to change, check WKMEM for latest). The last eight spaces, i05-i12, only guarantee enough space for grid fields. Each of these spaces may be used as temporary space, and the intermittent comments "MEM" enumerate the contents of all the active "pointers" at that point in the code.
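
The partitioning idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the actual WKMEM logic: the program WKDEMO, the routine FILLV, the variable names nbig and iend, the toy problem sizes, and the use of only three slots are all hypothetical. Two slots are sized to hold either an orbital matrix or a grid field, one more holds a grid field only, and sufficiency is checked before wk() is touched.

```fortran
      program wkdemo
c
c Hypothetical sketch of carving up wk() with pointer-like integers.
c Sizes are toy values; the real layout is whatever WKMEM says.
c
      implicit double precision (a-h,o-z)
      parameter  ( maxwkd = 2000 )
      dimension  wk(maxwkd)
c
      IWR = 6
      zero = 0.d0
c
c Toy problem sizes: nmat = nk*norb**2, nptr = n1r*n2r*n3r
      nk = 2
      norb = 5
      n1r = 4
      n2r = 4
      n3r = 4
      nmat = nk*norb**2
      nptr = n1r*n2r*n3r
c
c Two big slots hold a matrix OR a grid field; a third slot is
c only guaranteed to hold a grid field
      nbig = max( nmat, nptr )
      i01 = 1
      i02 = i01 + nbig
      i03 = i02 + nbig
      iend = i03 + nptr - 1
c
c Check memory sufficiency before touching wk()
      if( iend .gt. maxwkd )then
        write(IWR,9010) iend, maxwkd
        stop
      endif
c
c MEM: i03 holds a grid field; i01,i02 are free
      call FILLV( nptr, zero, wk(i03) )
c
      write(IWR,9000) i01, i02, i03, iend
 9000 format(1x,'i01,i02,i03,iend=',4i6)
 9010 format(1x,'wk too small: need',i8,' have',i8)
c
      stop
      end
c
      subroutine FILLV( n, val, vec )
c
c Purpose: set first n elements of vec to val
c
      implicit double precision (a-h,o-z)
      dimension  vec(n)
c
      do 100 i=1,n
        vec(i) = val
  100 continue
c
      return
      end
```

The worker routine receives wk(i03) as an ordinary array of length nptr, so it needs no knowledge of the workspace or of maxwkd.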

File I/O

The user listing output file should be self-documenting. It is not just the source code, but the output file the user sees which needs to have documentation in order to be readable and understandable by humans.

I/O to all files is carefully structured to try to reduce the size of files, and to make data more easily accessible to the program without churning up disk. The FLxxxx routines have been set up to manage files. They allocate available (free) unit numbers (unit numbers are not to be hard-coded!), connect to the correct directory structure (as needed), and complete the file names.
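
To make the idea of allocating a free unit number concrete, here is a minimal sketch using the standard Fortran 77 INQUIRE statement. The program UNDEMO and the routine GETUN are hypothetical names invented for this illustration and are NOT the actual FLxxxx interfaces; the search range 10-99 is likewise an assumption.

```fortran
      program undemo
c
c Hypothetical sketch of unit-number management; the names here
c are illustrative and are NOT the actual FLxxxx interfaces.
c
      implicit double precision (a-h,o-z)
c
      IWR = 6
      call GETUN( iunit )
      write(IWR,9000) iunit
 9000 format(1x,'free unit found:',i4)
c
      stop
      end
c
      subroutine GETUN( iunit )
c
c Purpose: return lowest unit number in 10-99 not currently
c          connected to a file, or -1 if none is free
c
      implicit double precision (a-h,o-z)
      logical    lopen
c
      iunit = -1
      do 100 iu=10,99
        inquire( unit=iu, opened=lopen )
        if( .not. lopen )then
          iunit = iu
          goto 200
        endif
  100 continue
  200 continue
c
      return
      end
```

In real development the existing FLxxxx routines should be called instead; the sketch only shows why calling code never needs to know a literal unit number.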

Variable names

Consistent naming conventions are important to being able to follow code easily. The style used in SeqQuest is rather old-fashioned fortran: short but descriptive names, with few underscores and using lower case (very specific exceptions are all upper case). The virtue is that it leads to code that, while quite dense, is rather readable. In general, single character or double character variable names are used only as very local temporary variables, such as the loop index in a very short do-loop (a notable exception is that "i" and "j" are used frequently to index basis functions in the big orbital matrices). Keep names short, to preserve code-space on a fortran 72-character line, and to conform to naming conventions used in other routines.

Upper/lower case

Subroutines/function calls

Routine structure

Subroutine and function source code is very highly structured. Use an existing routine as a template if building a new routine, as it is easier to ensure conformance to style that way than to build on the basis of what will be necessarily incomplete instructions about style. The idea is to make all the routines look as similar to one another as possible, and, therefore, by inspection, notice bugs because patterns are violated. Deviation from any pattern should be done only with compelling reason, and should be documented.

Routine families

Certain sets of subroutines have similar internal structure, and code is written to emphasize the similarities where possible. Two major distinct families exist. First, the analytic "two-center" (SIJ=overlap, TIJ=kinetic, FRC2CTR=forces) and "three-center" routines (VLOCMAT, VLOCMII, VLOCFRC, NLOCMAT, NLOCFRC) for local and non-local integrals share much internal structure. For example, label numbers are used to emphasize the similarities between routines rather than to have strict numerical ordering within routines. Second, the grid matrix element routines (GRDOVLP, GRIDRHO, VSLOMAT, VSLOFRC, ESLOFRC) share much internal structure. There are smaller sets of similar routines. If developing a new routine, try to follow the model of an existing routine in a family.

Loops

Looping should be done through do-loops, rather than do-while or if/goto constructs or other code constructs.
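
As a small illustration (a sketch with hypothetical names LPDEMO, vec, and total, not taken from SeqQuest), the preferred form is a counted do-loop closed by a labeled continue. The hand-built if/goto equivalent is shown only in comments, as the form to avoid.

```fortran
      program lpdemo
c
c Hypothetical sketch contrasting loop forms; not SeqQuest code.
c
      implicit double precision (a-h,o-z)
      dimension  vec(5)
c
      IWR = 6
      zero = 0.d0
      n = 5
c
c Preferred: counted do-loop closed by a labeled continue
      total = zero
      do 100 i=1,n
        vec(i) = dble( i )
        total = total + vec(i)
  100 continue
c
c Avoid: the same loop hand-built from if/goto, e.g.
c       i = 1
c   110 if( i .gt. n ) goto 120
c         vec(i) = dble( i )
c         total = total + vec(i)
c         i = i + 1
c       goto 110
c   120 continue
c
      write(IWR,9000) total
 9000 format(1x,'total=',f8.2)
c
      stop
      end
```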

Labels

Use of labels is highly conventional, and there are some special numbers to respect. Code clarity is the goal. "Big numbers" such as 1000, 2000 or 100, 200, etc., should be used to denote "important" branch points or do-loops, with smaller increments for less important loops/branch points. Use labels to highlight code, not simply to denote sequence.
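
One way to read this convention is sketched below, with the caveat that the specific label values, the program LBDEMO, and the array frc are hypothetical choices for illustration; the "special numbers" used in SeqQuest itself are not reproduced here. The major loop over atoms takes a round label, and the minor loop nested inside it takes a smaller increment within that range.

```fortran
      program lbdemo
c
c Hypothetical sketch of hierarchical label numbering; the label
c values are illustrative, not SeqQuest's reserved numbers.
c
      implicit double precision (a-h,o-z)
      parameter  ( natmd = 4 )
      dimension  frc(3,natmd)
c
      IWR = 6
      zero = 0.d0
      natm = 4
      fsum = zero
c
c Major loop over atoms: a "big" round label
      do 1000 iatm=1,natm
c
c Minor loop over components: a smaller step inside that range
        do 1100 ic=1,3
          frc(ic,iatm) = dble( ic*iatm )
          fsum = fsum + frc(ic,iatm)
 1100   continue
c
 1000 continue
c
      write(IWR,9000) fsum
 9000 format(1x,'fsum=',f8.2)
c
      stop
      end
```

A reader scanning the label column can then pick out the important loop boundaries at a glance, which is the point of using labels to highlight rather than merely to sequence.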

Indenting

Consistent indenting leads to more readable code. Everyone has their own conventions; the convention for SeqQuest is as follows:

Space

Use of space in source code is highly conventional, and is designed for ease in reading code. While not strictly obeyed, the conventions are followed rather closely barring some compelling reason not to. The following lists common cases, but as always, it is better to inspect existing code and conform to conventions seen in the code.
Send questions and comments to: Peter Schultz at paschul@sandia.gov
Last updated: December 16, 2011