Method Commands

Method Commands Table of Contents

Method Description

The method section in a DAKOTA input file specifies the name and controls of an iterator. The terms "method" and "iterator" can be used interchangeably, although method often refers to an input specification whereas iterator usually refers to an object within the Iterator hierarchy. A method specification, then, is used to select an iterator from the iterator hierarchy, which includes optimization, uncertainty quantification, least squares, design of experiments, and parameter study iterators (see the Users Manual for more information on these iterator branches). This iterator may be used alone or in combination with other iterators as dictated by the strategy specification (refer to Strategy Commands for strategy command syntax and to the Users Manual for strategy algorithm descriptions).

Several examples follow. The first example shows a minimal specification for an optimization method.

method,
	dot_sqp
This example uses all of the defaults for this method.

A more sophisticated example would be

method,
	id_method = 'NLP1'
	model_pointer = 'M1'
	dot_sqp
	  max_iterations = 50
	  convergence_tolerance = 1e-4
	  output verbose
	  optimization_type minimize
This example demonstrates the use of identifiers and pointers (see Method Independent Controls) as well as some method independent and method dependent controls for the sequential quadratic programming (SQP) algorithm from the DOT library. The max_iterations, convergence_tolerance, and output settings are method independent controls, in that they are defined for a variety of methods (see DOT method independent controls for DOT usage of these controls). The optimization_type control is a method dependent control, in that it is only meaningful for DOT methods (see DOT method dependent controls).

The next example shows a specification for a least squares method.

method,
	optpp_g_newton
	  max_iterations = 10
	  convergence_tolerance = 1.e-8
	  search_method trust_region
	  gradient_tolerance = 1.e-6
Some of the same method independent controls are present along with a new set of method dependent controls (search_method and gradient_tolerance) which are only meaningful for OPT++ methods (see OPT++ method dependent controls).

The next example shows a specification for a nondeterministic method with several method dependent controls (refer to Nondeterministic sampling method).

method,
	nond_sampling
	  samples = 100	seed = 12345
	  sample_type lhs
	  response_levels = 1000. 500.

The last example shows a specification for a parameter study method where, again, each of the controls are method dependent (refer to Vector parameter study).

method,
	vector_parameter_study
	  step_vector = 1. 1. 1.
	  num_steps = 10

Method Specification

As alluded to in the examples above, the method specification has the following structure:
method,
	<method independent controls>
	<method selection>
	  <method dependent controls>

where <method selection> is one of the following: dot_frcg, dot_mmfd, dot_bfgs, dot_slp, dot_sqp, conmin_frcg, conmin_mfd, npsol_sqp, nlssol_sqp, nlpql_sqp, nl2sol, reduced_sqp, optpp_cg, optpp_q_newton, optpp_fd_newton, optpp_g_newton, optpp_newton, optpp_pds, asynch_pattern_search, coliny_cobyla, coliny_direct, coliny_pattern_search, coliny_solis_wets, coliny_ea, moga, soga, nond_polynomial_chaos, nond_sampling, nond_local_reliability, nond_global_reliability, nond_evidence, dace, fsu_quasi_mc, fsu_cvt, vector_parameter_study, list_parameter_study, centered_parameter_study, or multidim_parameter_study.

The <method independent controls> are those controls which are valid for a variety of methods. In some cases, these controls are abstractions which may have slightly different implementations from one method to the next. The <method dependent controls> are those controls which are only meaningful for a specific method or library. Referring to dakota.input.txt, the method independent controls are those controls defined externally from and prior to the method selection blocks. They are all optional. The method selection blocks are all required group specifications separated by logical OR's. The method dependent controls are those controls defined within the method selection blocks. Defaults for method independent and method dependent controls are defined in DataMethod. The following sections provide additional detail on the method independent controls followed by the method selections and their corresponding method dependent controls.

Method Independent Controls

The method independent controls include a method identifier string, a model type specification with pointers to variables, interface, and responses specifications, a speculative gradient selection, an output verbosity control, maximum iteration and function evaluation limits, constraint and convergence tolerance specifications, a scaling selection, and a set of linear inequality and equality constraint specifications. While each of these controls is not valid for every method, the controls are valid for enough methods that it was reasonable to pull them out of the method dependent blocks and consolidate the specifications.

The method identifier string is supplied with id_method and is used to provide a unique identifier string for use with strategy specifications (refer to Strategy Description). It is appropriate to omit a method identifier string if only one method is included in the input file and single_method is the selected strategy (all other strategies require one or more method pointers), since the single method to use is unambiguous in this case.

The model pointer string is specified with model_pointer and is used to identify the model used to perform function evaluations for the method. If a model pointer string is specified and no corresponding id is available, DAKOTA will exit with an error message. If no model pointer string is specified, then the last model specification parsed will be used. If no model pointer string is specified and no model specification is provided by the user, then a default model specification is used (similar to the default strategy specification, see Strategy Description). This default model specification is of type single with no variables_pointer, interface_pointer, or responses_pointer (see Single Model Controls). It is appropriate to omit a model specification whenever the relationships are unambiguous due to the presence of single variables, interface, and responses specifications.

When performing gradient-based optimization in parallel, speculative gradients can be selected to address the load imbalance that can occur between gradient evaluation and line search phases. In a typical gradient-based optimization, the line search phase consists primarily of evaluating the objective function and any constraints at a trial point, and then testing the trial point for a sufficient decrease in the objective function value and/or constraint violation. If a sufficient decrease is not observed, then one or more additional trial points may be attempted sequentially. However, if the trial point is accepted then the line search phase is complete and the gradient evaluation phase begins. By speculating that the gradient information associated with a given line search trial point will be used later, additional coarse grained parallelism can be introduced by computing the gradient information (either by finite difference or analytically) in parallel, at the same time as the line search phase trial-point function values. This balances the total amount of computation to be performed at each design point and allows for efficient utilization of multiple processors. While the total amount of work performed will generally increase (since some speculative gradients will not be used when a trial point is rejected in the line search phase), the run time will usually decrease (since gradient evaluations needed at the start of each new optimization cycle were already performed in parallel during the line search phase). Refer to [Byrd et al., 1998] for additional details. The speculative specification is implemented for the gradient-based optimizers in the DOT, CONMIN, and OPT++ libraries, and it can be used with dakota numerical or analytic gradient selections in the responses specification (refer to Gradient Specification for information on these specifications). It should not be selected with vendor numerical gradients since vendor internal finite difference algorithms have not been modified for this purpose. In full-Newton approaches, the Hessian is also computed speculatively. NPSOL and NLSSOL do not support speculative gradients, as their gradient-based line search in user-supplied gradient mode (dakota numerical or analytic gradients) is a superior approach for load-balanced parallel execution.

Output verbosity control is specified with output followed by silent, quiet, verbose or debug. If there is no user specification for output verbosity, then the default setting is normal. This gives a total of five output levels to manage the volume of data that is returned to the user during the course of a study, ranging from full run annotation plus internal debug diagnostics (debug) to the bare minimum of output containing little more than the total number of simulations performed and the final solution (silent). Output verbosity is observed within the Iterator (algorithm verbosity), Model (synchronize/fd_gradients verbosity), Interface (map/synch verbosity), Approximation (global data fit coefficient reporting),and AnalysisCode (file operation reporting) class hierarchies; however, not all of these software components observe the full granularity of verbosity settings. Specific mappings are as follows:

Note that iterators and interfaces utilize the full granularity in verbosity, whereas models, approximations, and file operations do not. With respect to iterator verbosity, different iterators implement this control in slightly different ways (as described below in the method independent controls descriptions for each iterator), however the meaning is consistent. For models, interfaces, approximations, and file operations, quiet suppresses parameter and response set reporting and silent further suppresses function evaluation headers and scheduling output. Similarly, verbose adds file management, approximation evaluation, and global approximation coefficient details, and debug further adds diagnostics from nonblocking schedulers.

The constraint_tolerance specification determines the maximum allowable value of infeasibility that any constraint in an optimization problem may possess and still be considered to be satisfied. It is specified as a positive real value. If a constraint function is greater than this value then it is considered to be violated by the optimization algorithm. This specification gives some control over how tightly the constraints will be satisfied at convergence of the algorithm. However, if the value is set too small the algorithm may terminate with one or more constraints being violated. This specification is currently meaningful for the NPSOL, NLSSOL, DOT and CONMIN constrained optimizers (refer to DOT method independent controls and NPSOL method independent controls).

The convergence_tolerance specification provides a real value for controlling the termination of iteration. In most cases, it is a relative convergence tolerance for the objective function; i.e., if the change in the objective function between successive iterations divided by the previous objective function is less than the amount specified by convergence_tolerance, then this convergence criterion is satisfied on the current iteration. Since no progress may be made on one iteration followed by significant progress on a subsequent iteration, some libraries require that the convergence tolerance be satisfied on two or more consecutive iterations prior to termination of iteration. This control is used with optimization and least squares iterators (DOT, CONMIN, NPSOL, NLSSOL, OPT++, and Coliny) and is not used within the uncertainty quantification, design of experiments, or parameter study iterator branches. Refer to DOT method independent controls, NPSOL method independent controls, OPT++ method independent controls, and Coliny method independent controls for specific interpretations of the convergence_tolerance specification.

The max_iterations and max_function_evaluations controls provide integer limits for the maximum number of iterations and maximum number of function evaluations, respectively. The difference between an iteration and a function evaluation is that a function evaluation involves a single parameter to response mapping through an interface, whereas an iteration involves a complete cycle of computation within the iterator. Thus, an iteration generally involves multiple function evaluations (e.g., an iteration contains descent direction and line search computations in gradient-based optimization, population and multiple offset evaluations in nongradient-based optimization, etc.). This control is not currently used within the uncertainty quantification, design of experiments, and parameter study iterator branches, and in the case of optimization and least squares, does not currently capture function evaluations that occur as part of the method_source dakota finite difference routine (since these additional evaluations are intentionally isolated from the iterators).

Continuous design variable, function, and constraint scaling can be turned on for optimizers and least squares minimizers by providing the scaling keyword. Discrete variable scaling is not supported. When scaling is enabled, variables, functions, gradients, Hessians, etc., are transformed such that the optimizer iterates in scaled variable space, whereas evaluations of the computational model as specified in the interface are performed on the original problem scale. Therefore using scaling does not require rewriting the interface to the simulation code. The user may specify no, one, or a vector of scaling type strings through each of the scale_types (see Variables Commands); objective_function_scale_types, least_squares_term_scale_types, nonlinear_inequality_scale_types, nonlinear_equality_scale_types (see Function Specification); linear_inequality_scale_types, and linear_equality_scale_types (see Method Independent Controls below) specifications. Valid options for types include 'none' (default), 'value', 'auto', or 'log', for no, characteristic value, automatic, or logarithmic scaling, respectively, although not all types are valid for scaling all entities (see the references for details). If a single string is specified using any of these keywords it will apply to each component of the relevant vector, e.g., scale_types = 'value' will enable characteristic value scaling for each continuous design variable. The user may specify no, one, or a vector of nonzero characteristic scale values through each of the scales (see Variables Commands); objective_function_scales, least_squares_term_scales, nonlinear_inequality_scales, nonlinear_equality_scales (see Function Specification); linear_inequality_scales, and linear_equality_scales (see Method Independent Controls below) specifications. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto' and 'log'. If a single value is specified using any of these keywords it will apply to each component of the relevant vector, e.g., scales = 3.0 will apply a characteristic scaling value of 3.0 to each continuous design variable. When the scaling keyword is omitted, all _scale_types and *_scales specifications are ignored in the method, variables, and responses sections.

When scaling is enabled, the following procedures determine the transformations used to scale each component of a variables or response vector. In all cases, if scaling would result in division by a value smaller in magnitude than 1.0e-3, a warning is issued and no scaling performed for that component.

Table 5.1 provides the specification detail for the method independent controls involving identifiers, pointers, tolerances, limits, output verbosity, speculative gradients, and scaling.

Table 5.1 Specification detail for the method independent controls: identifiers, pointers, tolerances, limits, output verbosity, speculative gradients, and scaling
Description Keyword Associated Data Status Default
Method set identifier id_method string Optional strategy use of last method parsed
Model pointer model_pointer string Optional method use of last model parsed (or use of default model if none parsed)
Speculative gradients and Hessians speculative none Optional no speculation
Output verbosity output silent | quiet | verbose | debug Optional normal
Maximum iterations max_iterations integer Optional optimization/least squares: 100, fsu_cvt: 30, nond_local_reliability: 10
Maximum function evaluations max_function_evaluations integer Optional 1000
Constraint tolerance constraint_tolerance real Optional Library default
Convergence tolerance convergence_tolerance real Optional 1.e-4
Scaling flag scaling none Optional no scaling

Linear inequality constraints can be supplied with the linear_inequality_constraint_matrix, linear_inequality_lower_bounds, and linear_inequality_upper_bounds specifications, and linear equality constraints can be supplied with the linear_equality_constraint_matrix and linear_equality_targets specifications. In the inequality case, the constraint matrix provides coefficients for the variables and the lower and upper bounds provide constraint limits for the following two-sided formulation:

\[a_l \leq Ax \leq a_u\]

As with nonlinear inequality constraints (see Objective and constraint functions (optimization data set)), the default linear inequality constraint bounds are selected so that one-sided inequalities of the form

\[Ax \leq 0.0\]

result when there are no user bounds specifications (this provides backwards compatibility with previous DAKOTA versions). In a user bounds specification, any upper bound values greater than +bigRealBoundSize (1.e+30, as defined in Minimizer) are treated as +infinity and any lower bound values less than -bigRealBoundSize are treated as -infinity. This feature is commonly used to drop one of the bounds in order to specify a 1-sided constraint (just as the default lower bounds drop out since -DBL_MAX < -bigRealBoundSize). In the equality case, the constraint matrix again provides coefficients for the variables and the targets provide the equality constraint right hand sides:

\[Ax = a_t\]

and the defaults for the equality constraint targets enforce a value of 0. for each constraint

\[Ax = 0.0\]

Currently, DOT, CONMIN, NPSOL, NLSSOL, and OPT++ all support specialized handling of linear constraints (either directly through the algorithm itself or indirectly through the DAKOTA wrapper). Coliny optimizers will support linear constraints in future releases. Linear constraints need not be computed by the user's interface on every function evaluation; rather the coefficients, bounds, and targets of the linear constraints can be provided at start up, allowing the optimizers to track the linear constraints internally. It is important to recognize that linear constraints are those constraints that are linear in the design variables, e.g.:

\[0.0 \leq 3x_1 - 4x_2 + 2x_3 \leq 15.0\]

\[x_1 + x_2 + x_3 \geq 2.0\]

\[x_1 + x_2 - x_3 = 1.0\]

which is not to be confused with something like

\[s(X) - s_{fail} \leq 0.0\]

where the constraint is linear in a response quantity, but may be a nonlinear implicit function of the design variables. For the three linear constraints above, the specification would appear as:

linear_inequality_constraint_matrix =  3.0 -4.0  2.0
                                       1.0  1.0  1.0
linear_inequality_lower_bounds =       0.0  2.0
linear_inequality_upper_bounds =      15.0  1.e+50
linear_equality_constraint_matrix =    1.0  1.0 -1.0
linear_equality_targets =              1.0
where the 1.e+50 is a dummy upper bound value which defines a 1-sided inequality since it is greater than bigRealBoundSize. The constraint matrix specifications list the coefficients of the first constraint followed by the coefficients of the second constraint, and so on. They are divided into individual constraints based on the number of design variables, and can be broken onto multiple lines for readability as shown above.

The linear_inequality_scale_types and linear_equality_scale_types specifications provide strings specifying the scaling type for each linear inequality or equality constraint, respectively, in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in linear_*_scale_types may be selected from 'none', 'value', or 'auto' to select no, characteristic value, or automatic scaling, respectively. If a single string is specified it will apply to each constraint component. Each entry in linear_inequality_scales or linear_equality_scales may be a user-specified nonzero characteristic value to be used in scaling each constraint. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto'. If a single real value is specified it will apply to all components of the constraint. Scaling for linear constraints is applied after any continuous variable scaling. For example, for variable scaling on continuous design variables x:

\[ \tilde{x}^j = \frac{x^j - x^j_O}{x^j_M} \]

we have the following system for linear inequality constraints

\[ a_L \leq A_i x \leq a_U \]

\[ a_L \leq A_i \left( \mathrm{diag}(x_M) \tilde{x} + x_O \right) \leq a_U \]

\[ a_L - A_i x_O \leq A_i \mathrm{diag}(x_M) \tilde{x} \leq a_U - A_i x_O \]

\[ \tilde{a}_L \leq \tilde{A}_i \tilde{x} \leq \tilde{a}_U \]

and user-specified or automatically computed scaling multipliers are appplied to this final transformed system, which accounts for continuous design variable scaling. When automatic scaling is in use for linear constraints they are linearly scaled by a computed characteristic value, but not affinely to [0,1].

Table 5.2 provides the specification detail for the method independent controls involving linear constraints.

Table 5.2 Specification detail for the method independent controls: linear inequality and equality constraints
Description Keyword Associated Data Status Default
Linear inequality coefficient matrix linear_inequality_constraint_matrix list of reals Optional no linear inequality constraints
Linear inequality lower bounds linear_inequality_lower_bounds list of reals Optional vector values = -DBL_MAX
Linear inequality upper bounds linear_inequality_upper_bounds list of reals Optional vector values = 0.
Linear inequality scaling types linear_inequality_scale_types list of strings Optional vector values = 'none'
Linear inequality scales linear_inequality_scales list of reals Optional vector values = 1. (no scaling)
Linear equality coefficient matrix linear_equality_constraint_matrix list of reals Optional no linear equality constraints
Linear equality targets linear_equality_targets list of reals Optional vector values = 0.
Linear equality scaling types linear_equality_scale_types list of strings Optional vector values = 'none'
Linear equality scales linear_equality_scales list of reals Optional vector values = 1. (no scaling)

DOT Methods

The DOT library [Vanderplaats Research and Development, 1995] contains nonlinear programming optimizers, specifically the Broyden-Fletcher-Goldfarb-Shanno (DAKOTA's dot_bfgs method) and Fletcher-Reeves conjugate gradient (DAKOTA's dot_frcg method) methods for unconstrained optimization, and the modified method of feasible directions (DAKOTA's dot_mmfd method), sequential linear programming (DAKOTA's dot_slp method), and sequential quadratic programming (DAKOTA's dot_sqp method) methods for constrained optimization. DAKOTA provides access to the DOT library through the DOTOptimizer class.

DOT method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during a DOT optimization. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. This convergence criterion must be satisfied for two consecutive iterations before DOT will terminate. The constraint_tolerance specification defines how tightly constraint functions are to be satisfied at convergence. The default value for DOT constrained optimizers is 0.003. Extremely small values for constraint_tolerance may not be attainable. The output verbosity specification controls the amount of information generated by DOT: the silent and quiet settings result in header information, final results, and objective function, constraint, and parameter information on each iteration; whereas the verbose and debug settings add additional information on gradients, search direction, one-dimensional search results, and parameter scaling factors. DOT contains no parallel algorithms which can directly take advantage of concurrent evaluations. However, if numerical_gradients with method_source dakota is specified, then the finite difference function evaluations can be performed concurrently (using any of the parallel modes described in the Users Manual). In addition, if speculative is specified, then gradients (dakota numerical or analytic gradients) will be computed on each line search evaluation in order to balance the load and lower the total run time in parallel optimization studies. Lastly, specialized handling of linear constraints is supported with DOT; linear constraint coefficients, bounds, and targets can be provided to DOT at start-up and tracked internally. Specification detail for these method independent controls is provided in Tables 5.1 through 5.2.

DOT method dependent controls

DOT's only method dependent control is optimization_type which may be either minimize or maximize. DOT provides the only set of methods within DAKOTA which support this control; to convert a maximization problem into the minimization formulation assumed by other methods, simply change the sign on the objective function (i.e., multiply by -1). Table 5.3 provides the specification detail for the DOT methods and their method dependent controls.

Table 5.3 Specification detail for the DOT methods
Description Keyword Associated Data Status Default
Optimization type optimization_type minimize | maximize Optional group minimize

NPSOL Method

The NPSOL library [Gill et al., 1986] contains a sequential quadratic programming (SQP) implementation (the npsol_sqp method). SQP is a nonlinear programming optimizer for constrained minimization. DAKOTA provides access to the NPSOL library through the NPSOLOptimizer class.

NPSOL method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major SQP iterations and the number of function evaluations that can be performed during an NPSOL optimization. The convergence_tolerance control defines NPSOL's internal optimality tolerance which is used in evaluating if an iterate satisfies the first-order Kuhn-Tucker conditions for a minimum. The magnitude of convergence_tolerance approximately specifies the number of significant digits of accuracy desired in the final objective function (e.g., convergence_tolerance = 1.e-6 will result in approximately six digits of accuracy in the final objective function). The constraint_tolerance control defines how tightly the constraint functions are satisfied at convergence. The default value is dependent upon the machine precision of the platform in use, but is typically on the order of 1.e-8 for double precision computations. Extremely small values for constraint_tolerance may not be attainable. The output verbosity setting controls the amount of information generated at each major SQP iteration: the silent and quiet settings result in only one line of diagnostic output for each major iteration and print the final optimization solution, whereas the verbose and debug settings add additional information on the objective function, constraints, and variables at each major iteration.

NPSOL is not a parallel algorithm and cannot directly take advantage of concurrent evaluations. However, if numerical_gradients with method_source dakota is specified, then the finite difference function evaluations can be performed concurrently (using any of the parallel modes described in the Users Manual). An important related observation is the fact that NPSOL uses two different line searches depending on how gradients are computed. For either analytic_gradients or numerical_gradients with method_source dakota, NPSOL is placed in user-supplied gradient mode (NPSOL's "Derivative Level" is set to 3) and it uses a gradient-based line search (the assumption is that user-supplied gradients are inexpensive). On the other hand, if numerical_gradients are selected with method_source vendor, then NPSOL is computing finite differences internally and it will use a value-based line search (the assumption is that finite differencing on each line search evaluation is too expensive). The ramifications of this are: (1) performance will vary between method_source dakota and method_source vendor for numerical_gradients, and (2) gradient speculation is unnecessary when performing optimization in parallel since the gradient-based line search in user-supplied gradient mode is already load balanced for parallel execution. Therefore, a speculative specification will be ignored by NPSOL, and optimization with numerical gradients should select method_source dakota for load balanced parallel operation and method_source vendor for efficient serial operation.

Lastly, NPSOL supports specialized handling of linear inequality and equality constraints. By specifying the coefficients and bounds of the linear inequality constraints and the coefficients and targets of the linear equality constraints, this information can be provided to NPSOL at initialization and tracked internally, removing the need for the user to provide the values of the linear constraints on every function evaluation. Refer to Method Independent Controls for additional information and to Tables 5.1 through 5.2 for method independent control specification detail.

NPSOL method dependent controls

NPSOL's method dependent controls are verify_level, function_precision, and linesearch_tolerance. The verify_level control instructs NPSOL to perform finite difference verifications on user-supplied gradient components. The function_precision control provides NPSOL an estimate of the accuracy to which the problem functions can be computed. This is used to prevent NPSOL from trying to distinguish between function values that differ by less than the inherent error in the calculation. And the linesearch_tolerance setting controls the accuracy of the line search. The smaller the value (between 0 and 1), the more accurately NPSOL will attempt to compute a precise minimum along the search direction. Table 5.4 provides the specification detail for the NPSOL SQP method and its method dependent controls.

Table 5.4 Specification detail for the NPSOL SQP method
Description Keyword Associated Data Status Default
Gradient verification level verify_level integer Optional -1 (no gradient verification)
Function precision function_precision real Optional 1.e-10
Line search tolerance linesearch_tolerance real Optional 0.9 (inaccurate line search)

NLPQL Methods

The NLPQL library is a commercially-licensed library containing a sequential quadratic programming (SQP) optimizer, specified as DAKOTA's nlpql_sqp method, for constrained optimization. The particular implementation used is NLPQLP [Schittkowski, 2004], a variant with distributed and non-monotone line search. DAKOTA provides access to the NLPQL library through the NLPQLPOptimizer class.

NLPQL method independent controls

The method independent controls for maximum iterations and output verbosity are mapped to NLPQL controls MAXIT and IPRINT, respectively. The maximum number of function evaluations is enforced within the NLPQLPOptimizer class.

NLPQL method dependent controls

NLPQL does not currently support any method dependent controls.

CONMIN Methods

The CONMIN library [Vanderplaats, 1973] is a public domain library of nonlinear programming optimizers, specifically the Fletcher-Reeves conjugate gradient (DAKOTA's conmin_frcg method) method for unconstrained optimization, and the method of feasible directions (DAKOTA's conmin_mfd method) for constrained optimization. As CONMIN was a predecessor to the DOT commercial library, the algorithm controls are very similar. DAKOTA provides access to the CONMIN library through the CONMINOptimizer class.

CONMIN method independent controls

The interpretations of the method independent controls for CONMIN are essentially identical to those for DOT. Therefore, the discussion in DOT method independent controls is relevant for CONMIN.

CONMIN method dependent controls

CONMIN does not currently support any method dependent controls.

OPT++ Methods

The OPT++ library [Meza et al., 2007] contains primarily gradient-based nonlinear programming optimizers for unconstrained, bound-constrained, and nonlinearly constrained minimization: Polak-Ribiere conjugate gradient (DAKOTA's optpp_cg method), quasi-Newton (DAKOTA's optpp_q_newton method), finite difference Newton (DAKOTA's optpp_fd_newton method), and full Newton (DAKOTA's optpp_newton method). The conjugate gradient method is strictly unconstrained, and each of the Newton-based methods are automatically bound to the appropriate OPT++ algorithm based on the user constraint specification (unconstrained, bound-constrained, or generally-constrained). In the generally-constrained case, the Newton methods use a nonlinear interior-point approach to manage the constraints. The library also contains a direct search algorithm, PDS (parallel direct search, DAKOTA's optpp_pds method), which supports bound constraints. DAKOTA provides access to the OPT++ library through the SNLLOptimizer class, where "SNLL" denotes Sandia National Laboratories - Livermore.

OPT++ method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during an OPT++ optimization. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. The output verbosity specification controls the amount of information generated from OPT++ executions: the debug setting turns on OPT++'s internal debug mode and also generates additional debugging information from DAKOTA's SNLLOptimizer wrapper class. OPT++'s gradient-based methods are not parallel algorithms and cannot directly take advantage of concurrent function evaluations. However, if numerical_gradients with method_source dakota is specified, a parallel DAKOTA configuration can utilize concurrent evaluations for the finite difference gradient computations. OPT++'s nongradient-based PDS method can directly exploit asynchronous evaluations; however, this capability has not yet been implemented in the SNLLOptimizer class.

The speculative specification enables speculative computation of gradient and/or Hessian information, where applicable, for parallel optimization studies. By speculating that the derivative information at the current point will be used later, the complete data set (all available gradient/Hessian information) can be computed on every function evaluation. While some of these computations will be wasted, the positive effects are a consistent parallel load balance and usually shorter wall clock time. The speculative specification is applicable only when parallelism in the gradient calculations can be exploited by DAKOTA (it will be ignored for vendor numerical gradients).

Lastly, linear constraint specifications are supported by each of the Newton methods (optpp_newton, optpp_q_newton, optpp_fd_newton, and optpp_g_newton); whereas optpp_cg must be unconstrained and optpp_pds can be, at most, bound-constrained. Specification detail for the method independent controls is provided in Tables 5.1 through 5.2.

OPT++ method dependent controls

OPT++'s method dependent controls are max_step, gradient_tolerance, search_method, merit_function, central_path, steplength_to_boundary, centering_parameter, and search_scheme_size. The max_step control specifies the maximum step that can be taken when computing a change in the current design point (e.g., limiting the Newton step computed from current gradient and Hessian information). It is equivalent to a move limit or a maximum trust region size. The gradient_tolerance control defines the threshold value on the L2 norm of the objective function gradient that indicates convergence to an unconstrained minimum (no active constraints). The gradient_tolerance control is defined for all gradient-based optimizers.

max_step and gradient_tolerance are the only method dependent controls for the OPT++ conjugate gradient method. Table 5.5 covers this specification.

Table 5.5 Specification detail for the OPT++ conjugate gradient method
Description Keyword Associated Data Status Default
OPT++ conjugate gradient method optpp_cg none Required N/A
Maximum step size max_step real Optional 1000.
Gradient tolerance gradient_tolerance real Optional 1.e-4

The search_method control is defined for all Newton-based optimizers and is used to select between trust_region, gradient_based_line_search, and value_based_line_search methods. The gradient_based_line_search option uses the line search method proposed by [More and Thuente, 1994]. This option satisfies sufficient decrease and curvature conditions; whereas, value_base_line_search only satisfies the sufficient decrease condition. At each line search iteration, the gradient_based_line_search method computes the function and gradient at the trial point. Consequently, given expensive function evaluations, the value_based_line_search method is preferred to the gradient_based_line_search method. Each of these Newton methods additionally supports the tr_pds selection for unconstrained problems. This option performs a robust trust region search using pattern search techniques. Use of a line search is the default for bound-constrained and generally-constrained problems, and use of a trust_region search method is the default for unconstrained problems.

The merit_function, central_path, steplength_to_boundary, and centering_parameter selections are additional specifications that are defined for the solution of generally-constrained problems with nonlinear interior-point algorithms. A merit_function is a function in constrained optimization that attempts to provide joint progress toward reducing the objective function and satisfying the constraints. Valid string inputs are "el_bakry", "argaez_tapia", or "van_shanno", where user input is not case sensitive in this case. Details for these selections are as follows:

If the function evaluation is expensive or noisy, set the merit_function to "argaez_tapia" or "van_shanno".

The central_path specification represents a measure of proximity to the central path and specifies an update strategy for the perturbation parameter mu. Refer to [Argaez et al., 2002] for a detailed discussion on proximity measures to the central region. Valid options are, again, "el_bakry", "argaez_tapia", or "van_shanno", where user input is not case sensitive. The default value for central_path is the value of merit_function (either user-selected or default). The steplength_to_boundary specification is a parameter (between 0 and 1) that controls how close to the boundary of the feasible region the algorithm is allowed to move. A value of 1 means that the algorithm is allowed to take steps that may reach the boundary of the feasible region. If the user wishes to maintain strict feasibility of the design parameters this value should be less than 1. Default values are .8, .99995, and .95 for the "el_bakry", "argaez_tapia", and "van_shanno" merit functions, respectively. The centering_parameter specification is a parameter (between 0 and 1) that controls how closely the algorithm should follow the "central path". See [Wright] for the definition of central path. The larger the value, the more closely the algorithm follows the central path, which results in small steps. A value of 0 indicates that the algorithm will take a pure Newton step. Default values are .2, .2, and .1 for the "el_bakry", "argaez_tapia", and "van_shanno" merit functions, respectively.

Table 5.6 provides the details for the Newton-based methods.

Table 5.6 Specification detail for OPT++ Newton-based optimization methods
Description Keyword Associated Data Status Default
OPT++ Newton-based methods optpp_q_newton | optpp_fd_newton | optpp_newton none Required group N/A
Search method value_based_line_search | gradient_based_line_search | trust_region | tr_pds none Optional group trust_region (unconstrained), value_based_line_search (bound/general constraints)
Maximum step size max_step real Optional 1000.
Gradient tolerance gradient_tolerance real Optional 1.e-4
Merit function merit_function string Optional "argaez_tapia"
Central path central_path string Optional value of merit_function
Steplength to boundary steplength_to_boundary real Optional Merit function dependent: 0.8 (el_bakry), 0.99995 (argaez_tapia), 0.95 (van_shanno)
Centering parameter centering_parameter real Optional Merit function dependent: 0.2 (el_bakry), 0.2 (argaez_tapia), 0.1 (van_shanno)

The search_scheme_size is defined for the PDS method to specify the number of points to be used in the direct search template. PDS does not support parallelism at this time due to current limitations in the OPT++ interface. Table 5.7 provides the detail for the parallel direct search method.

Table 5.7 Specification detail for the OPT++ PDS method
Description Keyword Associated Data Status Default
OPT++ parallel direct search method optpp_pds none Required group N/A
Search scheme size search_scheme_size integer Optional 32

Asynchronous Parallel Pattern Search (APPS)

The asynchronous parallel pattern search algorithm [Gray and Kolda, 2006] is a fully asynchronous pattern search technique in that the search along each offset direction continues without waiting for searches along other directions to finish. APPSPACK can handle unconstrained problems as well as those with bound constraints, linear constraints, and general nonlinear constraints. APPSPACK is available to the public under the GNU LGPL and the source code is included with DAKOTA. APPS-specific software documentation is available from http://software.sandia.gov/appspack.

APPS method independent controls

The only method independent controls that are currently mapped to APPS are max_function_evaluations, constraint_tolerance, and the output verbosity control. The APPS internal "debug" level is mapped to the DAKOTA debug, verbose, normal, quiet, and silent settings as follows:

APPS method dependent controls

The APPS method is invoked using a asynch_pattern_search group specification. Some of the method dependent controls are similar to the Coliny controls for coliny_pattern_search described in Pattern Search. In particular, APPS supports the following step length control parameters

When the solution to the optimization problem is known to be zero, the user may specify a value for solution_target as a termination criteria. APPS will terminate when the function value falls below solution_target.

Currently, APPS only supports coordinate bases with a total of 2n function evaluations in the pattern, and these patterns may only contract. The synchronization specification can be used to specify the use of either blocking or nonblocking schedulers for APPS. The blocking option causes APPS to behave as a synchronous algorithm. The nonblocking option is not available when Dakota is used in message-passing mode.

APPS solves nonlinearly constrained problems by solving a sequence of linearly constrained merit function-base subproblems. There are several exact and smoothed exact penalty functions that can be specified with the merit_function control. The options are as follows:

The user can also specify a constraint_penalty and smoothing_parameter. Table 5.8 summarizes the APPS specification.

Table 5.8 Specification detail for the APPS method
Description Keyword Associated Data Status Default
APPS method asynch_pattern_search none Required group N/A
Initial offset value initial_delta real Optional 1.0
Threshold for offset values threshold_delta real Optional 0.01
Pattern contraction factor contraction_factor real Optional 0.5
Solution target solution_target real Optional not used
Evaluation synchronization synchronization blocking | nonblocking Optional nonblocking
Merit function merit_function merit_max | merit_max_smooth | merit1 | merit1_smooth | merit2 | merit2_smooth | merit2_squared Optional merit2_smooth
Constraint penalty constraint_penalty real Optional 1.0
Smoothing factor smoothing_factor real Optional 1.0

Coliny Methods

Coliny is a collection of nongradient-based optimizers that support the Common Optimization Library INterface (COLIN). Coliny optimizers currently include coliny_cobyla, coliny_direct, coliny_ea, coliny_pattern_search and coliny_solis_wets. Additional Coliny information is available from http://software.sandia.gov/Acro/Coliny/.

Coliny solvers now support bound constraints and general nonlinear constraints. Supported nonlinear constraints include both equality and two-sided inequality constraints. Coliny solvers do not yet support linear constraints. Most Coliny optimizers treat constraints with a simple penalty scheme that adds constraint_penalty times the sum of squares of the constraint violations to the objective function. Specific exceptions to this method for handling constraint violations are noted below. (The default value of constraint_penalty is 1000.0, except for methods that dynamically adapt their constraint penalty, for which the default value is 1.0.)

Coliny method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during a Coliny optimization, respectively. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. The output verbosity specification controls the amount of information generated by Coliny: the silent, quiet, and normal settings correspond to minimal reporting from Coliny, whereas the verbose setting corresponds to a higher level of information, and debug outputs method initialization and a variety of internal Coliny diagnostics. The majority of Coliny's methods perform independent function evaluations that can directly take advantage of DAKOTA's parallel capabilities. Only coliny_solis_wets, coliny_cobyla, and certain configurations of coliny_pattern_search are inherently serial (see Pattern Search). The parallel methods automatically utilize parallel logic when the DAKOTA configuration supports parallelism. Lastly, neither speculative gradients nor linear constraints are currently supported with Coliny. Specification detail for method independent controls is provided in Tables 5.1 through 5.2.

Coliny method dependent controls

All Coliny methods support the show_misc_options optional specification which results in a dump of all the allowable method inputs. Note that the information provided by this command refers to optimizer parameters that are internal to Coliny, and which may differ from corresponding parameters used by the DAKOTA interface. The misc_options optional specification provides a means for inputing additional settings supported by the Coliny methods but which are not currently mapped through the DAKOTA input specification. Care must be taken in using this specification; they should only be employed by users familiar with the full range of parameter specifications available directly from Coliny and understand any differences that exist between those specifications and the ones available through DAKOTA.

Each of the Coliny methods supports the solution_accuracy control, which defines a convergence criterion in which the optimizer will terminate if it finds an objective function value lower than the specified accuracy. Specification detail for method dependent controls for all Coliny methods is provided in Table 5.9.

Table 5.9 Specification detail for Coliny method dependent controls
Description Keyword Associated Data Status Default
Show miscellaneous options show_misc_options none Optional no dump of specification options
Specify miscellaneous options misc_options list of strings Optional no miscellaneous options specified
Desired solution accuracy solution_accuracy real Optional -DBL_MAX

Each Coliny method supplements the settings of Table 5.9 with controls which are specific to its particular class of method.

COBYLA

The Constrained Optimization BY Linear Approximations (COBYLA) algorithm is an extension to the Nelder-Mead simplex algorithm for handling general linear/nonlinear constraints and is invoked using the coliny_cobyla group specification. The COBYLA algorithm employs linear approximations to the objective and constraint functions, the approximations being formed by linear interpolation at N+1 points in the space of the variables. We regard these interpolation points as vertices of a simplex. The step length parameter controls the size of the simplex and it is reduced automatically from initial_delta to threshold_delta. One advantage that COBYLA has over many of its competitors is that it treats each constraint individually when calculating a change to the variables, instead of lumping the constraints together into a single penalty function.

COBYLA currently only supports termination based on the max_function_evaluations and solution_accuracy specifications. The search performed by COBYLA is currently not parallelized.

Table 5.10 summarizes the COBYLA specification.

Table 5.10 Specification detail for the COBYLA method
Description Keyword Associated Data Status Default
COBYLA method coliny_cobyla none Required group N/A
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A

DIRECT

The DIviding RECTangles (DIRECT) optimization algorithm is a derivative free global optimization method that balances local search in promising regions of the design space with global search in unexplored regions. As shown in Figure 5.1, DIRECT adaptively subdivides the space of feasible design points so as to guarantee that iterates are generated in the neighborhood of a global minimum in finitely many iterations.

direct1.jpg

Figure 5.1 Design space partitioning with DIRECT

In practice, DIRECT has proven an effective heuristic for engineering design applications, for which it is able to quickly identify candidate solutions that can be further refined with fast local optimizers.

DIRECT uses the solution_accuracy, constraint_penalty and show_misc_options specifications that are described in Coliny method dependent controls. Note, however, that DIRECT uses a fixed penalty value for constraint violations (i.e. it is not dynamically adapted as is done in coliny_pattern_search).

The division specification determines how DIRECT subdivides each subregion of the search space. If division is set to major_dimension, then the dimension representing the longest edge of the subregion is subdivided (this is the default). If division is set to all_dimensions, then all dimensions are simultaneously subdivided.

Each subregion considered by DIRECT has a size, which corresponds to the longest diagonal of the subregion. The global_balance_parameter controls how much global search is performed by only allowing a subregion to be subdivided if the size of the subregion divided by the size of the largest subregion is at least global_balance_parameter. Intuitively, this forces large subregions to be subdivided before the smallest subregions are refined. The local_balance_parameter provides a tolerance for estimating whether the smallest subregion can provide a sufficient decrease to be worth subdividing; the default value is a small value that is suitable for most applications.

DIRECT can be terminated with the standard max_function_evaluations and solution_accuracy specifications. Additionally, the max_boxsize_limit specification terminates DIRECT if the size of the largest subregion falls below this threshold, and the min_boxsize_limit specification terminates DIRECT if the size of the smallest subregion falls below this threshold. In practice, this latter specification is likely to be more effective at limiting DIRECT's search.

Table 5.11 summarizes the DIRECT specification.

Table 5.11 Specification detail for the DIRECT method
Description Keyword Associated Data Status Default
DIRECT method coliny_direct none Required group N/A
Box subdivision approach division major_dimension | all_dimensions Optional group major_dimension
Global search balancing parameter global_balance_parameter real Optional 0.0
Local search balancing parameter local_balance_parameter real Optional 1.e-8
Maximum boxsize limit max_boxsize_limit real Optional 0.0
Minimum boxsize limit min_boxsize_limit real Optional 0.0001
Constraint penalty constraint_penalty real Optional 1000.0

Evolutionary Algorithms

DAKOTA currently provides several variants of evolutionary algorithms, invoked through the coliny_ea group specification.

The basic steps of an evolutionary algorithm are as follows:

  1. Select an initial population randomly and perform function evaluations on these individuals
  2. Perform selection for parents based on relative fitness
  3. Apply crossover and mutation to generate new_solutions_generated new individuals from the selected parents
  4. Perform function evaluations on the new individuals
  5. Perform replacement to determine the new population
  6. Return to step 2 and continue the algorithm until convergence criteria are satisfied or iteration limits are exceeded

Table 5.12 provides the specification detail for the controls for seeding the method, initializing a population, and for selecting and replacing population members.

Table 5.12 Specification detail for the Coliny EA method dependent controls: seed, initialization, selection, and replacement
Description Keyword Associated Data Status Default
EA selection coliny_ea none Required group N/A
Random seed seed integer Optional randomly generated seed
Number of population members population_size integer Optional 50
Initialization type initialization_type simple_random | unique_random | flat_file Required unique_random
Fitness type fitness_type linear_rank | merit_function Optional linear_rank
Replacement type replacement_type random | chc | elitist Optional group elitist = 1
Random replacement type random integer Required N/A
CHC replacement type chc integer Required N/A
Elitist replacement type elitist integer Required N/A
New solutions generated new_solutions_generated integer Optional population_size - replacement_size

The random seed control provides a mechanism for making a stochastic optimization repeatable. That is, the use of the same random seed in identical studies will generate identical results. The population_size control specifies how many individuals will comprise the EA's population.

The initialization_type defines the type of initialization for the population of the EA. There are three types: simple_random, unique_random, and flat_file. simple_random creates initial solutions with random variable values according to a uniform random number distribution. It gives no consideration to any previously generated designs. The number of designs is specified by the population_size. unique_random is the same as simple_random, except that when a new solution is generated, it is checked against the rest of the solutions. If it duplicates any of them, it is rejected. flat_file allows the initial population to be read from a flat file. If flat_file is specified, a file name must be given.

The fitness_type controls how strongly differences in "fitness" (i.e., the objective function) are weighted in the process of selecting "parents" for crossover:

The replacement_type controls how current populations and newly generated individuals are combined to create a new population. Each of the replacement_type selections accepts an integer value, which is referred to below and in Table 5.12 as the replacement_size:

Table 5.13 show the controls in the EA method associated with crossover and mutation.

Table 5.13 Specification detail for the Coliny EA method: crossover and mutation
Description Keyword Associated Data Status Default
Crossover type crossover_type two_point | blend | uniform Optional group two_point
Crossover rate crossover_rate real Optional 0.8
Mutation type mutation_type replace_uniform | offset_normal | offset_cauchy | offset_uniform Optional group offset_normal
Mutation scale mutation_scale real Optional 0.1
Mutation range mutation_range integer Optional 1
Mutation dimension ratio dimension_ratio real Optional 1.0
Mutation rate mutation_rate real Optional 1.0
Non-adaptive mutation flag non_adaptive none Optional Adaptive mutation

The crossover_type controls what approach is employed for combining parent genetic information to create offspring, and the crossover_rate specifies the probability of a crossover operation being performed to generate a new offspring. The Coliny EA method supports three forms of crossover, two_point, blend, and uniform, which generate a new individual through combinations of two parent individuals. Two-point crossover divides each parent into three regions, where offspring are created from the combination of the middle region from one parent and the end regions from the other parent. Since the Coliny EA does not utilize bit representations of variable values, the crossover points only occur on coordinate boundaries, never within the bits of a particular coordinate. Uniform crossover creates offspring through random combination of coordinates from the two parents. Blend crossover generates a new individual randomly along the multidimensional vector connecting the two parents.

The mutation_type controls what approach is employed in randomly modifying continuous design variables within the EA population. Each of the mutation methods generates coordinate-wise changes to individuals, usually by adding a random variable to a given coordinate value (an "offset" mutation), but also by replacing a given coordinate value with a random variable (a "replace" mutation). Discrete design variables are always mutated using the offset_uniform method. The mutation_rate controls the probability of mutation being performed on an individual, both for new individuals generated by crossover (if crossover occurs) and for individuals from the existing population. When mutation is performed, all dimensions of each individual are mutated. The mutation_scale specifies a scale factor which scales continuous mutation offsets; this is a fraction of the total range of each dimension, so mutation_scale is a relative value between 0 and 1. The mutation_range is used to control offset_uniform mutation used for discrete parameters. The replacement discrete value is the original value plus or minus an integer value up to mutation_range. The offset_normal, offset_cauchy, and offset_uniform mutation types are "offset" mutations in that they add a 0-mean random variable with a normal, cauchy, or uniform distribution, respectively, to the existing coordinate value. These offsets are limited in magnitude by mutation_scale. The replace_uniform mutation type is not limited by mutation_scale; rather it generates a replacement value for a coordinate using a uniformly distributed value over the total range for that coordinate.

The Coliny EA method uses self-adaptive mutation, which modifies the mutation scale dynamically. This mechanism is borrowed from EAs like evolution strategies. The non_adaptive flag can be used to deactivate the self-adaptation, which may facilitate a more global search.

Pattern Search

Pattern search techniques are nongradient-based optimization methods which use a set of offsets from the current iterate to locate improved points in the design space. The Coliny pattern search technique is invoked using a coliny_pattern_search group specification, which includes a variety of specification components.

Traditional pattern search methods search with a fixed pattern of search directions to try to find improvements to the current iterate. The Coliny pattern search methods generalize this simple algorithmic strategy to enable control of how the search pattern is adapted, as well as how each search pattern is evaluated. The stochastic and synchronization specifications denote how the the trial points are evaluated. The stochastic specification indicates that the trial points are considered in a random order. For parallel pattern search, synchronization dictates whether the evaluations are scheduled using a blocking scheduler or a nonblocking scheduler (i.e., Model::synchronize() or Model::synchronize_nowait(), respectively). In the blocking case, all points in the pattern are evaluated (in parallel), and if the best of these trial points is an improving point, then it becomes the next iterate. These runs are reproducible, assuming use of the same seed in the stochastic case. In the nonblocking case, all points in the pattern may not be evaluated, since the first improving point found becomes the next iterate. Since the algorithm steps will be subject to parallel timing variabilities, these runs will not generally be repeatable. The synchronization specification has similar connotations for sequential pattern search. If blocking is specified, then each sequential iteration terminates after all trial points have been considered, and if nonblocking is specified, then each sequential iteration terminates after the first improving trial point is evaluated.

The particular form of the search pattern is controlled by the pattern_basis specification. If pattern_basis is coordinate basis, then the pattern search uses a plus and minus offset in each coordinate direction, for a total of 2n function evaluations in the pattern. If pattern_basis is simplex, then pattern search uses a minimal positive basis simplex for the parameter space, for a total of n+1 function evaluations in the pattern. Note that the simplex pattern basis can be used for unbounded problems only. The total_pattern_size specification can be used to augment the basic coordinate and simplex patterns with additional function evaluations, and is particularly useful for parallel load balancing. For example, if some function evaluations in the pattern are dropped due to duplication or bound constraint interaction, then the total_pattern_size specification instructs the algorithm to generate new offsets to bring the total number of evaluations up to this consistent total.

The exploratory_moves specification controls how the search pattern is adapted. (The search pattern can be adapted after an improving trial point is found, or after all trial points in a search pattern have been found to be unimproving points.) The following exploratory moves selections are supported by Coliny:

The initial_delta and threshold_delta specifications provide the initial offset size and the threshold size at which to terminate the algorithm. For any dimension that has both upper and lower bounds, this step length will be internally rescaled to provide search steps of length initial_delta * range. This rescaling does not occur for other dimensions, so search steps in those directions have length initial_delta.

In general, pattern search methods can expand and contract their step lengths. Coliny pattern search methods contract the step length by the value contraction_factor, and they expand the step length by the value (1/contraction_factor). The expand_after_success control specifies how many successful objective function improvements must occur with a specific step length prior to expansion of the step length, whereas the no_expansion flag instructs the algorithm to forgo pattern expansion altogether.

Finally, constraint infeasibility can be managed in a somewhat more sophisticated manner than the simple weighted penalty function. If the constant_penalty specification is used, then the simple weighted penalty scheme described above is used. Otherwise, the constraint penalty is adapted to the value constraint_penalty/L, where L is the the smallest step length used so far.

Table 5.14 and Table 5.15 provide the specification detail for the Coliny pattern search method and its method dependent controls.

Table 5.14 Specification detail for the Coliny pattern search method: randomization, delta, and constraint controls
Description Keyword Associated Data Status Default
Coliny pattern search method coliny_pattern_search none Required group N/A
Stochastic pattern search stochastic none Optional group N/A
Random seed for stochastic pattern search seed integer Optional randomly generated seed
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A
Constraint penalty constraint_penalty real Optional 1.0
Control of dynamic penalty constant_penalty none Optional algorithm dynamically adapts the constraint penalty

Table 5.15 Specification detail for the Coliny pattern search method: pattern controls
Description Keyword Associated Data Status Default
Pattern basis selection pattern_basis coordinate | simplex Optional coordinate
Total number of points in pattern total_pattern_size integer Optional no augmentation of basic pattern
No expansion flag no_expansion none Optional algorithm may expand pattern size
Number of consecutive improvements before expansion expand_after_success integer Optional 1
Pattern contraction factor contraction_factor real Optional 0.5
Evaluation synchronization synchronization blocking | nonblocking Optional nonblocking
Exploratory moves selection exploratory_moves basic_pattern | multi_step | adaptive_pattern Optional basic_pattern

Solis-Wets

DAKOTA's implementation of Coliny also contains the Solis-Wets algorithm. The Solis-Wets method is a simple greedy local search heuristic for continuous parameter spaces. Solis-Wets generates trial points using a multivariate normal distribution, and unsuccessful trial points are reflected about the current point to find a descent direction. This algorithm is inherently serial and will not utilize any parallelism. Table 5.16 provides the specification detail for this method and its method dependent controls.

Table 5.16 Specification detail for the Coliny Solis-Wets method
Description Keyword Associated Data Status Default
Coliny Solis-Wets method coliny_solis_wets none Required group N/A
Random seed for stochastic pattern search seed integer Optional randomly generated seed
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A
No expansion flag no_expansion none Optional algorithm may expand pattern size
Number of consecutive improvements before expansion expand_after_success integer Optional 5
Number of consecutive failures before contraction contract_after_failure integer Optional 3
Pattern contraction factor contraction_factor real Optional 0.5
Constraint penalty constraint_penalty real Optional 1.0
Control of dynamic penalty constant_penalty none Optional algorithm dynamically adapts the constraint penalty

These specifications have the same meaning as corresponding specifications for coliny_pattern_search. In particular, coliny_solis_wets supports dynamic rescaling of the step length, and dynamic rescaling of the constraint penalty. The only new specification is contract_after_failure, which specifies the number of unsuccessful cycles which must occur with a specific delta prior to contraction of the delta.

NCSU Methods

North Carolina State University (NCSU) has an implementation of the DIRECT algorithm (DIviding RECTangles algorithm that is outlined in the Coliny method section above). This version is documented in [Gablonsky, 2001.] We have found that the NCSU DIRECT implementation works better and is more robust for some problems than coliny_direct. Currently, we maintain both versions of DIRECT in DAKOTA; in the future, we may deprecate one. The NCSU DIRECT method is selected with ncsu_direct. We have tried to maintain consistency between the keywords in COLINY and NCSU implementation of DIRECT, but the algorithms have different parameters, so the keywords sometimes have slightly different meaning.

NCSU method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of iterations and the number of function evaluations that can be performed during an NCSU DIRECT optimization.

NCSU method dependent controls

There are three specification controls for NCSU DIRECT: solution_accuracy, min_boxsize_limit, and volume_boxsize_limit. The solution accuracy specifies a percent error on the optimization. This is used for test problems, when the true global minimum is known (call it fglobal). Then, the optimization terminates when 100(f_min-fglobal)/max(1,abs(fglobal) < solution_accuracy. The default for fglobal is 1E100 and the default for solution accuracy is 0.

min_boxsize_limit is a setting that terminates the optimization when the measure of a hyperrectangle S with f(c(S)) = fmin is less than min_boxsize_limit. volume_boxsize_limit is a setting that terminates the optimization when the volume of a hyperrectangle S with f(c(S)) = fmin is less than volume_boxsize_limit percent of the original hyperrectangle. Basically, volume_boxsize_limit stops the optimization when the volume of the particular rectangle which has fmin is less than a certain percentage of the whole volume. Min_boxsize_limit uses an arbitrary measure to stop the optimization. The defaults for both of these are 1E-8. The keywords for NCSU DIRECT are described in Table 5.17 below.

Table 5.17 Specification detail for the NCSU DIRECT method
Description Keyword Associated Data Status Default
Solution Accuracy solution_accuracy real Optional 0
Min boxsize limit min_boxsize_limit real between [0,1] Optional 1.e-8
Volume boxsize limit vol_boxsize_limit real between [0,1] Optional 1.e-8

Surrogate-Based Methods

In surrogate-based optimization (SBO) and surrogate-based nonlinear least squares (SBNLS), minimization occurs using a set of one or more approximations, defined from a surrogate model, that are built and periodically updated using data from a "truth" model. The surrogate model can be a global data fit (e.g., regression or interpolation of data generated from a design of computer experiments), a multipoint approximation, a local Taylor Series expansion, or a model hierarchy approximation (e.g., a low-fidelity simulation model), whereas the truth model involves a high-fidelity simulation model. The goals of surrogate-based methods are to reduce the total number of truth model simulations and, in the case of global data fit surrogates, to smooth noisy data with an easily navigated analytic function.

Surrogate-Based Local Method

In the surrogate-based local (SBL) method, a trust region approach is used to manage the minimization process to maintain acceptable accuracy between the surrogate model and the truth model (by limiting the range over which the surrogate model is trusted). The process involves a sequence of minimizations performed on the surrogate model and bounded by the trust region. At the end of each approximate minimization, the candidate optimum point is validated using the truth model. If sufficient decrease has been obtained in the truth model, the trust region is re-centered around the candidate optimum point and the trust region will either shrink, expand, or remain the same size depending on the accuracy with which the surrogate model predicted the truth model decrease. If sufficient decrease has not been attained, the trust region center is not updated and the entire trust region shrinks by a user-specified factor. The cycle then repeats with the construction of a new surrogate model, a minimization, and another test for sufficient decrease in the truth model. This cycle continues until convergence is attained.

The surrogate_based_local method must specify an optimization or least squares sub-method either by pointer using approx_method_pointer (e.g., 'NLP1') or by name using approx_method_name (e.g., 'npsol_sqp'). The former identifies a full sub-method specification for the sub-problem minimizer (which allows non-default minimizer settings), whereas the latter supports a streamlined specification (that employs default minimizer settings). For both cases, the surrogate_based_local method specification is responsible for using its model_pointer (see Method Independent Controls) to select a surrogate model (see Surrogate Model Controls). Any model_pointer identified in an approximate sub-method specification is ignored.

SBL algorithm controls include max_iterations (the maximum number of SBL cycles allowed), convergence_tolerance (the relative tolerance used in internal SBL convergence assessments), soft_convergence_limit (a soft convergence control for the SBL iterations which limits the number of consecutive iterations with improvement less than the convergence tolerance), and truth_surrogate_bypass (a flag for bypassing all lower level surrogates when performing truth verifications on a top level surrogate). Table 5.18 summarizes these SBL inputs.

Table 5.18 Specification detail for surrogate-based local minimization method
Description Keyword Associated Data Status Default
Surrogate-based local method surrogate_based_local none Required group N/A
Approximate sub-problem minimization method pointer approx_method_pointer string Required (1 of 2 selections) N/A
Approximate sub-problem minimization method name approx_method_name string Required (1 of 2 selections) N/A
Maximum number of SBL iterations max_iterations integer Optional 100
Convergence tolerance for SBL iterations convergence_tolerance real Optional 1.e-4
Soft convergence limit for SBL iterations soft_convergence_limit integer Optional 5
Flag for bypassing lower level surrogates in truth verifications truth_surrogate_bypass none Optional no bypass

The trust_region optional group specification can be used to specify the initial size of the trust region (using initial_size) relative to the total variable bounds, the minimum size of the trust region (using minimum_size), the contraction factor for the trust region size (using contraction_factor) used when the surrogate model is performing poorly, and the expansion factor for the trust region size (using expansion_factor) used when the the surrogate model is performing well. Two additional commands are the trust region size contraction threshold (using contract_threshold) and the trust region size expansion threshold (using expand_threshold). These two commands are related to what is called the trust region ratio, which is the actual decrease in the truth model divided by the predicted decrease in the truth model in the current trust region. The command contract_threshold sets the minimum acceptable value for the trust region ratio, i.e., values below this threshold cause the trust region to shrink for the next SBL iteration. The command expand_threshold determines the trust region value above which the trust region will expand for the next SBL iteration. Table 5.19 summarizes these trust region inputs.

Table 5.19 Specification detail for trust region controls in surrogate-based local methods
Description Keyword Associated Data Status Default
Trust region group specification trust_region none Optional group N/A
Trust region initial size (relative to bounds) initial_size real Optional 0.4
Trust region minimum size minimum_size real Optional 1.e-6
Shrink trust region if trust region ratio is below this value contract_threshold real Optional 0.25
Expand trust region if trust region ratio is above this value expand_threshold real Optional 0.75
Trust region contraction factor contraction_factor real Optional 0.25
Trust region expansion factor expansion_factor real Optional 2.0

For SBL problems with nonlinear constraints, a number of algorithm formulations exist as described in [Eldred and Dunlavy, 2006] and as summarized in the Advanced Examples section of the Models chapter of the Users Manual. First, the "primary" functions (that is, the objective functions or least squares terms) in the approximate subproblem can be selected to be surrogates of the original primary functions (original_primary), a single objective function (single_objective) formed from the primary function surrogates, or either an augmented Lagrangian merit function (augmented_lagrangian_objective) or a Lagrangian merit function (lagrangian_objective) formed from the primary and secondary function surrogates. The former option may imply the use of a nonlinear least squares method, a multiobjective optimization method, or a single objective optimization method to solve the approximate subproblem, depending on the definition of the primary functions. The latter three options all imply the use of a single objective optimization method regardless of primary function definition. Second, the surrogate constraints in the approximate subproblem can be selected to be surrogates of the original constraints (original_constraints) or linearized approximations to the surrogate constraints (linearized_constraints), or constraints can be omitted from the subproblem (no_constraints). Following optimization of the approximate subproblem, the candidate iterate is evaluated using a merit function, which can be selected to be a simple penalty function with penalty ramped by SBL iteration number (penalty_merit), an adaptive penalty function where the penalty ramping may be accelerated in order to avoid rejecting good iterates which decrease the constraint violation (adaptive_penalty_merit), a Lagrangian merit function which employs first-order Lagrange multiplier updates (lagrangian_merit), or an augmented Lagrangian merit function which employs both a penalty parameter and zeroth-order Lagrange multiplier updates (augmented_lagrangian_merit). When an augmented Lagrangian is selected for either the subproblem objective or the merit function (or both), updating of penalties and multipliers follows the approach described in [Conn et al., 2000]. Following calculation of the merit function for the new iterate, the iterate is accepted or rejected and the trust region size is adjusted for the next SBL iteration. Iterate acceptance is governed either by a trust region ratio (tr_ratio) formed from the merit function values or by a filter method (filter); however, trust region resizing logic is currently based only on the trust region ratio. For infeasible iterates, constraint relaxation can be used for balancing constraint satisfaction and progress made toward an optimum. The command constraint_relax followed by a method name specifies the type of relaxation to be used. Currently, homotopy [Perez et al., 2004] is the only available method for constraint relaxation, and this method is dependent on the presence of the NPSOL library within the DAKOTA executable. Table 5.20 summarizes these constraint management inputs.

Table 5.20 Specification detail for constraint management in surrogate-based local methods
Description Keyword Associated Data Status Default
Approximate subproblem formulation approx_subproblem original_primary | single_objective | augmented_lagrangian_objective | lagrangian_objective
original_constraints | linearized_constraints | no_constraints
Optional group original_primary
original_constraints
SBL merit function merit_function penalty_merit | adaptive_penalty_merit | lagrangian_merit | augmented_lagrangian_merit Optional group augmented_lagrangian_merit
SBL iterate acceptance logic acceptance_logic tr_ratio | filter Optional group filter
SBL constraint relaxation method for infeasible iterates constraint_relax homotopy Optional group no relaxation

Surrogate-Based Global Method

The surrogate_based_global method differs from the surrogate_based_local method in a few ways. First, surrogate_based_global is not a trust region method. Rather, surrogate_based_global works in an iterative scheme where optimization is performed on a global surrogate using the same bounds during each iteration. In one iteration, the optimal solutions of the surrogate model are found, and then a selected set of these optimal surrogate solutions are passed to the next iteration. At the next iteration, these surrogate points are evaluated with the "truth" model, and then these points are added back to the set of points upon which the next surrogate is constructed. In this way, the optimization acts on a more accurate surrogate during each iteration, presumably driving to optimality quickly. This approach has no guarantee of convergence. It was originally designed for MOGA (a multi-objective genetic algorithm). Since genetic algorithms often need thousands or tens of thousands of points to produce optimal or near-optimal solutions, the use of surrogates can be helpful for reducing the truth model evaluations. Instead of creating one set of surrogates for the individual objectives and running the optimization algorithm on the surrogate once, the idea is to select points along the (surrogate) Pareto frontier, which can be used to supplement the existing points. In this way, one does not need to use many points initially to get a very accurate surrogate. The surrogate becomes more accurate as the iterations progress. Note that the user has the option of appending the optimal points from the surrogate model to the current set of truth points or using the optimal points from the surrogate model to replace the optimal set of points from the previous iteration. Although appending to the set is the default behavior, at this time we strongly recommend using the option replace_points because it appears to be more accurate and robust.

As for the surrogate_based_local method, the surrogate_based_global specification must identify a sub-method using either approx_method_pointer or approx_method_name and must identify a surrogate model (see Surrogate Model Controls) using its model_pointer (see Method Independent Controls). The only other algorithm control at this time is max_iterations (the maximum number of surrogate update cycles allowed). Table 5.21 summarizes these surrogate based global inputs.

Table 5.21 Specification detail for the surrogate-based global method
Description Keyword Associated Data Status Default
Surrogate-based global method surrogate_based_global none Required group N/A
Approximate sub-problem minimization method pointer approx_method_pointer string Required (1 of 2 selections) N/A
Approximate sub-problem minimization method name approx_method_name string Required (1 of 2 selections) N/A
Maximum number of surrogate update iterations max_iterations integer Optional 100
Replace points used in surrogate construction with best points from previous iteration replace_points none Optional Points appended, not replaced

We have two cautionary notes before using the surrogate-based global method:

Efficient Global Method

The Efficient Global Optimization (EGO) method was first developed by Jones, Schonlau, and Welch [Jones et al., 1998]. In EGO, a stochastic response surface approximation for the objective function is developed based on some sample points from the "true" simulation. The particular response surface used is a Gaussian process (GP). The GP allows one to calculate the prediction at a new input location as well as the uncertainty associated with that prediction. The key idea in EGO is to maximize the Expected Improvement Function (EIF). The EIF is used to select the location at which a new training point should be added to the Gaussian process model by maximizing the amount of improvem