Uncertain variables are continuous variables which are characterized by probability distributions. The distribution type can be normal, lognormal, uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet, weibull, or histogram. In addition to the uncertain variables defined by probability distributions, we have an uncertain variable type called interval, where the uncertainty in a variable is described by one or more interval values in which the variable may lie. The interval uncertain variable type is used in epistemic uncertainty calculations, specifically Dempster-Shafer theory of evidence.
Each uncertain variable specification contains descriptive tags and most contain, either explicitly or implicitly, distribution lower and upper bounds. Distribution lower and upper bounds are explicit portions of the normal, lognormal, uniform, loguniform, triangular, and beta specifications, whereas they are implicitly defined for histogram and interval variables from the extreme values within the bin/point/interval specifications. When used with design of experiments and multidimensional parameter studies, distribution bounds are also inferred for normal and lognormal (if optional bounds are unspecified) as well as for exponential, gamma, gumbel, frechet, and weibull (which have no bounds specification); these bounds are [0, $\mu + 3\sigma$] for exponential, gamma, frechet, weibull and unspecified lognormal, and [$\mu - 3\sigma$, $\mu + 3\sigma$] for gumbel and unspecified normal, where $\mu$ and $\sigma$ are the mean and standard deviation of the distribution. In addition to tags and bounds specifications, normal variables include mean and standard deviation specifications, lognormal variables include mean and either standard deviation or error factor specifications, triangular variables include mode specifications, exponential variables include beta specifications, beta, gamma, gumbel, frechet, and weibull variables include alpha and beta specifications, histogram variables include bin pairs and point pairs specifications, and interval variables include basic probability assignments per interval.
State variables can be continuous or discrete and consist of "other" variables which are to be mapped through the simulation interface. Each state variable specification can have an initial state, lower and upper bounds, and descriptors. State variables provide a convenient mechanism for parameterizing additional model inputs, such as mesh density, simulation convergence tolerances and time step controls, and can be used to enact model adaptivity in future strategy developments.
Several examples follow. In the first example, two continuous design variables are specified:
variables,
continuous_design = 2
initial_point 0.9 1.1
upper_bounds 5.8 2.9
lower_bounds 0.5 -2.9
descriptors 'radius' 'location'
In the next example, defaults are employed. In this case, initial_point will default to a vector of 0. values, upper_bounds will default to vector values of DBL_MAX (the maximum number representable in double precision for a particular platform, as defined in the platform's float.h C header file), lower_bounds will default to a vector of -DBL_MAX values, and descriptors will default to a vector of 'cdv_i' strings, where i ranges from one to two:
variables, continuous_design = 2
In the following example, the syntax for a normal-lognormal distribution is shown. One normal and one lognormal uncertain variable are completely specified by their means and standard deviations. In addition, the dependence structure between the two variables is specified using the uncertain_correlation_matrix.
variables,
normal_uncertain = 1
means = 1.0
std_deviations = 1.0
descriptors = 'TF1n'
lognormal_uncertain = 1
means = 2.0
std_deviations = 0.5
descriptors = 'TF2ln'
uncertain_correlation_matrix = 1.0 0.2
0.2 1.0
An example of the syntax for a state variables specification follows:
variables,
continuous_state = 1
initial_state 4.0
lower_bounds 0.0
upper_bounds 8.0
descriptors 'CS1'
discrete_state = 1
initial_state 104
lower_bounds 100
upper_bounds 110
descriptors 'DS1'
And in a more advanced example, a variables specification containing a set identifier, continuous and discrete design variables, normal and uniform uncertain variables, and continuous and discrete state variables is shown:
variables,
id_variables = 'V1'
continuous_design = 2
initial_point 0.9 1.1
upper_bounds 5.8 2.9
lower_bounds 0.5 -2.9
descriptors 'radius' 'location'
discrete_design = 1
initial_point 2
upper_bounds 3
lower_bounds 1
descriptors 'material'
normal_uncertain = 2
means = 248.89, 593.33
std_deviations = 12.4, 29.7
descriptors = 'TF1n' 'TF2n'
uniform_uncertain = 2
lower_bounds = 199.3, 474.63
upper_bounds = 298.5, 712.
descriptors = 'TF1u' 'TF2u'
continuous_state = 2
initial_state = 1.e-4 1.e-6
descriptors = 'EPSIT1' 'EPSIT2'
discrete_state = 1
initial_state = 100
descriptors = 'load_case'
Refer to the DAKOTA Users Manual [Eldred et al., 2007] for discussion on how different iterators view these mixed variable sets.
variables,
<set identifier>
<continuous design variables specification>
<discrete design variables specification>
<normal uncertain variables specification>
<lognormal uncertain variables specification>
<uniform uncertain variables specification>
<loguniform uncertain variables specification>
<triangular uncertain variables specification>
<exponential uncertain variables specification>
<beta uncertain variables specification>
<gamma uncertain variables specification>
<gumbel uncertain variables specification>
<frechet uncertain variables specification>
<weibull uncertain variables specification>
<histogram uncertain variables specification>
<interval uncertain variables specification>
<uncertain correlation specification>
<continuous state variables specification>
<discrete state variables specification>
Referring to dakota.input.txt, it is evident from the enclosing brackets that the set identifier specification, the uncertain correlation specification, and each of the variables specifications are all optional. The set identifier and uncertain correlation are stand-alone optional specifications, whereas the variables specifications are optional group specifications, meaning that the group can either appear or not as a unit. If any part of an optional group is specified, then all required parts of the group must appear.
The optional status of the different variable type specifications allows the user to specify only those variables which are present (rather than explicitly specifying that the number of a particular type of variables = 0). However, at least one type of variables must have nonzero size or an input error message will result. The following sections describe each of these specification components in additional detail.
The optional set identifier specification uses the keyword id_variables to input a unique string for use in identifying a particular variables set. A model can then identify the use of this variables set by specifying the same string in its variables_pointer specification (see Model Independent Controls). For example, a model whose specification contains variables_pointer = 'V1' will use a variables specification containing the set identifier id_variables = 'V1'.
If the id_variables specification is omitted, a particular variables set will be used by a model only if that model omits specifying a variables_pointer and if the variables set was the last set parsed (or is the only set parsed). In common practice, if only one variables set exists, then id_variables can be safely omitted from the variables specification and variables_pointer can be omitted from the model specification(s), since there is no potential for ambiguity in this case. Table 7.1 summarizes the set identifier inputs.
| Description | Keyword | Associated Data | Status | Default |
| Variables set identifier | id_variables | string | Optional | use of last variables parsed |
| Description | Keyword | Associated Data | Status | Default |
| Continuous design variables | continuous_design | integer | Optional group | no continuous design variables |
| Initial point | initial_point | list of reals | Optional | vector values = 0. |
| Lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Scaling types | scale_types | list of strings | Optional | vector values = 'none' |
| Scales | scales | list of reals | Optional | vector values = 1. (no scaling) |
| Descriptors | descriptors | list of strings | Optional | vector of 'cdv_i' where i = 1,2,3... |
| Description | Keyword | Associated Data | Status | Default |
| Discrete design variables | discrete_design | integer | Optional group | no discrete design variables |
| Initial point | initial_point | list of integers | Optional | vector values = 0 |
| Lower bounds | lower_bounds | list of integers | Optional | vector values = INT_MIN |
| Upper bounds | upper_bounds | list of integers | Optional | vector values = INT_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'ddv_i' where i = 1,2,3,... |
The initial_point specifications provide the point in design space from which an iterator is started for the continuous and discrete design variables, respectively. The lower_bounds and upper_bounds restrict the size of the feasible design space and are frequently used to prevent nonphysical designs. The scale_types specification includes strings specifying the scaling type for each component of the continuous design variables vector in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in scale_types may be selected from 'none', 'value', 'auto', or 'log', to select no, characteristic value, automatic, or logarithmic scaling, respectively. If a single string is specified it will apply to all components of the continuous design variables vector. Each entry in scales may be a user-specified nonzero real characteristic value to be used in scaling each variable component. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto' and 'log'. If a single real value is specified it will apply to all components of the continuous design variables vector. The descriptors specifications supply strings which will be replicated through the DAKOTA output to help identify the numerical values for these parameters. Default values for optional specifications are zeros for initial values, positive and negative machine limits for upper and lower bounds (+/- DBL_MAX, INT_MAX, INT_MIN from the float.h and limits.h system header files), and numbered strings for descriptors. 
As for linear and nonlinear inequality constraint bounds (see Method Independent Controls and Objective and constraint functions (optimization data set)), a nonexistent upper bound can be specified by using a value greater than the "big bound size" constant (1.e+30 for continuous design variables, 1.e+9 for discrete design variables) and a nonexistent lower bound can be specified by using a value less than the negation of these constants (-1.e+30 for continuous, -1.e+9 for discrete), although not all optimizers currently support this feature (e.g., DOT and CONMIN will treat these large bound values as actual variable bounds, but this should not be problematic in practice).
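The "big bound size" convention described above can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and not part of DAKOTA; the sketch only mirrors the stated rule that a value beyond the constant marks a bound as nonexistent.

```python
import sys

BIG_REAL_BOUND = 1.0e30  # "big bound size" for continuous design variables
BIG_INT_BOUND = 1.0e9    # "big bound size" for discrete design variables


def effective_bounds(lower, upper, big=BIG_REAL_BOUND):
    """Return (lower, upper), with None marking a nonexistent bound.

    Hypothetical helper: a lower bound below -big or an upper bound
    above +big is treated as absent rather than as an actual limit.
    """
    lo = None if lower < -big else lower
    hi = None if upper > big else upper
    return lo, hi


# The default lower bound of -DBL_MAX counts as "no lower bound".
print(effective_bounds(-sys.float_info.max, 5.8))
# A discrete variable with ordinary finite bounds keeps both of them.
print(effective_bounds(1, 3, big=BIG_INT_BOUND))
```

An optimizer that does not support this convention would instead use the large values as actual bounds, as noted above for DOT and CONMIN.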
The inclusion of lower and upper distribution bounds for all uncertain variable types (either explicitly defined, implicitly defined, or inferred; see Variables Description) allows the use of these variables with methods that rely on a bounded region to define a set of function evaluations (i.e., design of experiments and some parameter study methods). In addition, distribution bounds can be used to truncate the tails of distributions for normal and lognormal uncertain variables (see "bounded normal", "bounded lognormal", and "bounded lognormal-n" distribution types in [Wyss and Jorgensen, 1998]). Default upper and lower bounds are positive and negative machine limits (+/- DBL_MAX from the float.h system header file), respectively, for non-logarithmic distributions and positive machine limits and zeros, respectively, for logarithmic distributions. The uncertain variable descriptors provide strings which will be replicated through the DAKOTA output to help identify the numerical values for these parameters. Default values for descriptors are numbered strings. Tables 7.4 through 7.17 summarize the details of the uncertain variable specifications.
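The density function for the normal distribution is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_N} e^{-\frac{1}{2}\left(\frac{x-\mu_N}{\sigma_N}\right)^2}$$
where $\mu_N$ and $\sigma_N$ are the mean and standard deviation of the normal distribution, respectively.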
Note that if you specify bounds for a normal distribution, the sampling occurs from the underlying distribution with the given mean and standard deviation, but samples are not taken outside the bounds. This can result in the mean and the standard deviation of the sample data being different from the mean and standard deviation of the underlying distribution. For example, if you are sampling from a normal distribution with a mean of 5 and a standard deviation of 3, but you specify bounds of 1 and 7, the resulting mean of the samples will be around 4.3 and the resulting standard deviation will be around 1.6. This is because you have bounded the original distribution significantly, and asymmetrically, since 7 is closer to the original mean than 1.
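The numbers in this example can be checked with the closed-form moments of a truncated normal distribution. The following Python sketch is illustrative only: the function names are hypothetical, finite bounds are assumed, and it computes the moments analytically rather than reproducing DAKOTA's sampling.

```python
import math


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bounded_normal_moments(mean, std_dev, lower, upper):
    """Mean and standard deviation of a normal truncated to [lower, upper]."""
    a = (lower - mean) / std_dev
    b = (upper - mean) / std_dev
    mass = _Phi(b) - _Phi(a)  # probability retained inside the bounds
    shift = (_phi(a) - _phi(b)) / mass
    trunc_mean = mean + std_dev * shift
    trunc_var = std_dev ** 2 * (
        1.0 + (a * _phi(a) - b * _phi(b)) / mass - shift ** 2
    )
    return trunc_mean, math.sqrt(trunc_var)


m, s = bounded_normal_moments(5.0, 3.0, 1.0, 7.0)
print(round(m, 1), round(s, 1))  # prints 4.3 1.6
```

Because the upper bound cuts off more probability mass than the lower bound, the truncated mean shifts below 5, and removing both tails shrinks the standard deviation.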
| Description | Keyword | Associated Data | Status | Default |
| normal uncertain variables | normal_uncertain | integer | Optional group | no normal uncertain variables |
| normal uncertain means | means | list of reals | Required | N/A |
| normal uncertain standard deviations | std_deviations | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Distribution upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'nuv_i' where i = 1,2,3,... |
If the logarithm of an uncertain variable X has a normal distribution, then X is distributed with a lognormal distribution. The lognormal is often used to model time to perform some task. It can also be used to model variables which are the product of a large number of other quantities, by the Central Limit Theorem. Finally, the lognormal is used to model quantities which cannot have negative values. Within the lognormal uncertain optional group specification, the number of lognormal uncertain variables, the means, and either standard deviations or error factors must be specified, and the distribution lower and upper bounds and variable descriptors are optional specifications.
For the lognormal variables, DAKOTA's uncertainty quantification methods standardize on the use of statistics of the actual lognormal distribution, as opposed to statistics of the underlying normal distribution. This approach diverges from that of [Wyss and Jorgensen, 1998], which assumes that a specification of means and standard deviations provides parameters of the underlying normal distribution, whereas a specification of means and error factors provides statistics of the actual lognormal distribution. By binding the mean, standard deviation, and error factor parameters consistently to the actual lognormal distribution, inputs are more intuitive and require fewer conversions in most user applications. The conversion equations from lognormal mean $\mu$ and either lognormal error factor $\epsilon$ or lognormal standard deviation $\sigma$ to the mean $\lambda$ and standard deviation $\zeta$ of the underlying normal distribution are as follows:
$$\zeta = \frac{\ln(\epsilon)}{1.645}$$
$$\zeta^2 = \ln\left(\frac{\sigma^2}{\mu^2} + 1\right)$$
$$\lambda = \ln(\mu) - \frac{\zeta^2}{2}$$
Conversions from $\lambda$ and $\zeta$ back to $\mu$ and $\epsilon$ or $\sigma$ are as follows:
$$\mu = e^{\lambda + \frac{\zeta^2}{2}}$$
$$\sigma^2 = e^{2\lambda + \zeta^2}\left(e^{\zeta^2} - 1\right)$$
$$\epsilon = e^{1.645\,\zeta}$$
The density function for the lognormal distribution is:
$$f(x) = \frac{1}{\sqrt{2\pi}\,\zeta\,x} e^{-\frac{1}{2}\left(\frac{\ln x - \lambda}{\zeta}\right)^2}$$
| Description | Keyword | Associated Data | Status | Default |
| lognormal uncertain variables | lognormal_uncertain | integer | Optional group | no lognormal uncertain variables |
| lognormal uncertain means | means | list of reals | Required | N/A |
| lognormal uncertain standard deviations | std_deviations | list of reals | Required (1 of 2 selections) | N/A |
| lognormal uncertain error factors | error_factors | list of reals | Required (1 of 2 selections) | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Optional | vector values = 0. |
| Distribution upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'lnuv_i' where i = 1,2,3,... |
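The conversions between lognormal statistics and the parameters of the underlying normal distribution can be sketched in Python. This is an illustration under stated assumptions, not DAKOTA code: the function names are hypothetical, the mean and standard deviation relations are the standard lognormal moment identities, and the error factor is assumed to be the ratio of the 95th percentile to the median (hence the constant 1.645).

```python
import math

Z_95 = 1.645  # assumed: error factor = 95th percentile / median


def underlying_from_mean_std(mean, std_dev):
    """Lognormal mean and std deviation -> (lambda, zeta) of ln(X)."""
    zeta_sq = math.log((std_dev / mean) ** 2 + 1.0)
    lam = math.log(mean) - 0.5 * zeta_sq
    return lam, math.sqrt(zeta_sq)


def underlying_from_mean_error_factor(mean, error_factor):
    """Lognormal mean and error factor -> (lambda, zeta) of ln(X)."""
    zeta = math.log(error_factor) / Z_95
    lam = math.log(mean) - 0.5 * zeta ** 2
    return lam, zeta


def lognormal_from_underlying(lam, zeta):
    """(lambda, zeta) of ln(X) -> lognormal mean, std deviation, error factor."""
    mean = math.exp(lam + 0.5 * zeta ** 2)
    std_dev = mean * math.sqrt(math.exp(zeta ** 2) - 1.0)
    error_factor = math.exp(Z_95 * zeta)
    return mean, std_dev, error_factor


# Round trip for the lognormal variable 'TF2ln' used in the earlier example.
lam, zeta = underlying_from_mean_std(2.0, 0.5)
mean, std_dev, error_factor = lognormal_from_underlying(lam, zeta)
print(lam, zeta, mean, std_dev, error_factor)
```

The round trip recovers the input mean and standard deviation, which is a quick consistency check on the two sets of relations.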
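The density function for the uniform distribution is:
$$f(x) = \frac{1}{U_U - L_U}$$
where $U_U$ and $L_U$ are the upper and lower bounds of the uniform distribution, respectively. The mean of the uniform distribution is $\frac{U_U + L_U}{2}$ and the variance is $\frac{(U_U - L_U)^2}{12}$. Note that this distribution is a special case of the more general beta distribution.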
| Description | Keyword | Associated Data | Status | Default |
| uniform uncertain variables | uniform_uncertain | integer | Optional group | no uniform uncertain variables |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'uuv_i' where i = 1,2,3,... |
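If the logarithm of an uncertain variable X has a uniform distribution, then X is distributed with a loguniform distribution. Within the loguniform uncertain optional group specification, the number of loguniform uncertain variables and the distribution lower and upper bounds are required specifications, and variable descriptors is an optional specification. The loguniform distribution has the density function:
$$f(x) = \frac{1}{x\,(\ln U_{LU} - \ln L_{LU})}$$
for $L_{LU} \le x \le U_{LU}$, where $L_{LU}$ and $U_{LU}$ are the lower and upper bounds of the loguniform distribution.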
| Description | Keyword | Associated Data | Status | Default |
| loguniform uncertain variables | loguniform_uncertain | integer | Optional group | no loguniform uncertain variables |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'luuv_i' where i = 1,2,3,... |
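The density function for the triangular distribution is:
$$f(x) = \frac{2(x - L_T)}{(U_T - L_T)(M_T - L_T)}$$
if $L_T \le x \le M_T$, and
$$f(x) = \frac{2(U_T - x)}{(U_T - L_T)(U_T - M_T)}$$
if $M_T \le x \le U_T$, and 0 elsewhere. In these equations, $L_T$ is the lower bound, $U_T$ is the upper bound, and $M_T$ is the mode of the triangular distribution.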
| Description | Keyword | Associated Data | Status | Default |
| triangular uncertain variables | triangular_uncertain | integer | Optional group | no triangular uncertain variables |
| triangular uncertain modes | modes | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'tuv_i' where i = 1,2,3,... |
The density function for the exponential distribution is given by:
$$f(x) = \frac{1}{\beta} e^{-\frac{x}{\beta}}$$
where $\mu_E = \beta$ and $\sigma_E^2 = \beta^2$. Note that this distribution is a special case of the more general gamma distribution.
| Description | Keyword | Associated Data | Status | Default |
| exponential uncertain variables | exponential_uncertain | integer | Optional group | no exponential uncertain variables |
| exponential uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'euv_i' where i = 1,2,3,... |
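The density function for the beta distribution is:
$$f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \frac{(x - L_B)^{\alpha-1}(U_B - x)^{\beta-1}}{(U_B - L_B)^{\alpha+\beta-1}}$$
where $\Gamma(\alpha)$ is the gamma function and $B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$ is the beta function. To calculate mean and standard deviation from the alpha, beta, upper bound, and lower bound parameters of the beta distribution, the following expressions may be used.
$$\mu_B = L_B + \frac{\alpha}{\alpha+\beta}(U_B - L_B)$$
$$\sigma_B^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}(U_B - L_B)^2$$
Solving these for $\alpha$ and $\beta$ gives:
$$\alpha = (\mu_B - L_B)\frac{(\mu_B - L_B)(U_B - \mu_B) - \sigma_B^2}{\sigma_B^2 (U_B - L_B)}$$
$$\beta = (U_B - \mu_B)\frac{(\mu_B - L_B)(U_B - \mu_B) - \sigma_B^2}{\sigma_B^2 (U_B - L_B)}$$
Note that the uniform distribution is a special case of this distribution for parameters $\alpha = \beta = 1$.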
| Description | Keyword | Associated Data | Status | Default |
| beta uncertain variables | beta_uncertain | integer | Optional group | no beta uncertain variables |
| beta uncertain alphas | alphas | list of reals | Required | N/A |
| beta uncertain betas | betas | list of reals | Required | N/A |
| Distribution lower bounds | lower_bounds | list of reals | Required | N/A |
| Distribution upper bounds | upper_bounds | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'buv_i' where i = 1,2,3,... |
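The relationship between the alpha and beta parameters and the mean and standard deviation of a bounded beta distribution can be sketched in Python. The function names are hypothetical, and the formulas are the standard moment relations for a beta distribution rescaled to [lower, upper]; treat this as an illustration rather than DAKOTA's implementation.

```python
def beta_moments(alpha, beta, lower, upper):
    """Mean and standard deviation of a beta distribution on [lower, upper]."""
    width = upper - lower
    total = alpha + beta
    mean = lower + alpha / total * width
    variance = alpha * beta / (total ** 2 * (total + 1.0)) * width ** 2
    return mean, variance ** 0.5


def beta_parameters(mean, std_dev, lower, upper):
    """Invert beta_moments: recover (alpha, beta) from mean and std deviation."""
    common = ((mean - lower) * (upper - mean) - std_dev ** 2) / (
        std_dev ** 2 * (upper - lower)
    )
    return (mean - lower) * common, (upper - mean) * common


# alpha = beta = 1 is the uniform special case; on [0, 12] the mean is 6
# and the variance is 12**2 / 12 = 12.
mean, std_dev = beta_moments(1.0, 1.0, 0.0, 12.0)
alpha, beta = beta_parameters(mean, std_dev, 0.0, 12.0)
print(mean, std_dev ** 2, alpha, beta)
```

The inversion is useful when a beta variable is known by its mean and spread but the input specification requires alphas and betas.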
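The density function for the gamma distribution is given by:
$$f(x) = \frac{x^{\alpha-1} e^{-\frac{x}{\beta}}}{\beta^{\alpha}\,\Gamma(\alpha)}$$
where $\mu_{GA} = \alpha\beta$ and $\sigma_{GA}^2 = \alpha\beta^2$. Note that the exponential distribution is a special case of this distribution for parameter $\alpha = 1$.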
| Description | Keyword | Associated Data | Status | Default |
| gamma uncertain variables | gamma_uncertain | integer | Optional group | no gamma uncertain variables |
| gamma uncertain alphas | alphas | list of reals | Required | N/A |
| gamma uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'gauv_i' where i = 1,2,3,... |
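The density function for the Gumbel distribution is given by:
$$f(x) = \alpha\, e^{-\alpha(x-\beta)} \exp\left(-e^{-\alpha(x-\beta)}\right)$$
where $\mu_{GU} = \beta + \frac{0.5772}{\alpha}$ and $\sigma_{GU} = \frac{\pi}{\sqrt{6}\,\alpha}$.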
| Description | Keyword | Associated Data | Status | Default |
| gumbel uncertain variables | gumbel_uncertain | integer | Optional group | no gumbel uncertain variables |
| gumbel uncertain alphas | alphas | list of reals | Required | N/A |
| gumbel uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'guuv_i' where i = 1,2,3,... |
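The density function for the Frechet distribution is:
$$f(x) = \frac{\alpha}{\beta}\left(\frac{\beta}{x}\right)^{\alpha+1} e^{-\left(\frac{\beta}{x}\right)^{\alpha}}$$
where $\mu_F = \beta\,\Gamma(1-\frac{1}{\alpha})$ and $\sigma_F^2 = \beta^2[\Gamma(1-\frac{2}{\alpha})-\Gamma^2(1-\frac{1}{\alpha})]$.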
| Description | Keyword | Associated Data | Status | Default |
| frechet uncertain variables | frechet_uncertain | integer | Optional group | no frechet uncertain variables |
| frechet uncertain alphas | alphas | list of reals | Required | N/A |
| frechet uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'fuv_i' where i = 1,2,3,... |
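The density function for the Weibull distribution is given by:
$$f(x) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha-1} e^{-\left(\frac{x}{\beta}\right)^{\alpha}}$$
where $\mu_W = \beta\,\Gamma(1+\frac{1}{\alpha})$ and $\sigma_W = \sqrt{\frac{\Gamma(1+\frac{2}{\alpha})}{\Gamma^2(1+\frac{1}{\alpha})} - 1}\;\mu_W$.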
| Description | Keyword | Associated Data | Status | Default |
| weibull uncertain variables | weibull_uncertain | integer | Optional group | no weibull uncertain variables |
| weibull uncertain alphas | alphas | list of reals | Required | N/A |
| weibull uncertain betas | betas | list of reals | Required | N/A |
| Descriptors | descriptors | list of strings | Optional | vector of 'wuv_i' where i = 1,2,3,... |
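The mean and standard deviation implied by a pair of Weibull alpha and beta values can be evaluated with the gamma function. The Python sketch below assumes that alpha is the shape parameter and beta the scale parameter, that is, a density of the form (alpha/beta) (x/beta)^(alpha-1) exp(-(x/beta)^alpha); the function name is hypothetical.

```python
import math


def weibull_moments(alpha, beta):
    """Mean and standard deviation for shape alpha and scale beta (assumed)."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    mean = beta * g1
    std_dev = mean * math.sqrt(g2 / g1 ** 2 - 1.0)
    return mean, std_dev


# With alpha = 1 the Weibull reduces to an exponential distribution,
# whose mean and standard deviation both equal beta.
print(weibull_moments(1.0, 2.0))
```

The same pattern applies to the Frechet and Gumbel variables, with their own moment expressions in place of the two gamma function evaluations.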
For the histogram uncertain variable specification, the bin pairs and point pairs specifications provide sets of (x,y) pairs for each histogram variable. The distinction between the two types is that the former specifies counts for bins of non-zero width, whereas the latter specifies counts for individual point values, which can be thought of as bins with zero width. In the terminology of LHS [Wyss and Jorgensen, 1998], the former is a "continuous linear histogram" and the latter is a "discrete histogram" (although the points are real-valued, the number of possible values is finite). To fully specify a bin-based histogram with n bins where the bins can be of unequal width, n+1 (x,y) pairs must be specified with the following features:
- x is the parameter value for the left boundary of a histogram bin and y is the corresponding count for that bin.
- The final pair specifies the right end of the last bin and must have a y value of zero.
- x values must be strictly increasing.
- y values must be positive, except for the last which must be zero.
- A minimum of two (x,y) pairs must be specified for each bin-based histogram.

Similarly, to specify a point-based histogram with n points, n (x,y) pairs must be specified with the following features:

- x is the point value and y is the corresponding count for that value.
- x values must be strictly increasing.
- y values must be positive.
- A minimum of one (x,y) pair must be specified for each point-based histogram.

For both cases, the number of pairs specifications associate multiple sets of (x,y) pairs with individual histogram variables. For example, in the following specification
histogram_uncertain = 3
num_bin_pairs = 3 4
bin_pairs = 5 17 8 21 10 0 .1 12 .2 24 .3 12 .4 0
num_point_pairs = 2
point_pairs = 3 1 4 1
num_bin_pairs associates the first 3 pairs from bin_pairs ((5,17),(8,21),(10,0)) with one bin-based histogram variable and the following set of 4 pairs ((.1,12),(.2,24),(.3,12),(.4,0)) with a second bin-based histogram variable. Likewise, num_point_pairs associates both of the (x,y) pairs from point_pairs ((3,1),(4,1)) with a single point-based histogram variable. Finally, the total number of bin-based variables and point-based variables must add to the total number of histogram variables specified (3 in this example).
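The association of flat pair lists with individual histogram variables can be sketched in Python. This is a hypothetical illustration of the bookkeeping described above, not DAKOTA's parser; the function names are assumptions, and only the ordering and sign rules stated in this section are checked.

```python
def split_pairs(counts, flat_values):
    """Group a flat 'x y x y ...' list into one list of (x, y) pairs per variable."""
    if 2 * sum(counts) != len(flat_values):
        raise ValueError("pair counts do not match the number of values")
    groups, start = [], 0
    for n in counts:
        chunk = flat_values[start:start + 2 * n]
        groups.append(list(zip(chunk[0::2], chunk[1::2])))
        start += 2 * n
    return groups


def check_bin_pairs(pairs):
    """Check the bin-based rules: increasing x, positive y, final y of zero."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    if any(b <= a for a, b in zip(xs, xs[1:])):
        raise ValueError("x values must be strictly increasing")
    if any(y <= 0 for y in ys[:-1]) or ys[-1] != 0:
        raise ValueError("y values must be positive except for a final zero")


bins = split_pairs([3, 4], [5, 17, 8, 21, 10, 0, .1, 12, .2, 24, .3, 12, .4, 0])
points = split_pairs([2], [3, 1, 4, 1])
for variable in bins:
    check_bin_pairs(variable)

# Two bin-based variables plus one point-based variable give the three
# histogram variables declared by histogram_uncertain = 3.
print(bins)
print(points)
print(len(bins) + len(points))
```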
| Description | Keyword | Associated Data | Status | Default |
| histogram uncertain variables | histogram_uncertain | integer | Optional group | no histogram uncertain variables |
| number of (x,y) pairs for each bin-based histogram variable | num_bin_pairs | list of integers | Optional group | no bin-based histogram uncertain variables |
| (x,y) pairs for all bin-based histogram variables | bin_pairs | list of reals | Optional group | no bin-based histogram uncertain variables |
| number of (x,y) pairs for each point-based histogram variable | num_point_pairs | list of integers | Optional group | no point-based histogram uncertain variables |
| (x,y) pairs for all point-based histogram variables | point_pairs | list of reals | Optional group | no point-based histogram uncertain variables |
| Descriptors | descriptors | list of strings | Optional | vector of 'huv_i' where i = 1,2,3,... |
For more information on epistemic uncertainty analysis with interval uncertain variables, see nond_evidence in the Methods section of this Reference Manual. As an example, in the following specification:
interval_uncertain = 2
num_intervals = 3 2
interval_probs = 0.2 0.5 0.3 0.4 0.6
interval_bounds = 2 2.5 4 5 4.5 6 1.0 5.0 3.0 5.0
there are 2 interval uncertain variables. The first one is defined by three intervals, and the second by two intervals. The three intervals for the first variable have basic probability assignments of 0.2, 0.5, and 0.3, respectively, while the basic probability assignments for the two intervals for the second variable are 0.4 and 0.6. The basic probability assignments for each interval variable must sum to one. The interval bounds for the first variable are [2, 2.5], [4, 5], and [4.5, 6]. Note that the lower bound must always come first in the bound pair. Also note that the intervals can be overlapping. The interval bounds for the second variable are [1.0, 5.0] and [3.0, 5.0]. Table 7.16 summarizes the specification details for the interval_uncertain variable.
| Description | Keyword | Associated Data | Status | Default |
| interval uncertain variables | interval_uncertain | integer | Optional group | no interval uncertain variables |
| number of intervals defined for each interval variable | num_intervals | list of integers | Required group | None |
| basic probability assignments per interval | interval_probs | list of reals | Required group. Note that the probabilities per variable must sum to one. | None |
| bounds per interval | interval_bounds | list of reals | Required group. Specify bounds as (lower, upper) per interval, per variable | None |
| Descriptors | descriptors | list of strings | Optional | vector of 'iuv_i' where i = 1,2,3,... |
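The way num_intervals partitions the flat interval_probs and interval_bounds lists can be sketched in Python. The function name and the tolerance are assumptions for illustration; the checks mirror the rules stated above (basic probability assignments sum to one per variable, lower bound first in each pair, overlapping intervals allowed).

```python
def build_interval_variables(num_intervals, interval_probs, interval_bounds, tol=1e-9):
    """Return, per variable, a list of ((lower, upper), probability) entries."""
    total = sum(num_intervals)
    if len(interval_probs) != total or len(interval_bounds) != 2 * total:
        raise ValueError("list lengths do not match num_intervals")
    variables, start = [], 0
    for n in num_intervals:
        probs = interval_probs[start:start + n]
        flat = interval_bounds[2 * start:2 * (start + n)]
        bounds = list(zip(flat[0::2], flat[1::2]))
        if abs(sum(probs) - 1.0) > tol:
            raise ValueError("basic probability assignments must sum to one")
        if any(lo > hi for lo, hi in bounds):
            raise ValueError("lower bound must come first in each pair")
        variables.append(list(zip(bounds, probs)))
        start += n
    return variables


intervals = build_interval_variables(
    [3, 2],
    [0.2, 0.5, 0.3, 0.4, 0.6],
    [2, 2.5, 4, 5, 4.5, 6, 1.0, 5.0, 3.0, 5.0],
)
for variable in intervals:
    print(variable)
```

No overlap check is performed, since overlapping intervals such as [4, 5] and [4.5, 6] are valid input.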
Uncertain variables may have correlations specified through use of an uncertain_correlation_matrix specification. This specification is generalized in the sense that its specific meaning depends on the nondeterministic method in use. When the method is a nondeterministic sampling method (i.e., nond_sampling), then the correlation matrix specifies rank correlations [Iman and Conover, 1982]. When the method is instead a reliability (i.e., nond_local_reliability or nond_global_reliability) or polynomial chaos (i.e., nond_polynomial_chaos) method, then the correlation matrix specifies correlation coefficients (normalized covariance) [Haldar and Mahadevan, 2000]. In either of these cases, specifying the identity matrix results in uncorrelated uncertain variables (the default). The matrix input should be symmetric and have all $n^2$ entries, where n is the total number of uncertain variables (all normal, lognormal, uniform, loguniform, weibull, and histogram specifications, in that order). Table 7.17 summarizes the specification details:
| Description | Keyword | Associated Data | Status | Default |
| correlations in uncertain variables | uncertain_correlation_matrix | list of reals | Optional | identity matrix (uncorrelated) |
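A flat uncertain_correlation_matrix list can be checked before use with a short Python sketch. The function name and tolerance are hypothetical; the checks cover the requirement that all n-squared entries be present and symmetric, plus the unit diagonal and [-1, 1] range that any correlation matrix has.

```python
def build_correlation_matrix(values, n, tol=1e-12):
    """Reshape a flat row-major list into an n x n correlation matrix."""
    if len(values) != n * n:
        raise ValueError("expected all n*n entries")
    matrix = [list(values[i * n:(i + 1) * n]) for i in range(n)]
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            raise ValueError("diagonal entries must equal one")
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                raise ValueError("matrix must be symmetric")
            if abs(matrix[i][j]) > 1.0:
                raise ValueError("correlations must lie in [-1, 1]")
    return matrix


# The two-variable normal/lognormal example from earlier in this section.
corr = build_correlation_matrix([1.0, 0.2, 0.2, 1.0], 2)
print(corr)
```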
| Description | Keyword | Associated Data | Status | Default |
| Continuous state variables | continuous_state | integer | Optional group | No continuous state variables |
| Initial states | initial_state | list of reals | Optional | vector values = 0. |
| Lower bounds | lower_bounds | list of reals | Optional | vector values = -DBL_MAX |
| Upper bounds | upper_bounds | list of reals | Optional | vector values = +DBL_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'csv_i' where i = 1,2,3,... |
| Description | Keyword | Associated Data | Status | Default |
| Discrete state variables | discrete_state | integer | Optional group | No discrete state variables |
| Initial states | initial_state | list of integers | Optional | vector values = 0 |
| Lower bounds | lower_bounds | list of integers | Optional | vector values = INT_MIN |
| Upper bounds | upper_bounds | list of integers | Optional | vector values = INT_MAX |
| Descriptors | descriptors | list of strings | Optional | vector of 'dsv_i' where i = 1,2,3,... |
The initial_state specifications define the initial values for the continuous and discrete state variables which will be passed through to the simulator (e.g., in order to define parameterized modeling controls). The lower_bounds and upper_bounds restrict the size of the state parameter space and are frequently used to define a region for design of experiments or parameter study investigations. The descriptors specifications provide strings which will be replicated through the DAKOTA output to help identify the numerical values for these parameters. Default values for optional specifications are zeros for initial states, positive and negative machine limits for upper and lower bounds (+/- DBL_MAX, INT_MAX, INT_MIN from the float.h and limits.h system header files), and numbered strings for descriptors.