# Distribution Functions

## Random(*dist, method, Over*)

Random() returns a single uniformly-distributed random number.

Random is not a distribution function per se, in the way that Uniform(0, 1) is. However, one often needs access to a random number generator stream, for example for rejection sampling or Metropolis-Hastings simulation. Random() makes it possible to get such values even if the global sampling method is Latin hypercube, and it does so efficiently because it does not need to generate an entire sample.
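To illustrate why a stream of single random numbers is useful, here is a minimal rejection-sampling sketch. It is written in Python rather than Analytica, purely as an illustration: `rng.random()` plays the role that `Random()` plays in a model, and the target density f(x) = 2x on [0, 1] and the function name are choices made for this example only.

```python
import random


def draw_by_rejection(rng):
    """Draw one value with density f(x) = 2x on [0, 1] by rejection sampling.

    Proposal: uniform on [0, 1). Envelope constant: M = 2, since f(x) <= 2.
    """
    while True:
        x = rng.random()  # candidate value from the proposal
        u = rng.random()  # independent draw for the acceptance test
        if u * 2.0 <= 2.0 * x:  # accept with probability f(x) / M
            return x


rng = random.Random(12345)  # fixed seed so the run is repeatable
draws = [draw_by_rejection(rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)  # the exact mean of f is 2/3
```

Each accepted value consumes a varying number of uniform draws, which is why this kind of algorithm needs one number at a time rather than a pre-generated sample.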

Random() always returns a random value, even if evaluated in Mid-mode.

Numerous optional parameters provide additional conveniences.

Declaration:

**Random**(dist: optional unevaluated; method: optional scalar; Over:...optional atomic)

Parameters:

- «dist»
  - If specified, must be an explicit call to a distribution function that supports single-sample generation (see below). Defaults to Uniform(0, 1).
- «method»
  - Selects the algorithm used to generate the random number.
  - Possible values are:
    - `0` = use system default
    - `1` = Minimal standard
    - `2` = L'Ecuyer
    - `3` = Knuth
- «Over»
  - A convenient way to list the indexes over which independent random numbers will be generated. (This also happens if the indexes occur in any of the other parameters.)

Examples:

`Random(Uniform(-100, 100))`

- Returns a single real-valued random number uniformly selected between -100 and 100.

`Random(Uniform(1, 100, integer: true))`

- Returns a random integer between 1 and 100 inclusive.

`Random(Normal(0, 1))`

- Returns a single number from a standard normal distribution.

`Random(Over: I)`

- Returns an array of independent uniform random numbers between 0 and 1 indexed by `I`. The numbers are independent (i.e., Monte Carlo sampled, never Latin Hypercube).

`Random(Over: I, J)`

- Returns a 2-D array of independent uniform random numbers between 0 and 1, indexed by `I` and `J`. All numbers in the array are sampled independently.

`Random(Uniform(min: Array(I, J, 0), max: 1))`

- This is functionally equivalent to the preceding example. It shows that the «Over» parameter is only a convenience, but one that makes the expression easier to read.

### Distribution Function Support for Single Samples

To support single-sample generation, and thus be permitted as a parameter to Random, a distribution function must have a parameter named «singleMethod», usually declared as `singleSampleMethod: optional atomic numeric`.

When the parameter is provided, the distribution function must return a single random variate from the distribution indicated by the other parameters. Random will fill in this parameter with one of the following values, indicating which sampling method should be used:

Possible values for «singleMethod»:

- `0` = use default method
- `1` = use Minimal standard
- `2` = use L'Ecuyer
- `3` = use Knuth

As an example, consider what happens when `Random(Normal(2, 3))` is evaluated. The Random function checks that its parameter is an acceptable distribution function, and then it evaluates:

`Normal(2, 3, singleSampleMethod: 0)`

The Normal function then returns a single random variate from `Normal(2, 3)`.

Only some built-in functions currently support single variate generation. These include:

User-defined functions can support single-variate generation, and therefore can be used as a parameter to Random, if they have a parameter named «singleMethod».
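
The hand-off between Random and a distribution function can be pictured with the following Python sketch. It is an analogy only, not Analytica code: the names `normal`, `random_single`, and `single_method` are hypothetical, and the method code is accepted but ignored because this sketch has only one generator available.

```python
import inspect
import random


def normal(mean, stddev, sample_size=100, single_method=None):
    """Hypothetical distribution function.

    Without single_method it returns a whole sample (a list).
    With single_method set, it returns one random variate instead.
    """
    if single_method is not None:
        # The method code (0..3) would select a generator; ignored in this sketch.
        return random.gauss(mean, stddev)
    return [random.gauss(mean, stddev) for _ in range(sample_size)]


def random_single(dist_fn, *args, method=0, **kwargs):
    """Stand-in for Random: check for single-sample support, then fill in the method."""
    params = inspect.signature(dist_fn).parameters
    if "single_method" not in params:
        raise TypeError(f"{dist_fn.__name__} does not support single-sample generation")
    return dist_fn(*args, single_method=method, **kwargs)


value = random_single(normal, 2, 3)  # analogous to Random(Normal(2, 3))
```

The caller never passes the extra parameter explicitly; the wrapper checks for it and supplies it, which mirrors the description above.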

## Shuffle

Shuffle returns a random permutation of the values in an array along a given index.

The declaration is

**Shuffle**(A: Array[I]; I: IndexType)

If «A» contains dimensions other than «I», each slice of those dimensions will be independently shuffled.

If you wanted to shuffle an array along «I», but have the same shuffling apply to all slices along other dimensions, you could do this:

`Slice(A, I, Shuffle(@I, I))`
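
The difference between the two behaviors can be sketched in Python, treating a 2-D array as a list of rows where positions within a row correspond to «I». This is an illustration of the semantics described above, not Analytica code, and both function names are made up for the example.

```python
import random


def shuffle_each_slice(rows, rng):
    """Shuffle every row independently, like Shuffle(A, I) when A has extra dimensions."""
    return [rng.sample(row, len(row)) for row in rows]


def shuffle_shared(rows, rng):
    """Apply one random permutation of positions to every row,
    like Slice(A, I, Shuffle(@I, I))."""
    n = len(rows[0])
    perm = rng.sample(range(n), n)  # one shuffled list of positions
    return [[row[p] for p in perm] for row in rows]


rng = random.Random(7)
a = [[1, 2, 3, 4], [10, 20, 30, 40]]
independent = shuffle_each_slice(a, rng)  # rows may end up in different orders
shared = shuffle_shared(a, rng)           # both rows are reordered identically
```

In the shared version the rows stay aligned position by position, which matters when the rows hold related quantities for the same elements of «I».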

## The Over parameter

One of the most frequently asked support questions is how to generate independent samples across an index. Some Ph.D. users have reported that they searched at length for how to do this, and that the need to introduce dimensions into the parameters was not at all obvious.

As a way of exposing this functionality more directly, an optional «Over» parameter has been added to many of the built-in distributions. For example,

`Uniform(0, 1, Over: I, J)`

generates independent uniform distributions for each element combination of `I` and `J`. This is equivalent to

`Uniform(Array(I, J, 0),1)`

where the dimensions have been introduced into the parameter, but is a bit clearer and more in line with what people seem to be looking for when they try to figure out on their own how to do this.
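
As a rough mental model of "independent over", the following Python sketch draws one value per combination of index elements. It is an analogy, not Analytica code: `uniform_over` is a hypothetical helper, and a dictionary keyed by index tuples stands in for an array indexed by `I` and `J`.

```python
import itertools
import random


def uniform_over(lo, hi, *indexes, rng):
    """Return an independent uniform draw for every combination of index elements."""
    return {combo: rng.uniform(lo, hi) for combo in itertools.product(*indexes)}


rng = random.Random(42)
I = ["a", "b", "c"]
J = [1, 2]
table = uniform_over(0.0, 1.0, I, J, rng=rng)  # 3 x 2 = 6 independent values
```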

The «Over» parameter can easily be added to user-defined distribution functions as well. Simply add `Over : ... optional atomic` to the function's parameter declaration. The parameter is unreferenced in the function body. The atomic modifier causes Analytica's function evaluator to iterate down to the elements of these dimensions.

«Over» can be thought of as shorthand for "independent over". We recommend using it only as a named parameter, never as a positional parameter. For one, future releases of Analytica might insert new parametric variations into existing functions, and «Over» will tend to always be last (since it is an ellipsis parameter), so positional usage runs the risk of breaking, while the named form is safe. In addition, it loses clarity unless it is specified by name.

Any number of indexes can be listed (subject to Analytica's global 15-dimension maximum). In addition, an array can also be supplied, in which case the dimensions of the array determine the independent dimensions.

The «Over» parameter has only been added to built-in distribution functions that have been converted. These functions are currently:

- Uniform
- Normal
- LogNormal
- Gamma
- Beta
- Poisson
- Geometric
- Hypergeometric
- ChiSquared
- StudentT
- Triangular
- Weibull
- Logistic

The following built-in distributions remain to be converted, and hence do not yet support the «Over» parameter (nor do they support named parameters), but will at some point in the future:

- Bernoulli
- Binomial
- Certain -- although it isn't relevant here
- chanceDist
- cumDist
- Exponential
- Fractiles
- Probdist

The «Over» parameter also exists on the Random function. When using the Random function, it is preferable to use «Over» on Random, rather than on the underlying dist, for example:

`Random(Uniform(0, 1), Over: Region)`

or `Random(Over: Region)`

rather than

`Random(Uniform(0,1, Over: Region))`

The second style is inefficient when other parameters of the Random function contain shared indexes: it can nest an iteration over the same index, which becomes quadratic in time. Sticking to the first style is therefore safer.

## Uniform(*min, max, integer*)

Generates a random sample from a uniform distribution between «min» and «max».

If «integer» is set to `True`, it returns only integers in the range. All parameters are optional. If «min» and «max» are omitted, they default to 0 and 1. The effective declaration is now:

**Uniform**(min: numeric = 0; max: numeric = 1; integer: boolean = false; over: ... optional atomic)

As in all distributions, the optional parameter «Over» specifies an index, so that `Uniform(over: I)` generates a table indexed by `I` of independent random samples for each element of `I`.

## LogNormal(*median, gsdev, mean, stddev, over*)

**LogNormal**(median, gsdev, mean, stddev: optional positive; over: ... optional atomic)

The LogNormal now supports several parametric variations, as well as an over parameter. Specifically, you can specify a LogNormal using any of the following parametrizations:

- **LogNormal**(median: med, gsdev: gs) or just LogNormal(med, gs)
- **LogNormal**(median: med, stddev: sd)
- **LogNormal**(median: med, mean: mu)
- **LogNormal**(mean: mu, stddev: sd)
- **LogNormal**(mean: mu, gsdev: gs)
- **LogNormal**(gsdev: gs, stddev: sd)

If fewer than two parameters are provided, the remainder default to standard LogNormal values (i.e., such that ln(x) is standard normal). If more than two parameters are specified, an error results.
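
The six two-parameter forms are interchangeable because any two of the four quantities determine the underlying normal distribution of ln(x). The Python sketch below shows one way to do that conversion. It assumes the standard textbook definitions (median = exp(mu), gsdev = exp(sigma), mean = exp(mu + sigma^2/2), variance = (exp(sigma^2) - 1) * exp(2*mu + sigma^2)); it is not taken from Analytica's implementation, and the function names are hypothetical.

```python
import math


def lognormal_mu_sigma(median=None, gsdev=None, mean=None, stddev=None):
    """Return (mu, sigma) of ln(x) from exactly two of the four parameters."""
    given = [p for p in (median, gsdev, mean, stddev) if p is not None]
    if len(given) != 2:
        raise ValueError("this sketch expects exactly two parameters")
    if median is not None and gsdev is not None:
        return math.log(median), math.log(gsdev)
    if median is not None and mean is not None:
        # mean = median * exp(sigma^2 / 2)
        return math.log(median), math.sqrt(2.0 * math.log(mean / median))
    if median is not None and stddev is not None:
        # variance = median^2 * w * (w - 1) with w = exp(sigma^2); solve for w
        w = (1.0 + math.sqrt(1.0 + 4.0 * (stddev / median) ** 2)) / 2.0
        return math.log(median), math.sqrt(math.log(w))
    if mean is not None and stddev is not None:
        sigma2 = math.log(1.0 + (stddev / mean) ** 2)
        return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)
    if mean is not None and gsdev is not None:
        sigma = math.log(gsdev)
        return math.log(mean) - sigma ** 2 / 2.0, sigma
    # Remaining case: gsdev and stddev
    sigma = math.log(gsdev)
    w = math.exp(sigma ** 2)
    return 0.5 * math.log(stddev ** 2 / (w * (w - 1.0))), sigma


def lognormal_summary(mu, sigma):
    """Return all four descriptive quantities for a given (mu, sigma)."""
    w = math.exp(sigma ** 2)
    return {
        "median": math.exp(mu),
        "gsdev": math.exp(sigma),
        "mean": math.exp(mu + sigma ** 2 / 2.0),
        "stddev": math.sqrt((w - 1.0) * w) * math.exp(mu),
    }


reference = lognormal_summary(math.log(3.0), math.log(1.5))
```

Feeding any two entries of `reference` back into `lognormal_mu_sigma` should recover the same `(mu, sigma)` pair, up to floating-point error.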

The optional «Over» parameter can be used to specify dimensions over which independent sampling occurs.

## Truncate(ux, *xmin, xmax*)

Returns a distribution with the shape of uncertain quantity «ux», truncated so that it has no values below «xmin» or above «xmax».

In Mid mode, it returns an estimate of the median of the truncated distribution. It always evaluates «ux» probabilistically and «xmin» and «xmax» according to context. The function must be given either «xmin» or «xmax» or both; otherwise, the function will give an evaluation error.

[Lonnie, I suggest that in Mid mode, it should evaluate **ux** in Mid mode and return Max(xmin, Min(Mid(ux), xmax)). Generally, only statistical functions should require probabilistic computation in Mid mode.]

Special cases:

If all values of «ux» ≤ «xmin», it returns a sample = «xmin». Similarly, if all values of «ux» ≥ «xmax», it returns a sample = «xmax».

It flags an evaluation error if «xmin» > «xmax».

Truncate() "semi-preserves" the rank order of sample «ux»: given `Y = Truncate(X, xmin)`, then `X[Run = i] < X[Run = j] ==> Y[Run = i] ≤ Y[Run = j]`. Hence, if all values of `X` are unique, the ranks are preserved; that is, the ranks of sample `Y` correspond to the ranks of sample `X`. If `X` contains some repeated values, the ranks may not all be preserved.
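
The semi-preservation property can be stated as a small executable check. The Python sketch below tests the implication directly for every pair of runs. Note the assumption: the clamp `min(max(v, xmin), xmax)` is used only as a stand-in monotone transform to exercise the check; it is not Truncate's actual algorithm, which this page does not describe.

```python
def semi_preserves_rank(x, y):
    """Return True if x[i] < x[j] implies y[i] <= y[j] for every pair of runs."""
    n = len(x)
    return all(
        y[i] <= y[j]
        for i in range(n)
        for j in range(n)
        if x[i] < x[j]
    )


x = [0.2, 1.5, -0.7, 3.1, 1.5]  # sample with one repeated value
xmin, xmax = 0.0, 2.0

# Stand-in transform (assumption): clamp each value into [xmin, xmax].
y_clamped = [min(max(v, xmin), xmax) for v in x]

# A transform that reverses the order violates the property.
y_reversed = [-v for v in x]

ok = semi_preserves_rank(x, y_clamped)
broken = semi_preserves_rank(x, y_reversed)
```

The check uses a strict inequality on `x` and a non-strict one on `y`, so ties created in `y` (for example, several values landing on a bound) do not count as violations.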

**Truncate**(ux: numeric sample; xmin, xmax: optional scalar)

## History

Introduced in Analytica 4.0.
