Log-normal distribution

Revision as of 15:54, 5 August 2009 by Lchrisman

LogNormal(median, gsdev)

Generates a sample from a lognormal distribution with the given median and gsdev (geometric standard deviation). The logarithm of a lognormal random variable has a normal distribution.

A normal distribution is symmetric around its mean: If x := Normal(mean, sdev), then P(x <= mean - sdev) = P(x >= mean + sdev) ≈ 0.16. Analogously, a lognormal distribution is ratio-symmetric around its median: If y := LogNormal(median, gsdev), then P(y <= median/gsdev) = P(y >= median*gsdev) ≈ 0.16.
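
The ratio-symmetry can be checked numerically. The following is a minimal sketch in Python rather than Analytica; the helper name lognormal_cdf and the values 10 and 2 are illustrative assumptions, not part of the LogNormal function itself. It relies only on the fact that ln(y) is normal with mean ln(median) and standard deviation ln(gsdev).

```python
import math
from statistics import NormalDist


def lognormal_cdf(y, median, gsdev):
    """P(Y <= y) for a lognormal with the given median and geometric sd."""
    # ln(Y) is normal with mean ln(median) and standard deviation ln(gsdev).
    log_dist = NormalDist(mu=math.log(median), sigma=math.log(gsdev))
    return log_dist.cdf(math.log(y))


median, gsdev = 10.0, 2.0  # illustrative values only

# Probability of falling below median/gsdev and above median*gsdev.
lower_tail = lognormal_cdf(median / gsdev, median, gsdev)
upper_tail = 1.0 - lognormal_cdf(median * gsdev, median, gsdev)

print(round(lower_tail, 4), round(upper_tail, 4))
```

Both tails equal the standard normal probability of lying more than one standard deviation below the mean, about 0.1587, whatever median and gsdev are chosen.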

A lognormal distribution can be described by four parameters: median, gsdev (geometric standard deviation), mean, and stddev (standard deviation). You can specify any two of them, which is sufficient to determine the other two.

LogNormal(median: med, gsdev: gs)  or just LogNormal(med, gs)
LogNormal(median: med, stddev: sd)
LogNormal(median: med, mean: mu)
LogNormal(mean: mu, stddev: sd)
LogNormal(mean: mu, gsdev: gs)
LogNormal(gsdev: gs, stddev: sd)

If you specify more than two parameters, an error results. If you specify no parameters, the function defaults to the standard lognormal, i.e., the distribution whose natural logarithm is a unit normal with mean 0 and standard deviation 1 (equivalently, median = 1 and gsdev = e).
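
To see why two parameters are enough, it may help to write out the standard relationships between them. The sketch below is in Python, not Analytica, and the function names are hypothetical; it does not claim to show how LogNormal performs the conversion internally. With sigma = ln(gsdev), the mean is median * exp(sigma^2 / 2) and the standard deviation is mean * sqrt(exp(sigma^2) - 1).

```python
import math


def from_median_gsdev(median, gsdev):
    """Return (mean, stddev) of a lognormal given its median and gsdev."""
    sigma = math.log(gsdev)  # standard deviation of ln(Y)
    mean = median * math.exp(sigma ** 2 / 2)
    stddev = mean * math.sqrt(math.exp(sigma ** 2) - 1)
    return mean, stddev


def from_mean_stddev(mean, stddev):
    """Return (median, gsdev) of a lognormal given its mean and stddev."""
    # Variance of ln(Y), recovered from the coefficient of variation.
    sigma_sq = math.log(1 + (stddev / mean) ** 2)
    median = mean / math.exp(sigma_sq / 2)
    gsdev = math.exp(math.sqrt(sigma_sq))
    return median, gsdev


# Round trip with illustrative values: (median, gsdev) -> (mean, stddev) -> back.
mean, stddev = from_median_gsdev(10.0, 2.0)
median_back, gsdev_back = from_mean_stddev(mean, stddev)
print(mean, stddev, median_back, gsdev_back)
```

The remaining pairings (for example median with mean, or gsdev with stddev) follow from the same two equations by solving for the unknowns.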

Like other distributions, you can also give one or more Over: indexes. These cause it to generate an array of independent lognormal distributions over the specified index(es). For example,

 LogNormal(m, gsd, Over: i)


LogNormal(median, gsdev, mean, stddev: Optional Positive; over: ... Optional Atom)

Parameter Estimation

Suppose X contains sampled historical data indexed by I, and consisting solely of positive values. To estimate the parameters of the best-fit LogNormal distribution, the following parameter estimation formulae can be used:

median := Median(X, I)
or, median := Exp(Mean(Ln(X), I))
gsdev := Exp(SDeviation(Ln(X), I))
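
For readers who want to try the estimation outside Analytica, here is a rough Python equivalent. It is a sketch under assumptions: fit_lognormal is a hypothetical helper, the data are synthetic, and statistics.stdev (the sample standard deviation) is assumed to correspond to SDeviation.

```python
import math
import random
import statistics


def fit_lognormal(xs):
    """Estimate (median, gsdev) from strictly positive samples."""
    logs = [math.log(x) for x in xs]  # raises ValueError if any x <= 0
    median = math.exp(statistics.mean(logs))  # geometric mean of the data
    gsdev = math.exp(statistics.stdev(logs))  # sample sd of the logs
    return median, gsdev


# Synthetic data drawn from a known lognormal, for illustration only.
random.seed(0)
true_median, true_gsdev = 5.0, 1.5
data = [
    random.lognormvariate(math.log(true_median), math.log(true_gsdev))
    for _ in range(10_000)
]

est_median, est_gsdev = fit_lognormal(data)
print(est_median, est_gsdev)
```

With a sample this large, the estimates should land close to the values used to generate the data.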

A more general form, with one extra degree of freedom, is the LogNormal with an offset:

LogNormal(median,gsdev) - offset

The more general form can be adapted to data sets containing negative numbers. The offset is constrained so that

offset > -Min(X,I)

To my knowledge, no closed-form formula for offset exists, so finding the optimal value of offset requires a 1-D search or optimization. However, I have found that the following heuristic estimation formulae come extremely close to the best-fit parameters with offset:

offset := -Min(X,I) + 2*(Median(X,I) - Min(X,I)) / Sum(1,I)
median := Median(X+offset,I)
gsdev  := Exp(SDeviation(Ln(X+offset),I))
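
The heuristic can be transcribed into Python as follows. This is only a sketch: fit_offset_lognormal is a hypothetical name, the seven data points are made up for illustration, and len(xs) plays the role of Sum(1,I). It assumes the sample median is strictly greater than the sample minimum, since otherwise the smallest shifted value would be zero and its logarithm undefined.

```python
import math
import statistics


def fit_offset_lognormal(xs):
    """Heuristic (offset, median, gsdev) so that xs + offset looks lognormal."""
    n = len(xs)  # number of samples, i.e. Sum(1, I)
    x_min = min(xs)
    x_med = statistics.median(xs)
    # Shift just past the minimum so every shifted value is positive.
    offset = -x_min + 2 * (x_med - x_min) / n
    shifted = [x + offset for x in xs]
    median = statistics.median(shifted)
    gsdev = math.exp(statistics.stdev([math.log(s) for s in shifted]))
    return offset, median, gsdev


# Made-up data containing negative values, for illustration only.
data = [-1.2, -0.5, 0.3, 0.9, 1.8, 3.5, 7.0]
offset, median, gsdev = fit_offset_lognormal(data)
print(offset, median, gsdev)
```

For this data the offset works out to 1.2 + 2 * 2.1 / 7 = 1.8, which satisfies the constraint offset > -Min(X,I).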


