Difference between revisions of "Log-normal distribution"

(categories)
(Moved all 4 functions to this page)
Line 2: Line 2:
 
[[category:Semi-bounded distributions]]
 
[[category:Semi-bounded distributions]]
 
[[category:Unimodal distributions]]
 
[[category:Unimodal distributions]]
 
 
[[Category:Doc Status C]] <!-- For Lumina use, do not change -->
 
[[Category:Doc Status C]] <!-- For Lumina use, do not change -->
 
 
[[Category:Ana: Status R]]  <!-- For Lumina use, do not change -->
 
[[Category:Ana: Status R]]  <!-- For Lumina use, do not change -->
 +
{{ReleaseBar}}
 +
 +
A [[Log-normal distribution]] is a [[:category:Continuous distributions|continuous distribution]] whose [[Ln|logarithm]] is [[Normal distribution|normally distributed]]. In other words, <code>[[Ln]](x)</code> has a [[Normal distribution]] when <code>x</code> has a log-normal distribution.
 +
 +
<center><code>LogNormal(median:3,stddev:2)</code> &rarr; [[image:LogNormal(median=3,stddev=2).png]]</center>
 +
 +
[[Log-normal distribution]]s are useful for many quantities that are always positive and have long upper tails, such as concentration of a pollutant, or amount of rainfall.  The distribution is [[:category:Semi-bounded distributions|semi-bounded]] (positive-only) and [[:category:Unimodal distributions|unimodal]], and often has a long right tail.
 +
 +
By the central limit theorem applied to logarithms, the product of a long series of independent and identically distributed positive random variables is approximately log-normal, provided the logarithm of each factor has finite variance.
 +
 +
== Functions ==
 +
 +
Specify the log-normal by giving any two of the following four parameters.
 +
; median
 +
The [[Median]], must be >0.
 +
; gsdev
 +
The geometric standard deviation, must be >=1.
 +
; mean
 +
The arithmetic [[Mean]], >0
 +
; stddev
 +
The arithmetic [[SDeviation|standard deviation]], >=0.
 +
 +
A named-parameter convention is recommended, such as:
 +
:<code>LogNormal( gsdev:1.5, mean: 4 )</code>
 +
 +
=== LogNormal(''median, gsdev, mean, stddev, over'') ===
 +
The distribution function. Use this to specify that a chance variable or uncertain quantity is log-normally distributed. You must specify exactly two of the core parameters.
 +
 +
To create independent and identically distributed log-normal distributions along one or more indexes, specify those indexes using the optional «over» parameter.
  
== LogNormal(median, gsdev, mean, stddev) ==
 
 
Generates a sample with a lognormal distribution given «median» and «gsdev» (geometric standard deviation),  or «mean» and «stddev» (standard deviation).
 
Generates a sample with a lognormal distribution given «median» and «gsdev» (geometric standard deviation),  or «mean» and «stddev» (standard deviation).
  
The logarithm of a lognormal random variable has a normal distribution. Lognormal distributions are useful for many quantities that are always positive and have long upper tails, such as concentration of a pollutant, or amount of rainfall.
+
=== <div id="DensLogNormal">Dens{{Release||4.6|_}}LogNormal( x'', median, gsdev, mean, stddev'' )</div> ===
 +
{{Release||4.6|To use this, you need to add the [[Distribution Densities Library]] to your model. }}
 +
 
 +
The analytic probability density function. Returns the probability density at «x». Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
 +
 
 +
=== <div id="CumLogNormal">CumLogNormal( x'', median, gsdev, mean, stddev'' )</div> ===
 +
{{Release||4.6|To use this, you need to add the [[Distribution Densities Library]] to your model. }}
 +
 
 +
The analytic cumulative distribution function. Returns the probability that the outcome is less than or equal to «x».
 +
 
 +
Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
 +
 
 +
=== <div id="CumLogNormalInv">CumLogNormalInv( p'', median, gsdev, mean, stddev'' )</div> ===
 +
{{Release||4.6|To use this, you need to add the [[Distribution Densities Library]] to your model. }}
 +
 
 +
The inverse cumulative distribution function (aka quantile function). Returns the «p»<sup>th</sup> fractile/quantile/percentile.
 +
 
 +
Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
 +
 
 +
== Statistics ==
  
 +
== Examples ==
 
A [[Normal]] distribution is symmetric around its [[mean]]:  
 
A [[Normal]] distribution is symmetric around its [[mean]]:  
 
:If <code>x := Normal(mean, sdev)</code>, then <code>P(x <= mean - sdev) = P(x >= mean + sdev) ≈ 0.16</code>.  
 
:If <code>x := Normal(mean, sdev)</code>, then <code>P(x <= mean - sdev) = P(x >= mean + sdev) ≈ 0.16</code>.  
Line 53: Line 99:
  
 
== See Also ==
 
== See Also ==
* [[Dens_LogNormal]]
+
* [[Normal distribution]]
* [[CumLogNormal]]
+
* [[Gamma distribution]]  -- similar shaped distribution  
* [[Normal]]
+
* [[Gamma]]  -- similar shaped distribution  
+
 
* [[Parametric continuous distributions]]
 
* [[Parametric continuous distributions]]
 
* [[Distribution Functions]]
 
* [[Distribution Functions]]

Revision as of 00:08, 11 October 2018





A Log-normal distribution is a continuous distribution whose logarithm is normally distributed. In other words, Ln(x) has a Normal distribution when x has a log-normal distribution.

LogNormal(median:3,stddev:2) → [image: LogNormal(median=3,stddev=2).png]

Log-normal distributions are useful for many quantities that are always positive and have long upper tails, such as concentration of a pollutant, or amount of rainfall. The distribution is semi-bounded (positive-only) and unimodal, and often has a long right tail.

By the central limit theorem applied to logarithms, the product of a long series of independent and identically distributed positive random variables is approximately log-normal, provided the logarithm of each factor has finite variance.
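One way to see this is to simulate a product of many positive factors and check that its logarithm behaves like a normal variable. The sketch below is in Python rather than Analytica. The uniform factor distribution, the sample sizes, and the helper name `log_product` are illustrative assumptions.

```python
import math
import random
import statistics

def log_product(n_factors, rng):
    """Return ln of a product of n_factors i.i.d. Uniform(0.5, 1.5) draws."""
    # Summing logs avoids underflow when many factors are multiplied.
    return sum(math.log(rng.uniform(0.5, 1.5)) for _ in range(n_factors))

rng = random.Random(0)  # fixed seed so the run is repeatable
logs = [log_product(200, rng) for _ in range(2000)]

mu = statistics.fmean(logs)
sigma = statistics.stdev(logs)

# For a normal variable, roughly 68% of values fall within one
# standard deviation of the mean.
within_one_sd = sum(abs(v - mu) <= sigma for v in logs) / len(logs)
print(round(within_one_sd, 2))
```

If the logarithm of the product is close to normal, the printed fraction should land near 0.68.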

Functions

Specify the log-normal by giving any two of the following four parameters.

median

The Median, must be >0.

gsdev

The geometric standard deviation, must be >=1.

mean

The arithmetic Mean, >0

stddev

The arithmetic standard deviation, >=0.

A named-parameter convention is recommended, such as:

LogNormal( gsdev:1.5, mean: 4 )
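The four parameters are tied together through the mean and standard deviation of Ln(x), which is why any two of them determine the other two. The standard relations are sketched below. The symbols μ and σ are notation introduced here for illustration and are not parameters of the function.

```latex
\begin{aligned}
\mu &= \mathrm{E}[\ln x], \qquad \sigma = \mathrm{SD}[\ln x] \\
\text{median} &= e^{\mu} \\
\text{gsdev} &= e^{\sigma} \\
\text{mean} &= e^{\mu + \sigma^{2}/2} \\
\text{stddev} &= \text{mean}\,\sqrt{e^{\sigma^{2}} - 1}
\end{aligned}
```

For the example above, gsdev 1.5 gives σ = ln 1.5 ≈ 0.405, so median = mean / e^(σ²/2) ≈ 4 / 1.086 ≈ 3.68.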

LogNormal(median, gsdev, mean, stddev, over)

The distribution function. Use this to specify that a chance variable or uncertain quantity is log-normally distributed. You must specify exactly two of the core parameters.

To create independent and identically distributed log-normal distributions along one or more indexes, specify those indexes using the optional «over» parameter.

Generates a sample with a lognormal distribution given «median» and «gsdev» (geometric standard deviation), or «mean» and «stddev» (standard deviation).

DensLogNormal( x, median, gsdev, mean, stddev )

The analytic probability density function. Returns the probability density at «x». Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
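For readers who want to check values by hand, the textbook log-normal density can be sketched in Python as follows. This is not the library's implementation. The function name `dens_lognormal` is hypothetical, and it accepts only the «median» and «gsdev» pair for brevity.

```python
import math

def dens_lognormal(x, median, gsdev):
    """Textbook log-normal density, parameterized by median and gsdev."""
    if x <= 0:
        return 0.0  # the distribution has no mass at or below zero
    mu = math.log(median)    # mean of ln(x)
    sigma = math.log(gsdev)  # std. dev. of ln(x); needs gsdev > 1
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

# Density of the standard log-normal (median 1, gsdev e) at x = 1.
print(dens_lognormal(1.0, median=1.0, gsdev=math.e))
```

At x = 1 the standard log-normal density equals 1/sqrt(2π), which gives a quick sanity check.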

CumLogNormal( x, median, gsdev, mean, stddev )

The analytic cumulative distribution function. Returns the probability that the outcome is less than or equal to «x».

Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
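The cumulative probability can be written in terms of the standard normal cumulative distribution function applied to Ln(x). A minimal Python sketch, assuming the «median» and «gsdev» parameterization and a hypothetical function name `cum_lognormal`:

```python
import math

def cum_lognormal(x, median, gsdev):
    """P(outcome <= x) for a log-normal given its median and gsdev."""
    if x <= 0:
        return 0.0  # nothing lies at or below zero
    # Standardize ln(x); needs gsdev > 1 so the divisor is nonzero.
    z = (math.log(x) - math.log(median)) / math.log(gsdev)
    # Standard normal CDF expressed with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Half of the probability lies below the median.
print(cum_lognormal(3.0, median=3.0, gsdev=2.0))
```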

CumLogNormalInv( p, median, gsdev, mean, stddev )

The inverse cumulative distribution function (aka quantile function). Returns the «p»th fractile/quantile/percentile.

Exactly two of the parameters «median», «gsdev», «mean», or «stddev» must be provided.
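Because Ln(x) is normal, a quantile of the log-normal is the median multiplied by gsdev raised to the matching standard normal quantile. The Python sketch below illustrates this. The name `cum_lognormal_inv` is hypothetical, and treating «p» as a probability strictly between 0 and 1 is an assumption of this sketch.

```python
from statistics import NormalDist

def cum_lognormal_inv(p, median, gsdev):
    """Quantile of a log-normal given its median and gsdev (0 < p < 1)."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    # exp(mu + sigma*z) with mu = ln(median) and sigma = ln(gsdev)
    return median * gsdev ** z

# The 50% quantile is the median itself.
print(cum_lognormal_inv(0.5, median=3.0, gsdev=2.0))
```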

Statistics

Examples

A Normal distribution is symmetric around its mean:

If x := Normal(mean, sdev), then P(x <= mean - sdev) = P(x >= mean + sdev) ≈ 0.16.

Analogously, a lognormal distribution is ratio-symmetric around its median:

If y := LogNormal(median, gsdev), then P(y <= median/gsdev) = P(y >= median*gsdev) ≈ 0.16.
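The tail probability follows from the definition, since taking logarithms turns the ratio bounds into one-standard-deviation bounds on a normal variable. In the sketch below, μ = Ln(median), σ = Ln(gsdev), and Φ (the standard normal cumulative distribution function) are notation assumed for illustration.

```latex
\begin{aligned}
P\!\left(y \le \tfrac{\text{median}}{\text{gsdev}}\right)
  &= P(\ln y \le \mu - \sigma) = \Phi(-1) \approx 0.1587 \\
P\!\left(y \ge \text{median}\cdot\text{gsdev}\right)
  &= P(\ln y \ge \mu + \sigma) = 1 - \Phi(1) = \Phi(-1) \approx 0.1587
\end{aligned}
```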

If you specify no parameters, it defaults to the standard lognormal, i.e., the distribution whose natural logarithm is a unit normal with mean 0 and standard deviation 1.

You can specify any two of the four parameters, and the function computes the other two from them:

LogNormal(median: med, gsdev: gs) or just LogNormal(med, gs)
LogNormal(median: med, stddev: sd)
LogNormal(median: med, mean: mu)
LogNormal(mean: mu, stddev: s)
LogNormal(mean: mu, gsdev: gs)
LogNormal(gsdev: gs, stddev: sd)
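How each pair pins down the distribution can be sketched in Python by solving for the mean and standard deviation of Ln(x). This is an illustration of the standard algebra, not the library's code. The names `lognormal_mu_sigma` and `all_four` are hypothetical, and the no-parameter default is not handled.

```python
import math

def lognormal_mu_sigma(median=None, gsdev=None, mean=None, stddev=None):
    """Return (mu, sigma) of ln(x) from exactly two of the four parameters."""
    args = dict(median=median, gsdev=gsdev, mean=mean, stddev=stddev)
    given = {k for k, v in args.items() if v is not None}
    if len(given) != 2:
        raise ValueError("specify exactly two parameters")

    if "gsdev" in given:
        sigma = math.log(gsdev)
        if "median" in given:
            mu = math.log(median)
        elif "mean" in given:
            mu = math.log(mean) - sigma**2 / 2
        else:  # gsdev and stddev; needs gsdev > 1 so the divisor is nonzero
            mu = math.log(stddev / math.sqrt(math.expm1(sigma**2))) - sigma**2 / 2
    elif given == {"median", "mean"}:
        # mean / median = exp(sigma^2 / 2), so mean must be >= median
        mu = math.log(median)
        sigma = math.sqrt(2 * math.log(mean / median))
    elif given == {"median", "stddev"}:
        # w = exp(sigma^2) solves w^2 - w - (stddev / median)^2 = 0
        mu = math.log(median)
        w = (1 + math.sqrt(1 + 4 * (stddev / median) ** 2)) / 2
        sigma = math.sqrt(math.log(w))
    else:  # mean and stddev
        sigma = math.sqrt(math.log1p((stddev / mean) ** 2))
        mu = math.log(mean) - sigma**2 / 2
    return mu, sigma

def all_four(mu, sigma):
    """Return (median, gsdev, mean, stddev) implied by mu and sigma."""
    mean = math.exp(mu + sigma**2 / 2)
    stddev = mean * math.sqrt(math.expm1(sigma**2))
    return math.exp(mu), math.exp(sigma), mean, stddev

print(all_four(*lognormal_mu_sigma(median=3, stddev=2)))
```

Feeding any pair taken from the printed tuple back into `lognormal_mu_sigma` should recover the same μ and σ, which is a convenient consistency check.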

If you specify more than two parameters, it will give an error.

As with other distributions, you can also give one or more «Over» indexes. These cause the function to generate an array of independent lognormal distributions over the specified index(es). For example,

LogNormal(m, gsd, Over: i)
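As a rough analogy in Python, the sketch below makes one independent draw per element of an index. The dictionary keyed by index label and the name `lognormal_over` are illustrative assumptions. An actual model would hold a full sample for each element rather than a single draw.

```python
import math
import random

def lognormal_over(median, gsdev, index, rng):
    """Return one independent log-normal draw for each label in index."""
    mu, sigma = math.log(median), math.log(gsdev)
    # Each call to lognormvariate is an independent draw.
    return {label: rng.lognormvariate(mu, sigma) for label in index}

rng = random.Random(42)  # fixed seed so the run is repeatable
draws = lognormal_over(3.0, 2.0, ["a", "b", "c"], rng)
print(draws)
```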

Syntax:

LogNormal(median, gsdev, mean, stddev: Optional Positive; over: ... Optional Atom)

Parameter Estimation

Suppose X contains sampled historical data indexed by I, and consisting solely of positive values. To estimate the parameters of the best-fit LogNormal distribution, the following parameter estimation formulae can be used:

«median» := Median(X, I) or Exp(Mean(Ln(X), I))
«gsdev» := Exp(SDeviation(Ln(X), I))
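The same estimates can be reproduced outside the model. The Python sketch below uses the log-mean form of the median estimate and a sample (n-1) standard deviation. The choice of the n-1 form, the function name `fit_lognormal`, and the sample data are assumptions made for illustration.

```python
import math
import statistics

def fit_lognormal(data):
    """Estimate (median, gsdev) from strictly positive data."""
    logs = [math.log(v) for v in data]  # raises ValueError if any v <= 0
    median = math.exp(statistics.fmean(logs))  # Exp(Mean(Ln(X)))
    gsdev = math.exp(statistics.stdev(logs))   # Exp(SDeviation(Ln(X)))
    return median, gsdev

sample = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up data, evenly spaced in log
est_median, est_gsdev = fit_lognormal(sample)
print(est_median, est_gsdev)
```

For this sample the logarithms are evenly spaced around Ln(4), so the estimated median is 4.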

A more general form, with one extra degree of freedom, is the LogNormal with an offset:

LogNormal(median, gsdev) - offset

The more general form can be adapted to data sets containing negative numbers. The offset is constrained so that

offset > -Min(X, I)

To my knowledge, no closed-form formula for offset exists, so finding the optimal value of offset requires a 1-D search or optimization. However, I have found that the following heuristic estimation formulae come extremely close to the best-fit parameters with offset:

offset := -Min(X, I) + 2*(Median(X, I) - Min(X, I))/Sum(1, I)
median := Median(X + offset, I)
gsdev  := Exp(SDeviation(Ln(X + offset), I))
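The heuristic above translates directly into Python. The sketch below mirrors the three formulae. The function name `fit_offset_lognormal`, the sample data, and the use of a sample (n-1) standard deviation are assumptions. Note that it fails if the median of the data equals its minimum, because the shifted minimum is then zero.

```python
import math
import statistics

def fit_offset_lognormal(data):
    """Heuristic (offset, median, gsdev) so that data + offset is log-normal."""
    n = len(data)  # plays the role of Sum(1, I)
    lo = min(data)
    offset = -lo + 2 * (statistics.median(data) - lo) / n
    shifted = [v + offset for v in data]  # strictly positive if median > min
    median = statistics.median(shifted)
    gsdev = math.exp(statistics.stdev([math.log(v) for v in shifted]))
    return offset, median, gsdev

sample = [-3.0, -1.0, 0.0, 2.0, 10.0]  # made-up data with negative values
offset, median, gsdev = fit_offset_lognormal(sample)
print(offset, median, gsdev)
```

The fitted distribution is then LogNormal(median, gsdev) - offset, matching the form given earlier.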

See Also
