User:Lchrisman/JQPN Cowlick

< User:Lchrisman
Revision as of 23:54, 31 December 2019 by Lchrisman (Talk | contribs)

This page has notes by Lonnie on an oddity that I've named the "Cowlick" of the fully-bounded Johnson QPN distribution from the paper:

This oddity impacts the implementation of DensUncertainLMH, where I was hoping to use their algorithm but ran into this glitch. Other than this, I am enthusiastic about their paper and am hoping to use their algorithm. But to be usable as a robust built-in function in Analytica, I feel like I need to find a solution to this problem first. So this page contains my notes.

The cowlick refers to an artifact in their fully-bounded distribution in which the density goes to infinity at the upper bound when n=1 (when B < ave(L,H)), or goes to infinity at the lower bound when n=-1 (when B>ave(L,H)). This is an undesirable trait, and causes there to be what is essentially a step in the CDF at the upper bound, as if the continuous part of the CDF doesn't really go to the upper bound. It isn't a step in a strict sense (it is continuous), but it flicks up so fast at the end that it acts like a step.

The figure here is the cowlick in the illustrative example from their paper, where you can see the density jumping to infinity at the upper bound. Cowlick graph.png This graph is the same as the graph that appears in their paper as Figure 7b, where you can make out this phenomena if you look carefully. Here is their graph with my arrow pointing out the start of the cowlick. Hadlock bickel fig7b.png

The CDF for the distribution parameterized by (-11, -10, -9.99, -9, -8) is shown next, where the cowlick causes a "step" to occur in the CDF at the upper bound. It isn't quite the smooth CDF I had hoped for. Cowlick CDF step.png

The paper doesn't acknowledge this oddity, but it can be detected in their Figure 7b at the upper bound. The oddity does not happen with the unbounded and semi-bounded cases.

Cowlick ubiquity

Although the cowlick was quite obvious after I had implemented the algorithm, I found it to be non-trivial to prove that it happens. Here I'll prove that the cowlick always happens except in the very special and unusual case where $ B = (L+H)/2 $. L, B and H are the probit-transformed values for the 10th, 50th and 95th percentiles as defined in the paper. Loosely speaking, this is the case where the median is closer to the 10th percentile than to the 90th percentile.

Theorem: In the fully-bounded Hadlock & Bickel JQPN distribution, when $ B < (L+H)/2 $, the probability density $ f_B $ at the upper bound $ u $ approaches infinity -- i.e.,
$ \lim_{x\rarr u} f_B(x) = \infty $
and when $ B>(L+H)/2 $, the probability density at the lower bound $ l $ approaches infinity
$ \lim_{x\rarr l} f_B(x) = \infty $
where $ f_B(x) = {{\partial F_B(x)}\over{\partial x}} $ and $ F_B(x) $ is the cumulative probability function given in Eq (8) on Page 42 of the paper.

Proof of the theorem

In the paper, the quantile function $ Q_B(p) $ is the inverse of the cumulative probability function $ F_B(x) $, where $ x=l $ corresponds to $ p=0 $ and $ x=u $ occurs when $ p=1 $. Hence, the derivative of $ F_B(x) $ is the reciprocal of $ Q_B(p) $ at the corresponding $ p $. Since the equations are a little simpler for the derivative of $ Q_B $, what I'll actually be showing is that when $ B<(H+L)/2 $

$ \lim_{p->1} {{\partial Q_B(p)}\over{\partial p}} = 0 $

which, one shown, will establish that $ f_B(u) $ approaches $ \infty $.

$ Q_B $ is defined in Eq (7) of the paper as

$ Q_B(p) = l + (u-l) \Phi\left( \xi+ \lambda \sinh( \delta (\Phi^{-1}(p) + n c)) \right) $

where $ l, u $ are the specified lower and upper bounds for the bounded distribution, $ \Phi(x) $ is the standard normal cumulative probability function, and $ \xi, \lambda, \delta, n $ and $ c $ are defined in the paper as a function of the distribution parameters.

From their definitions, $ c>0, \delta>0, \lambda>0 $ always. When $ B<(L+H)/2 $ as assumed by the theorem, by definition $ n=1 $. The proof does not require more precision that this. (For the proof of the lower bound when $ B>(L+H)/2 $, $ n=-1 $).

To simplify notation, I make this variable substitution:

$ z = \delta( \Phi^{-1}(p) + n c) $


$ Q_B(z) = l + (u-l) \Phi\left( \xi+ \lambda \sinh( z ) \right) $


$ {{\partial Q_B(p)}\over{\partial p}} = {{\partial Q_B(z)}\over{\partial z}} {{\partial z}\over{\partial p}} = {{\partial Q_B(z)}\over{\partial z}} \sqrt{2\pi} \delta e^{{1\over 2} \Phi^{-1}(p)^2 } = {{\partial Q_B(z)}\over{\partial z}} \sqrt{2\pi} \delta e^{{1\over 2}(z/\delta - n c)^2} $

where $ x\rarr u $ corresponds to $ p\rarr 1 $ and $ z \rarr \infty $, so

$ \lim_{p->1} {{\partial Q_B(p)}\over{\partial p}} = \lim_{z\rarr \infty} \sqrt{2\pi} \delta e^{{1\over 2} (z/\delta - n c)^2} {{\partial Q_B(z)}\over{\partial z}} $


$ \lim_{p->} {{\partial Q_B(p)}\over{\partial p}} = \lim_{z\rarr \infty} \sqrt{2\pi} \delta e^{{1\over 2}(z/\delta - n c)^2} {{\partial Q_B(z)}\over{\partial z}} $

and we need to show that these approach 0.

Take the derivative (I used Wolfram Alpha's online derivative calculator):

$ {{\partial Q_B(z)}\over{\partial z}} = {{(u-l) \lambda}\over{2 \sqrt{2 \pi}}} \left( e^z + e^{-z} \right) e^{-{1\over 2} \left( {1\over 2} \lambda \left(e^z - e^{-z}\right) + \xi\right)^2} $

Expand the limit

$ \begin{array}{rcl} \lim_{p\rarr 1} {{\partial Q_B(p)}\over{\partial p}} &=& \lim_{z\rarr \infty} \sqrt{2\pi} \delta e^{{1\over 2}(z/\delta- n c)^2} {{\partial Q_B(z)}\over{\partial z}} \\ &=& \lim_{z\rarr \infty} \sqrt{2\pi} \delta e^{{1\over 2}(z/\delta - n c)^2} {{(u-l) \lambda}\over{2 \sqrt{2 \pi}}} \left( e^z + e^{-z} \right) e^{-{1\over 2} \left( {1\over 2} \lambda \left(e^z - e^{-z}\right) + \xi\right)^2} \\ &=& {1\over 2} (u-l)\delta \lambda \left( \lim_{z\rarr \infty} e^z + e^{-z} \right) \left( \lim_{z\rarr \infty} e^{{1\over 2} (z/\delta- n c)^2} \right) \left( \lim_{z\rarr \infty} e^{-{1\over 2} \left( {1\over 2} \lambda \left( e^z - e^{-z}\right) + \xi\right)^2} \right) \\ &=& {1\over 2} (u-l)\delta \lambda \left( \lim_{z\rarr \infty} e^z \right) \left( \lim_{z\rarr \infty} e^{{1\over 2} \delta^{-2} z^2} \right) \left( \lim_{z\rarr \infty} e^{-{1\over 4} \lambda^2 {e^z}^2} \right) \\ &=& {1\over 2} (u-l)\delta \lambda \lim_{z\rarr \infty} e^z e^{{1\over 2} \delta^{-2} z^2} e^{-{1\over 4} \lambda^2 e^{z^2}} \\ &=& 0 \end{array} $

Last line follows because the $ e^{-{1\over4} \lambda^2 e^{z^2}} $ term goes to zero and dominates both $ e^z $ and $ e^{{1\over 2}\delta^{-2} z^2} $.

Note: I did not use the fact that $ n=1 $, so the same proof works for $ z \rarr -\infty $, which if correct would mean that the cowlick appears on both bounds always. However, this is not what I observe. So something is not quite right yet.