## Digest: Neyman-Pearson Lemma

### October 29, 2008

We all know the Neyman-Pearson lemma; Lehmann & Romano (2005) provide a nice illustration of it in the spirit of value per dollar or miles per hour:
for a fixed type-I error $\alpha = \int_{\Omega_{H_0}}\phi(x)\,d\mathscr P_\theta(x)$,
we want to maximize the power (one minus the type-II error)
$\beta = \int_{\Omega_{H_0^c}}\phi(x)\,d\mathscr P_\theta(x),$
where $\phi(x)$ denotes the probability that the test rejects $H_0$ when $x$ is observed (the critical function of the test).
Thus, loosely speaking, what we want to make large is the ratio
$r = \frac{\beta}{\alpha} = \frac{\int_{\Omega_{H_0^c}}\phi(x)\,d\mathscr P_\theta(x)}{\int_{\Omega_{H_0}}\phi(x)\,d\mathscr P_\theta(x)}.$
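A quick numerical illustration (a sketch with made-up pmfs on a five-point sample space, not from the post): among all non-randomized rejection regions of a given size $\alpha$, the most powerful one consists of exactly the points with the largest likelihood ratio, as the lemma prescribes.

```python
import itertools

# Hypothetical pmfs for two simple hypotheses on a 5-point sample space.
p0 = [0.35, 0.25, 0.20, 0.15, 0.05]  # under H0
p1 = [0.05, 0.10, 0.20, 0.30, 0.35]  # under H1

alpha = 0.20  # fixed type-I error

def size(region):   # type-I error of a rejection region
    return sum(p0[i] for i in region)

def power(region):  # 1 - type-II error
    return sum(p1[i] for i in region)

# All non-randomized rejection regions of size exactly alpha.
regions = [r for k in range(6)
           for r in itertools.combinations(range(5), k)
           if abs(size(r) - alpha) < 1e-9]

best = max(regions, key=power)  # most powerful region of size alpha

# Neyman-Pearson: the best region keeps the points with the
# largest likelihood ratios p1(x)/p0(x).
ratios = [p1[i] / p0[i] for i in range(5)]
np_region = tuple(sorted(sorted(range(5), key=lambda i: -ratios[i])[:len(best)]))
```

Here the two size-0.20 regions are $\{2\}$ and $\{3,4\}$; the latter has the larger power and is precisely the high-likelihood-ratio region.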

## Yet another Linear model post 20081022

### October 22, 2008

For symmetric $\mathbf A$ and $\mathbf B$, write $\boldsymbol c = (\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)$. Then
$(\boldsymbol x - \boldsymbol a)'\mathbf A(\boldsymbol x - \boldsymbol a) + (\boldsymbol x - \boldsymbol b)'\mathbf B(\boldsymbol x - \boldsymbol b) = \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) - (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'\boldsymbol x + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b$
$= \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A + \mathbf B)(\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) - (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'(\mathbf A + \mathbf B)^{-1}(\mathbf A + \mathbf B)\boldsymbol x$
$+ \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b$
$= \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol c - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol x + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b$

$= (\boldsymbol x - \boldsymbol c)'(\mathbf A + \mathbf B)(\boldsymbol x - \boldsymbol c) - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b$
and
$\boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c = (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'(\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) = [\mathbf A(\boldsymbol a - \boldsymbol b) + (\mathbf A + \mathbf B)\boldsymbol b]'(\mathbf A + \mathbf B)^{-1}[(\mathbf A + \mathbf B)\boldsymbol a - \mathbf B(\boldsymbol a - \boldsymbol b)]$

$= (\boldsymbol a - \boldsymbol b)'\mathbf A\boldsymbol a + \boldsymbol b'(\mathbf A + \mathbf B)\boldsymbol a - \boldsymbol b'\mathbf B(\boldsymbol a - \boldsymbol b) - (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b)$
$= \boldsymbol a'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b)$
thus,
$\boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c = \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol b + (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b)$

$= (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b)$
thus,
$(\boldsymbol x - \boldsymbol a)'\mathbf A(\boldsymbol x - \boldsymbol a) + (\boldsymbol x - \boldsymbol b)'\mathbf B(\boldsymbol x - \boldsymbol b) = (\boldsymbol x - \boldsymbol c)'(\mathbf A + \mathbf B)(\boldsymbol x - \boldsymbol c) + (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b)$
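The identity can be verified exactly with rational arithmetic; here is a small self-contained check with arbitrarily chosen symmetric positive definite 2×2 matrices (the specific values are my own, not from the post):

```python
from fractions import Fraction as F

# Exact 2x2 matrix helpers with rational arithmetic.
def mat(rows): return [[F(v) for v in row] for row in rows]
def madd(X, Y): return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]
def mmul(X, Y): return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
def minv(X):
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]
def mv(X, v): return [sum(X[i][k] * v[k] for k in range(2)) for i in range(2)]
def vadd(u, v): return [u[i] + v[i] for i in range(2)]
def vsub(u, v): return [u[i] - v[i] for i in range(2)]
def quad(u, X, v): return sum(u[i] * X[i][j] * v[j] for i in range(2) for j in range(2))  # u'Xv

A = mat([[3, 1], [1, 2]])  # arbitrary symmetric positive definite matrices
B = mat([[2, 0], [0, 1]])
a, b, x = [F(1), F(2)], [F(-1), F(3)], [F(5), F(-2)]

AB = madd(A, B)
ABinv = minv(AB)
c = mv(ABinv, vadd(mv(A, a), mv(B, b)))  # c = (A+B)^{-1}(Aa + Bb)

# (x-a)'A(x-a) + (x-b)'B(x-b)  vs  (x-c)'(A+B)(x-c) + (a-b)'A(A+B)^{-1}B(a-b)
lhs = quad(vsub(x, a), A, vsub(x, a)) + quad(vsub(x, b), B, vsub(x, b))
rhs = (quad(vsub(x, c), AB, vsub(x, c))
       + quad(vsub(a, b), mmul(mmul(A, ABinv), B), vsub(a, b)))
```

Because the arithmetic is exact, `lhs` and `rhs` agree as rational numbers, not just to floating-point tolerance.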


## Conjugate Families

### October 22, 2008

• Multinomial-Dirichlet:
Let $\boldsymbol x = (x_1, x_2,\ldots, x_k)' \sim MN(n,\boldsymbol p)$ and $\boldsymbol p = (p_1, p_2, \ldots, p_k)'\sim Dirichlet(\boldsymbol\alpha)$, where
$\boldsymbol\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ and all the
$\alpha_i$'s are known. We have the
full joint density function
$f(\boldsymbol x, \boldsymbol p |\boldsymbol\alpha, n) = n!\prod_{i=1}^k\frac{p_i^{x_i}}{x_i!}\Gamma\left(\sum_{i=1}^k\alpha_i\right)\prod_{i=1}^k\frac{p_i^{\alpha_i -1}}{\Gamma(\alpha_i)} = n!\Gamma\left(\sum_{i=1}^k\alpha_i\right) \prod_{i=1}^k\frac{p_i^{x_i}p_i^{\alpha_i -1}}{x_i!\Gamma(\alpha_i)} = n!\Gamma\left(\sum_{i=1}^k\alpha_i\right) \prod_{i=1}^k\frac{p_i^{x_i+\alpha_i -1}}{x_i!\Gamma(\alpha_i)},$
thus, we have
$f(\boldsymbol x |n, \boldsymbol\alpha) = \int_{\mathcal X_{\boldsymbol p}} f(\boldsymbol x,\boldsymbol p|\boldsymbol\alpha, n) d\boldsymbol p = \int_{\mathcal X_{\boldsymbol p}} n!\Gamma\left(\sum_{i=1}^k\alpha_i\right) \prod_{i=1}^k\frac{p_i^{x_i+\alpha_i -1}}{x_i!\Gamma(\alpha_i)}d\boldsymbol p = \frac{n!\Gamma\left(\sum_{i=1}^k\alpha_i\right)}{\Gamma\left(\sum_{i=1}^k(\alpha_i + x_i)\right)} \prod_{i=1}^k\frac{\Gamma(\alpha_i+x_i)}{x_i!\Gamma(\alpha_i)},$
thus,
$f(\boldsymbol p|\boldsymbol x, \boldsymbol\alpha, n) = \frac{f(\boldsymbol p,\boldsymbol x|\boldsymbol\alpha, n)}{f(\boldsymbol x|\boldsymbol\alpha, n)} = \Gamma\left(\sum_{i=1}^k(\alpha_i+x_i)\right)\prod_{i=1}^k\frac{p_i^{\alpha_i + x_i -1}}{\Gamma(\alpha_i+x_i)},$
that is, the Dirichlet distribution is a conjugate prior for the
parameters of a multinomial distribution.
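As a sanity check, the marginal (Dirichlet-multinomial) pmf derived above should sum to one over all count vectors with $\sum_i x_i = n$; a small sketch with arbitrary $\boldsymbol\alpha$ (my values, not from the post):

```python
import math
from itertools import product

def dirichlet_multinomial_pmf(x, alpha):
    # f(x | n, alpha) = n! * Gamma(sum a_i) / Gamma(sum (a_i + x_i))
    #                   * prod_i Gamma(a_i + x_i) / (x_i! * Gamma(a_i))
    n = sum(x)
    log_p = (math.lgamma(n + 1) + math.lgamma(sum(alpha))
             - math.lgamma(sum(alpha) + n))
    for xi, ai in zip(x, alpha):
        log_p += math.lgamma(ai + xi) - math.lgamma(xi + 1) - math.lgamma(ai)
    return math.exp(log_p)

alpha = [1.5, 2.0, 0.7]  # arbitrary known Dirichlet parameters
n = 6
# Sum the pmf over every way to split n counts among k = 3 cells.
total = sum(dirichlet_multinomial_pmf(x, alpha)
            for x in product(range(n + 1), repeat=3) if sum(x) == n)
```

Working in log space via `math.lgamma` avoids overflow for larger $n$ or $\alpha_i$.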
• Multivariate Normal – Multivariate Normal:
Let $\boldsymbol x\sim\mathscr N_p(\boldsymbol\mu, \mathbf\Sigma)$ and
$\boldsymbol\mu\sim\mathscr N_p(\boldsymbol\xi, \mathbf\Omega)$ where $\boldsymbol\xi, \mathbf\Sigma, \mathbf\Omega$ are all known; then we have the
joint density function
$f(\boldsymbol x,\boldsymbol\mu|\boldsymbol\xi, \mathbf\Sigma,\mathbf\Omega) = (2\pi)^{-p}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[(\boldsymbol x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol x- \boldsymbol\mu) + (\boldsymbol \mu- \boldsymbol\xi)'\mathbf\Omega^{-1}(\boldsymbol \mu- \boldsymbol\xi)\right]\right\},$
and, by the quadratic-form identity of the previous post, the marginal distribution of $\boldsymbol x$ is
$f(\boldsymbol x|\boldsymbol\xi,\mathbf\Sigma, \mathbf\Omega) = \int_{\mathcal X_{\boldsymbol\mu}}(2\pi)^{-p}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[(\boldsymbol x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol x- \boldsymbol\mu) + (\boldsymbol \mu- \boldsymbol\xi)'\mathbf\Omega^{-1}(\boldsymbol \mu- \boldsymbol\xi)\right]\right\}d\boldsymbol\mu$
$= (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{\frac{1}{2}}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega| ^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[\boldsymbol x'\mathbf\Sigma^{-1}\boldsymbol x +\boldsymbol\xi'\mathbf\Omega^{-1}\boldsymbol\xi - (\mathbf\Omega^{-1}\boldsymbol \xi + \mathbf\Sigma^{-1}\boldsymbol x)'\mathbf\Theta(\mathbf\Omega^{-1}\boldsymbol \xi + \mathbf\Sigma^{-1}\boldsymbol x)\right]\right\}\times$
$\int_{\mathcal X_{\boldsymbol\mu}} (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol\mu - \boldsymbol\eta)'\mathbf\Theta^{-1}(\boldsymbol\mu - \boldsymbol\eta)\right\}d\boldsymbol\mu$
where $\mathbf\Theta = \left(\mathbf\Sigma^{-1} + \mathbf\Omega^{-1}\right)^{-1}$ and $\boldsymbol\eta = \mathbf\Theta\left(\mathbf\Omega^{-1}\boldsymbol \xi + \mathbf\Sigma^{-1}\boldsymbol x\right)$; the last integral is that of a $\mathscr N_p(\boldsymbol\eta,\mathbf\Theta)$ density and equals one. Thus, we have
$f(\boldsymbol\mu|\boldsymbol x,\mathbf\Sigma,\mathbf\Omega) = \frac{f(\boldsymbol\mu, \boldsymbol x|\mathbf\Sigma,\mathbf\Omega)}{f(\boldsymbol x|\mathbf\Sigma,\mathbf\Omega)} = (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol\mu - \boldsymbol\eta)'\mathbf\Theta^{-1}(\boldsymbol\mu - \boldsymbol\eta)\right\},$
which is the density of $\mathscr N_p(\boldsymbol\eta,\mathbf\Theta)$.
That is, the multivariate normal is a conjugate prior for the
mean of a multivariate normal distribution.
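In the scalar case ($p = 1$) the completion of squares behind this marginal reduces to an identity in ordinary numbers, which is easy to check numerically (arbitrary values; `theta` and `eta` are the scalar versions of $\mathbf\Theta$ and $\boldsymbol\eta$):

```python
# Scalar (p = 1) check of the completion of squares:
# (x-mu)^2/sigma2 + (mu-xi)^2/omega2
#   = (mu-eta)^2/theta + x^2/sigma2 + xi^2/omega2 - eta^2/theta,
# with theta = (1/sigma2 + 1/omega2)^{-1}, eta = theta*(xi/omega2 + x/sigma2).
sigma2, omega2 = 2.0, 0.5   # arbitrary known variances
xi, x = 1.0, 3.0            # arbitrary prior mean and observation

theta = 1.0 / (1.0 / sigma2 + 1.0 / omega2)
eta = theta * (xi / omega2 + x / sigma2)

def lhs(mu):
    return (x - mu) ** 2 / sigma2 + (mu - xi) ** 2 / omega2

def rhs(mu):
    return ((mu - eta) ** 2 / theta
            + x ** 2 / sigma2 + xi ** 2 / omega2 - eta ** 2 / theta)

# The identity must hold for every mu, so test several values.
checks = all(abs(lhs(mu) - rhs(mu)) < 1e-9 for mu in (-2.0, 0.0, 1.3, 4.7))
```

Since only the $(\mu-\eta)^2/\theta$ term involves $\mu$, integrating $e^{-\text{lhs}/2}$ over $\mu$ produces exactly the Gaussian integral that drops out of the marginal above.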
• Multivariate Normal – Inverse Wishart:
Let $\boldsymbol x\sim\mathscr N_p(\boldsymbol\mu, \mathbf\Sigma)$ and $\mathbf\Sigma\sim \mathscr W_p^{-1}(\mathbf\Psi^{-1}, m)$, where $\boldsymbol\mu, \mathbf\Psi, m$ are all known. We have the joint density function
$f(\boldsymbol x,\mathbf\Sigma|\boldsymbol\mu, \mathbf\Psi, m) =(2\pi)^{-\frac{p}{2}}|\mathbf\Sigma|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol x- \boldsymbol\mu) \right\}\cdot\frac{ |\mathbf\Psi|^{\frac{m}{2}}|\mathbf\Sigma|^{-\frac{m+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\mathbf\Psi\mathbf\Sigma^{-1}\right\} }{2^{\frac{mp}{2}}\Gamma_p\left(\frac{m}{2}\right)} ,$
where $\Gamma_p(\cdot)$ is the multivariate gamma function.
We have
$f(\boldsymbol x|\boldsymbol\mu,\mathbf\Psi, m) = \pi^{-\frac{p}{2}}\frac{|\mathbf\Psi|^{\frac{m}{2}}\Gamma_p\left(\frac{m+1}{2}\right)}{|(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)' + \mathbf\Psi|^{\frac{m+1}{2}}\Gamma_p\left(\frac{m}{2}\right)}\times$
$\int_{\mathcal X_{\mathbf\Sigma}}\frac{ |(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi|^{\frac{m+1}{2}}|\mathbf\Sigma|^{-\frac{(m+1)+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+\mathbf\Psi\right]\mathbf\Sigma^{-1}\right\} }{2^{\frac{(m+1)p}{2}}\Gamma_p\left(\frac{m+1}{2}\right)}d\mathbf\Sigma,$
where the integral equals one; thus the posterior density function of $\mathbf\Sigma$ is
$f(\mathbf\Sigma|\boldsymbol x,\mathbf\Psi, m) = \frac{f(\mathbf\Sigma,\boldsymbol x|\mathbf\Psi, m)}{f(\boldsymbol x|\mathbf\Psi, m)} = \frac{ |(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi|^{\frac{m+1}{2}}|\mathbf\Sigma|^{-\frac{(m+1)+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+\mathbf\Psi\right]\mathbf\Sigma^{-1}\right\} }{2^{\frac{(m+1)p}{2}}\Gamma_p\left(\frac{m+1}{2}\right)},$
which is the density of $\mathscr W_p^{-1}([(\boldsymbol x - \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi]^{-1}, m+1)$.
That is, the inverse Wishart distribution is a conjugate prior for the
covariance matrix of a multivariate normal distribution.
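For $p = 1$ the inverse Wishart is an inverse gamma, and the Bayes-rule computation above can be checked numerically: the joint density divided by the marginal should equal the claimed posterior density at every $\mathbf\Sigma = s$. A sketch with arbitrary values of my own choosing:

```python
import math

def inv_wishart_1d(s, psi, m):
    # p = 1 case of the inverse-Wishart density used above:
    # psi^{m/2} * s^{-(m+p+1)/2} * exp(-psi/(2s)) / (2^{mp/2} * Gamma_p(m/2))
    return (psi ** (m / 2) * s ** (-(m + 2) / 2) * math.exp(-psi / (2 * s))
            / (2 ** (m / 2) * math.gamma(m / 2)))

def normal_1d(x, mu, s):
    # N(mu, s) density, s being the variance
    return math.exp(-(x - mu) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)

mu, psi, m = 0.5, 2.0, 4.0  # arbitrary known values
x = 1.7
S = (x - mu) ** 2 + psi     # posterior scale (x - mu)(x - mu)' + Psi for p = 1

# Marginal of x as derived above (Gamma_1 is the ordinary gamma function).
marginal = (math.pi ** -0.5 * psi ** (m / 2) * math.gamma((m + 1) / 2)
            / (S ** ((m + 1) / 2) * math.gamma(m / 2)))

# joint / marginal should match an inverse Wishart with scale S and
# degrees of freedom m + 1, at every value of s.
ok = all(abs(normal_1d(x, mu, s) * inv_wishart_1d(s, psi, m) / marginal
             - inv_wishart_1d(s, S, m + 1)) < 1e-9
         for s in (0.5, 1.0, 2.5))
```

That the ratio matches at several distinct values of `s` confirms both the marginal's normalizing constant and the $m \to m+1$, $\mathbf\Psi \to (\boldsymbol x-\boldsymbol\mu)(\boldsymbol x-\boldsymbol\mu)'+\mathbf\Psi$ update.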

## First Post After Arriving at UConn

### October 3, 2008

For principal components $\mathbf\Lambda$