We all know the Neyman-Pearson lemma; Lehmann & Romano (2005) give a nice illustration of it in the spirit of value per dollar or miles per hour. That is, for a fixed type-I error
\[\alpha = \int_{\Omega_{H_0}}\phi(x)\,d\mathscr P_\theta(x), \]
we want to maximize the power ($1$ minus the type-II error)
\[\beta = \int_{\Omega_{H_0^c}}\phi(x)\,d\mathscr P_\theta(x), \]
where $\phi(x)$ is the test function, i.e. the probability that the (possibly randomized) test rejects $H_0$ when $x$ is observed. In other words, we want the ratio
\[r = \frac{\beta}{\alpha} = \frac{\int_{\Omega_{H_0^c}}\phi(x)\,d\mathscr P_\theta(x)}{\int_{\Omega_{H_0}}\phi(x)\,d\mathscr P_\theta(x)}\]
to be as large as possible.
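To make the trade-off concrete, here is a minimal numerical sketch (my own addition, assuming SciPy is available) for two simple hypotheses $H_0: X\sim N(0,1)$ versus $H_1: X\sim N(1,1)$: the likelihood ratio is increasing in $x$, so the Neyman-Pearson test rejects for large $x$, and any other rejection region of the same size has smaller power.

```python
# A toy illustration (not from the original post): Neyman-Pearson test for
# H0: X ~ N(0, 1) vs H1: X ~ N(1, 1) at size alpha, compared with another
# test of the same size that rejects in the "wrong" tail.
from scipy.stats import norm

alpha = 0.05
c = norm.ppf(1 - alpha)             # NP rejection region {x > c}: P0(X > c) = alpha
power_np = 1 - norm.cdf(c, loc=1)   # power under H1

c2 = norm.ppf(alpha)                # competing region {x < c2}, also of size alpha
power_other = norm.cdf(c2, loc=1)   # much smaller power under H1

print(f"NP test:    size = {alpha:.2f}, power = {power_np:.3f}")
print(f"other test: size = {alpha:.2f}, power = {power_other:.3f}")
```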

Completing the square for a sum of two quadratic forms: let $\boldsymbol c = (\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)$. Then
\[ (\boldsymbol x - \boldsymbol a)'\mathbf A(\boldsymbol x - \boldsymbol a) + (\boldsymbol x - \boldsymbol b)'\mathbf B(\boldsymbol x - \boldsymbol b) = \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) - (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'\boldsymbol x + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b \]
\[ = \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A + \mathbf B)(\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) - (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'(\mathbf A + \mathbf B)^{-1}(\mathbf A + \mathbf B)\boldsymbol x + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b \]
\[ = \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol x - \boldsymbol x'(\mathbf A + \mathbf B)\boldsymbol c - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol x + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b \]
\[ = (\boldsymbol x - \boldsymbol c)'(\mathbf A + \mathbf B)(\boldsymbol x - \boldsymbol c) - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c + \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b, \]
and
\[ \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c = (\mathbf A\boldsymbol a + \mathbf B\boldsymbol b)'(\mathbf A + \mathbf B)^{-1}(\mathbf A\boldsymbol a + \mathbf B\boldsymbol b) = [\mathbf A(\boldsymbol a - \boldsymbol b) + (\mathbf A + \mathbf B)\boldsymbol b]'(\mathbf A + \mathbf B)^{-1}[(\mathbf A + \mathbf B)\boldsymbol a - \mathbf B(\boldsymbol a - \boldsymbol b)] \]
\[ = (\boldsymbol a - \boldsymbol b)'\mathbf A\boldsymbol a + \boldsymbol b'(\mathbf A + \mathbf B)\boldsymbol a - \boldsymbol b'\mathbf B(\boldsymbol a - \boldsymbol b) - (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b) \]
\[ = \boldsymbol a'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b). \]
Thus,
\[ \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - \boldsymbol c'(\mathbf A + \mathbf B)\boldsymbol c = \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol b - \boldsymbol a'\mathbf A\boldsymbol a + \boldsymbol b'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf A\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol a + \boldsymbol b'\mathbf B\boldsymbol a - \boldsymbol b'\mathbf B\boldsymbol b + (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b) \]
\[ = (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b). \]
Thus,
\[ (\boldsymbol x - \boldsymbol a)'\mathbf A(\boldsymbol x - \boldsymbol a) + (\boldsymbol x - \boldsymbol b)'\mathbf B(\boldsymbol x - \boldsymbol b) = (\boldsymbol x - \boldsymbol c)'(\mathbf A + \mathbf B)(\boldsymbol x - \boldsymbol c) + (\boldsymbol a - \boldsymbol b)'\mathbf A(\mathbf A + \mathbf B)^{-1}\mathbf B(\boldsymbol a - \boldsymbol b). \]
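A quick numerical check of this identity (my own sketch using NumPy, not part of the original derivation), with random symmetric positive-definite $\mathbf A$ and $\mathbf B$:

```python
# Verify: (x-a)'A(x-a) + (x-b)'B(x-b)
#       = (x-c)'(A+B)(x-c) + (a-b)'A(A+B)^{-1}B(a-b),  c = (A+B)^{-1}(Aa+Bb)
import numpy as np

rng = np.random.default_rng(0)
p = 4
M1, M2 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
A, B = M1 @ M1.T + p * np.eye(p), M2 @ M2.T + p * np.eye(p)   # SPD matrices
a, b, x = rng.normal(size=p), rng.normal(size=p), rng.normal(size=p)

c = np.linalg.solve(A + B, A @ a + B @ b)
lhs = (x - a) @ A @ (x - a) + (x - b) @ B @ (x - b)
rhs = (x - c) @ (A + B) @ (x - c) + (a - b) @ A @ np.linalg.solve(A + B, B @ (a - b))
print(np.isclose(lhs, rhs))   # True
```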

Conjugate Families

October 22, 2008

  • Multinomial-Dirichlet:
    Let $\boldsymbol x = (x_1, x_2,\ldots, x_k)' \sim MN(n,\boldsymbol p)$ and $\boldsymbol p
    = (p_1, p_2, \ldots, p_k)\sim Dirichlet(\boldsymbol\alpha)$, where
    $\boldsymbol\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_k)$ and all the
    $\alpha_i$'s are known. The full joint density function is
    \[f(\boldsymbol x, \boldsymbol p |\boldsymbol\alpha, n) =
    n!\prod_{i=1}^k\frac{p_i^{x_i}}{x_i!}\cdot\Gamma\left(\sum_{i=1}^k\alpha_i\right)\prod_{i=1}^k\frac{p_i^{\alpha_i
    -1}}{\Gamma(\alpha_i)} = n!\Gamma\left(\sum_{i=1}^k\alpha_i\right)
    \prod_{i=1}^k\frac{p_i^{x_i}p_i^{\alpha_i -1}}{x_i!\Gamma(\alpha_i)}
    = n!\Gamma\left(\sum_{i=1}^k\alpha_i\right)
    \prod_{i=1}^k\frac{p_i^{x_i+\alpha_i -1}}{x_i!\Gamma(\alpha_i)}, \]
    thus we have
    \[ f(\boldsymbol x |n, \boldsymbol\alpha) = \int_{\mathcal X_{\boldsymbol p}} f(\boldsymbol x,\boldsymbol
    p|\boldsymbol\alpha, n)\, d\boldsymbol p = \int_{\mathcal X_{\boldsymbol p}} n!\Gamma\left(\sum_{i=1}^k\alpha_i\right)
    \prod_{i=1}^k\frac{p_i^{x_i+\alpha_i -1}}{x_i!\Gamma(\alpha_i)}\,d\boldsymbol
    p =
    \frac{n!\Gamma\left(\sum_{i=1}^k\alpha_i\right)}{\Gamma\left(\sum_{i=1}^k(\alpha_i
    + x_i)\right)}
    \prod_{i=1}^k\frac{\Gamma(\alpha_i+x_i)}{x_i!\Gamma(\alpha_i)},\]
    and therefore
    \[f(\boldsymbol p|\boldsymbol x, \boldsymbol\alpha, n) = \frac{f(\boldsymbol p,\boldsymbol x|\boldsymbol\alpha,
    n)}{f(\boldsymbol x|\boldsymbol\alpha, n)} =
    \Gamma\left(\sum_{i=1}^k(\alpha_i+x_i)\right)\prod_{i=1}^k\frac{p_i^{\alpha_i
    + x_i
    -1}}{\Gamma(\alpha_i+x_i)},\]
    that is, the posterior is $Dirichlet(\boldsymbol\alpha + \boldsymbol x)$: the Dirichlet
    distribution is a conjugate prior for the probability vector of a
    multinomial distribution.
  • Multivariate Normal – Multivariate Normal:
    Let $\boldsymbol x\sim\mathscr N_p(\boldsymbol\mu, \mathbf\Sigma)$ and
    $\boldsymbol\mu\sim\mathscr N_p(\boldsymbol\xi, \mathbf\Omega)$, where $\boldsymbol\xi,
    \mathbf\Sigma, \mathbf\Omega$ are all known; then we have the
    joint density function
    \[f(\boldsymbol x,\boldsymbol\mu|\boldsymbol\xi, \mathbf\Sigma,\mathbf\Omega) =
    (2\pi)^{-p}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[(\boldsymbol
    x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol
    x- \boldsymbol\mu) + (\boldsymbol
    \mu- \boldsymbol\xi)'\mathbf\Omega^{-1}(\boldsymbol
    \mu- \boldsymbol\xi)\right]\right\}, \]
    and the marginal distribution of $\boldsymbol x$ is
    \[ f(\boldsymbol x|\boldsymbol\xi,\mathbf\Sigma, \mathbf\Omega) = \int_{\mathcal X_{\boldsymbol\mu}}(2\pi)^{-p}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[(\boldsymbol
    x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol
    x- \boldsymbol\mu) + (\boldsymbol
    \mu- \boldsymbol\xi)'\mathbf\Omega^{-1}(\boldsymbol
    \mu- \boldsymbol\xi)\right]\right\}d\boldsymbol\mu\]
    \[ =
    (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{\frac{1}{2}}|\mathbf\Sigma|^{-\frac{1}{2}}|\mathbf\Omega|
    ^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}\left[\boldsymbol
    x'\mathbf\Sigma^{-1}\boldsymbol x +\boldsymbol\xi'\mathbf\Omega^{-1}\boldsymbol\xi - (\mathbf\Omega^{-1}\boldsymbol
    \xi + \mathbf\Sigma^{-1}\boldsymbol x)'\mathbf\Theta(\mathbf\Omega^{-1}\boldsymbol
    \xi + \mathbf\Sigma^{-1}\boldsymbol x)\right]\right\}\times \]
    \[\int_{\mathcal X_{\boldsymbol\mu}}
    (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol\mu
    - \boldsymbol\eta)'\mathbf\Theta^{-1}(\boldsymbol\mu - \boldsymbol\eta)\right\}d\boldsymbol\mu,\]
    where $\mathbf\Theta = \left(\mathbf\Sigma^{-1} +
    \mathbf\Omega^{-1}\right)^{-1}$ and $\boldsymbol\eta = \mathbf\Theta\left(\mathbf\Omega^{-1}\boldsymbol
    \xi + \mathbf\Sigma^{-1}\boldsymbol x\right)$ (completing the square in $\boldsymbol\mu$, as in the
    quadratic-form identity above), and the remaining integral equals one. Thus we have
    \[ f(\boldsymbol\mu|\boldsymbol x,\mathbf\Sigma,\mathbf\Omega) =
    \frac{f(\boldsymbol\mu, \boldsymbol x|\mathbf\Sigma,\mathbf\Omega)}{f(\boldsymbol x|\mathbf\Sigma,\mathbf\Omega)} =
    (2\pi)^{-\frac{p}{2}}|\mathbf\Theta|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol\mu
    - \boldsymbol\eta)'\mathbf\Theta^{-1}(\boldsymbol\mu - \boldsymbol\eta)\right\},\]
    which is the $\mathscr N_p(\boldsymbol\eta,\mathbf\Theta)$ density.
    That is, the multivariate normal is a conjugate prior for the mean of a
    multivariate normal distribution.
  • Multivariate Normal – Inverse Wishart:
    Let $\boldsymbol x\sim\mathscr N_p(\boldsymbol\mu, \mathbf\Sigma)$ and $\mathbf\Sigma\sim
    \mathscr W_p^{-1}(\mathbf\Psi^{-1}, m)$, where $\boldsymbol\mu,
    \mathbf\Psi, m$ are all known. We have the joint density function
    \[f(\boldsymbol x,\mathbf\Sigma|\boldsymbol\mu, \mathbf\Psi, m)
    =(2\pi)^{-\frac{p}{2}}|\mathbf\Sigma|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2}(\boldsymbol
    x- \boldsymbol\mu)'\mathbf\Sigma^{-1}(\boldsymbol x- \boldsymbol\mu) \right\}\cdot\frac{
    |\mathbf\Psi|^{\frac{m}{2}}|\mathbf\Sigma|^{-\frac{m+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left(\mathbf\Psi\mathbf\Sigma^{-1}\right)\right\}
    }{2^{\frac{mp}{2}}\Gamma_p\left(\frac{m}{2}\right)} ,\]
    where $\Gamma_p(\cdot)$ is the multivariate gamma function.
    We have
    \[f(\boldsymbol x|\boldsymbol\mu,\mathbf\Psi, m) =
    \pi^{-\frac{p}{2}}\frac{|\mathbf\Psi|^{\frac{m}{2}}\Gamma_p\left(\frac{m+1}{2}\right)}{|(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)' +
    \mathbf\Psi|^{\frac{m+1}{2}}\Gamma_p\left(\frac{m}{2}\right)}\times\]
    \[\int_{\mathcal X_{\mathbf\Sigma}}\frac{
    |(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi|^{\frac{m+1}{2}}|\mathbf\Sigma|^{-\frac{(m+1)+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+\mathbf\Psi\right]\mathbf\Sigma^{-1}\right\}
    }{2^{\frac{(m+1)p}{2}}\Gamma_p\left(\frac{m+1}{2}\right)}d\mathbf\Sigma,\]
    where the integrand is an inverse-Wishart density, so the integral equals one.
    Thus the posterior density of $\mathbf\Sigma$ is
    \[ f(\mathbf\Sigma|\boldsymbol x,\mathbf\Psi, m) = \frac{f(\mathbf\Sigma,\boldsymbol x|\mathbf\Psi, m)}{f(\boldsymbol x|\mathbf\Psi, m)} = \frac{
    |(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi|^{\frac{m+1}{2}}|\mathbf\Sigma|^{-\frac{(m+1)+p+1}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+\mathbf\Psi\right]\mathbf\Sigma^{-1}\right\}
    }{2^{\frac{(m+1)p}{2}}\Gamma_p\left(\frac{m+1}{2}\right)},\]
    which is the $\mathscr W_p^{-1}([(\boldsymbol x -
    \boldsymbol\mu)(\boldsymbol x - \boldsymbol\mu)'+ \mathbf\Psi]^{-1}, m+1)$
    density. That is, the inverse-Wishart distribution is a conjugate prior for
    the covariance matrix of a multivariate normal distribution. (A small
    numerical sketch of all three updates follows this list.)
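
As promised above, here is a small numerical sketch of the three posterior updates (my own addition; the function names and toy numbers are made up for illustration, not part of the original post):

```python
# Conjugate posterior updates for the three families above; sketch only.
import numpy as np

def dirichlet_posterior(alpha, x):
    """Multinomial-Dirichlet: posterior is Dirichlet(alpha + x)."""
    return np.asarray(alpha, dtype=float) + np.asarray(x, dtype=float)

def normal_mean_posterior(x, Sigma, xi, Omega):
    """MVN data, MVN prior on the mean: posterior is N_p(eta, Theta)."""
    Theta = np.linalg.inv(np.linalg.inv(Sigma) + np.linalg.inv(Omega))
    eta = Theta @ (np.linalg.solve(Omega, xi) + np.linalg.solve(Sigma, x))
    return eta, Theta

def inv_wishart_posterior(x, mu, Psi, m):
    """MVN data (one observation), inverse-Wishart prior on Sigma:
    posterior scale (x - mu)(x - mu)' + Psi, degrees of freedom m + 1."""
    d = x - mu
    return np.outer(d, d) + Psi, m + 1

# Toy usage
print(dirichlet_posterior([1, 1, 1], [3, 5, 2]))               # [4. 6. 3.]
eta, Theta = normal_mean_posterior(np.array([1.0, 2.0]),
                                   np.eye(2), np.zeros(2), 4 * np.eye(2))
print(eta)                                                      # pulled toward the prior mean 0
Psi_post, m_post = inv_wishart_posterior(np.array([1.0, 2.0]),
                                         np.zeros(2), np.eye(2), 5)
print(m_post)                                                   # 6
```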

My first post after arriving at UConn

October 3, 2008

I have been in the States for almost ninety days now, but I am still staying home just as much as I did back in Taiwan.

For principal components $\mathbf\Lambda$