The figure above was drawn with the following R code.

require(ggplot2)   # provides ggplot(), geom_ribbon(), geom_text(), ...
require(ggExtra)
require(animation) # provides saveMovie(); renamed saveGIF() in later versions

x <- seq(-4.2, 4.2, length = 10000)

saveMovie(
  for (i in 1:10) {
    end.pin <- 1000 * i
    dat1 <- data.frame(x = x[1:end.pin], y = pnorm(x[1:end.pin]), z = dnorm(x[1:end.pin]))
    h1 <- ggplot(dat1, aes(x = x, y = y))
    h1 <- h1 + xlim(min(x), max(x)) + ylim(0, 1)
    # Draw the larger ribbon first so the smaller one remains visible on top
    # (the original condition was garbled in transcription; this reconstruction
    #  matches the two branch bodies)
    if (pnorm(x[end.pin]) > dnorm(x[end.pin])) {
      h1 <- h1 + geom_ribbon(aes(y = y, ymin = 0, ymax = y), fill = 'pink', alpha = 0.75)
      h1 <- h1 + geom_ribbon(aes(y = z, ymin = 0, ymax = z), fill = 'cyan', alpha = 0.75)
    } else {
      h1 <- h1 + geom_ribbon(aes(y = z, ymin = 0, ymax = z), fill = 'cyan', alpha = 0.75)
      h1 <- h1 + geom_ribbon(aes(y = y, ymin = 0, ymax = y), fill = 'pink', alpha = 0.75)
    }
    # Annotate the current x and the values of the density and the CDF there
    h1 <- h1 + geom_text(aes(x = -2.4, y = 0.7, label = paste('x=', round(x[end.pin], 3))), size = 8, colour = 'darkgreen')
    h1 <- h1 + geom_text(aes(x = 0.2, y = 0.2, label = "φ(x)="), size = 8, colour = 'blue')
    h1 <- h1 + geom_text(aes(x = 2.0, y = 0.2, label = round(dnorm(x[end.pin]), 3)), size = 8, colour = 'blue')
    h1 <- h1 + geom_text(aes(x = -1.4, y = 0.5, label = "Φ(x)="), size = 8, colour = 'red')
    h1 <- h1 + geom_text(aes(x = 0.4, y = 0.5, label = round(pnorm(x[end.pin]), 3)), size = 8, colour = 'red')
    h1 <- h1 + geom_line(colour = 'red')              # CDF curve
    h1 <- h1 + geom_line(aes(y = z), colour = 'blue') # density curve
    print(h1)
  }, moviename = "normal_dist_ggplot2.gif")

Consider a collection of random variables $\boldsymbol x = (x_1, x_2, \ldots, x_k)^T$ with joint characteristic function $\phi_{\boldsymbol x} (\boldsymbol t)$, where the arguments $\boldsymbol t = (t_1,t_2,\ldots, t_k)^T$ are allowed to vary freely. Then $x_1, \ldots, x_k$ are mutually independent if and only if
\[
\phi_{\boldsymbol x} (\boldsymbol t) = \prod_{i=1}^k\phi_{X_i}(t_i)
\quad\text{for all }\boldsymbol t.
\]

Note that the statement above fails if the $t_i$'s are not allowed to vary separately; that is, the condition
\[
\phi_{\boldsymbol x} (t\boldsymbol 1) = \prod_{i=1}^k\phi_{X_i}(t)
\quad\text{for all } t
\]
does not imply that the $X_i$'s are independent. Part of the reason is apparent: at $t=0$ both sides equal $1$ for any joint distribution, and more generally equality along the single diagonal direction is far weaker than equality on all of $\mathbb R^k$.
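A concrete counterexample (standard, though not in the original note): take $X_1\sim\text{Cauchy}(0,1)$ and $X_2 = X_1$, so the two coordinates are completely dependent. Since $\phi_{X_1}(t) = e^{-|t|}$,
\[
\phi_{\boldsymbol x}(t\boldsymbol 1) = E\left[e^{it(X_1+X_2)}\right] = E\left[e^{i2tX_1}\right] = e^{-2|t|} = \phi_{X_1}(t)\,\phi_{X_2}(t)
\quad\text{for all } t,
\]
so the single-argument factorization holds even though $X_1$ and $X_2$ are as far from independent as possible.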

For $X\sim t(\nu)$, the density is $\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}}\left(\frac{\nu}{\nu+x^2}\right)^{\frac{\nu+1}{2}}$. For odd $k$ the integrand below is odd, so the moment is $0$ whenever it exists; take $k$ even. Then
\[
E(X^k) = \int_{\mathbb R^1}\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}}x^k\left(\frac{\nu}{\nu+x^2}\right)^{\frac{\nu+1}{2}}
dx
\]
\[
= 2\int_0^\infty\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}}\left(\frac{x^2}{\nu+x^2}\right)^{\frac{k}{2}} \left(\frac{\nu}{\nu+x^2}\right)^{\frac{\nu-k+1}{2}}
\nu^{\frac{k}{2}}dx
\]
Substituting $y = \frac{x^2}{\nu+x^2}$, i.e. $x = \nu^{\frac{1}{2}}y^{\frac{1}{2}}(1-y)^{-\frac{1}{2}}$ with $dx = \frac{1}{2}\nu^{\frac{1}{2}}\left[y^{-\frac{1}{2}}(1-y)^{-\frac{1}{2}} + y^{\frac{1}{2}}(1-y)^{-\frac{3}{2}}\right]dy$ (the factor $2$ from symmetry cancels the $\frac{1}{2}$),
\[
= \int_0^1\frac{\Gamma\left(\frac{\nu+1}{2}\right)\nu^{\frac{k}{2}}}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}}\left[\nu^{\frac{1}{2}}y^{-\frac{1}{2}}(1-y)^{-\frac{1}{2}} + \nu^{\frac{1}{2}}y^{\frac{1}{2}}(1-y)^{-\frac{3}{2}} \right]y^{\frac{k}{2}}(1-y)^{\frac{\nu+1-k}{2}}dy
\]
\[
= \frac{\Gamma\left(\frac{\nu+1}{2}\right)\nu^{\frac{k}{2}}}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi}}\left[\int_0^1y^{\frac{k+1}{2}-1}(1-y)^{\frac{\nu-k+2}{2}-1} dy + \int_0^1 y^{\frac{k+3}{2}-1}(1-y)^{\frac{\nu-k}{2}-1}dy\right]
\]
\[
= \frac{\Gamma\left(\frac{\nu+1}{2}\right)\nu^{\frac{k}{2}}}{\Gamma\left(\frac{\nu}{2}\right)\Gamma\left(\frac{\nu+3}{2}\right)\sqrt{\pi}}\left[\Gamma\left(\frac{k+1}{2}\right)\Gamma\left(\frac{\nu-k+2}{2}\right) + \Gamma\left(\frac{k+3}{2}\right)\Gamma\left(\frac{\nu-k}{2}\right)\right]
\]

Thus, for $k \geq \nu$ the second Beta integral diverges ($\Gamma\left(\frac{\nu-k}{2}\right)$ blows up as $\frac{\nu-k}{2}\downarrow 0$), so the moment does not exist.
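As a sanity check, take $k=2$ with $\nu>2$. Using $\Gamma\left(\frac{3}{2}\right)=\frac{\sqrt\pi}{2}$, $\Gamma\left(\frac{5}{2}\right)=\frac{3\sqrt\pi}{4}$, $\Gamma\left(\frac{\nu}{2}\right)=\frac{\nu-2}{2}\Gamma\left(\frac{\nu-2}{2}\right)$ and $\Gamma\left(\frac{\nu+3}{2}\right)=\frac{\nu+1}{2}\Gamma\left(\frac{\nu+1}{2}\right)$, the bracket equals $\Gamma\left(\frac{\nu-2}{2}\right)\sqrt\pi\,\frac{\nu+1}{4}$ and everything collapses to
\[
E(X^2) = \frac{\nu}{\nu-2},
\]
the familiar variance of the $t(\nu)$ distribution.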

Let $\mathscr P(X_n=-1) = \frac{1}{2} - \frac{1}{n+1}$, $\mathscr P(X_n=1)=\frac{1}{2} + \frac{1}{n+1}$ and $\mathscr P(X = -1) = \mathscr P(X=1) = \frac{1}{2}$. It is clear that

\[
\lim_{n\rightarrow\infty}\mathscr P(X_n=-1) = \frac{1}{2} - \lim_{n\rightarrow\infty}\frac{1}{n+1} = \frac{1}{2}
,\]
and
\[
\lim_{n\rightarrow\infty}\mathscr P(X_n=1) = \frac{1}{2} + \lim_{n\rightarrow\infty}\frac{1}{n+1} = \frac{1}{2}
\]

Thus, $\mathscr P(X_n=-1)\rightarrow\mathscr P(X=-1)$ and $\mathscr P(X_n=1)\rightarrow\mathscr P(X=1)$, so $X_n\overset{\mathcal D}{\rightarrow}X$. Note that only the marginal distributions are specified here, so nothing stronger follows: $X_n\overset{a.s.}{\rightarrow}X$ and $X_n\overset{P}{\rightarrow}X$ concern the joint behaviour of $X_n$ and $X$ on a common probability space, which this setup does not pin down.

If $X_n\overset{\mathcal D}{\rightarrow}X$ and $Y_n\overset{\mathcal D}{\rightarrow}Y$, where $X_n\amalg Y_n$ for all $n\in\mathbb N$ and $X\amalg Y$, then

\[
\lim_{n\rightarrow\infty}\phi_{X_n+Y_n}(t) = \lim_{n\rightarrow\infty}\phi_{X_n}(t)\phi_{Y_n}(t) = \lim_{n\rightarrow\infty}\phi_{X_n}(t)\cdot\lim_{n\rightarrow\infty}\phi_{Y_n}(t) = \phi_{X}(t)\phi_{Y}(t) = \phi_{X+Y}(t)
,\]
thus, $X_n+Y_n\overset{\mathcal D}{\rightarrow}X+Y$.
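The factorization step $\phi_{X+Y}(t)=\phi_X(t)\phi_Y(t)$ for independent variables can be checked numerically by Monte Carlo. A minimal Python sketch (the distributions $\mathcal N(0,1)$, $\mathcal N(1,2)$ and the point $t=0.7$ are my illustrative choices, not from the note) estimates both sides from a common sample:

```python
import cmath
import random

def ecf(samples, t):
    """Empirical characteristic function: average of exp(i*t*x) over the sample."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

random.seed(0)
n = 200_000
xs = [random.gauss(0, 1) for _ in range(n)]           # X ~ N(0, 1)
ys = [random.gauss(1, 2 ** 0.5) for _ in range(n)]    # Y ~ N(1, 2), independent of X
sums = [x + y for x, y in zip(xs, ys)]

t = 0.7
lhs = ecf(sums, t)              # estimate of phi_{X+Y}(t)
rhs = ecf(xs, t) * ecf(ys, t)   # estimate of phi_X(t) * phi_Y(t)
print(abs(lhs - rhs))           # small: independence makes the c.f. factorize
```

The discrepancy is of Monte Carlo order $n^{-1/2}$; for dependent $X$ and $Y$ the two estimates would generally disagree.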

Flipping the setting of the previous post around: if $X_n\overset{\mathcal D}{\rightarrow}X$, we want to show that $\mu_n\rightarrow\mu$ and $\sigma_n^2\rightarrow\sigma^2$.

Again, play with Lévy’s theorem: since $\lim_{n\rightarrow\infty}\phi_n(t) = \phi(t)$, we have $\lim_{n\rightarrow\infty}\exp(i\mu_n t -\frac{\sigma_n^2 t^2}{2}) = \exp(i\mu t -\frac{\sigma^2 t^2}{2})$ for all $t$. Taking moduli gives $\exp(-\frac{\sigma_n^2 t^2}{2})\rightarrow\exp(-\frac{\sigma^2 t^2}{2})$, so $\sigma_n^2\rightarrow\sigma^2$; dividing the moduli out then gives $e^{i\mu_n t}\rightarrow e^{i\mu t}$ for all $t$, which forces $\mu_n\rightarrow\mu$.

If $(X_n)_{n\geq 1}\sim\mathscr N (\mu_n, \sigma_n^2)$ with $\mu_n\rightarrow\mu\in\mathbb R$ and $\sigma_n^2\rightarrow\sigma^2\geq 0$, then $\phi_n(t)=\exp(i\mu_nt-\frac{\sigma_n^2t^2}{2})\rightarrow \phi (t)=\exp(i\mu t-\frac{\sigma^2t^2}{2})$ pointwise, so by Lévy’s continuity theorem the distribution of $X_n$ converges to $\mathscr N(\mu, \sigma^2)$.

Bayesian Note 2010 Oct 04

October 5, 2010

If $\left.y_i\right| \theta \sim Pois(m_i\theta)$ independently for $i=1,2,\ldots, n$, and $\theta\sim\Gamma(\alpha, \beta)$,
then
\[
\left.\theta\right|\boldsymbol y\sim\Gamma\left(\alpha+\sum_{i=1}^n y_i, \beta + \sum_{i=1}^n m_i\right)
\]
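This update can be verified numerically: the Python sketch below (the hyperparameters and the data $m_i$, $y_i$ are invented purely for illustration) checks on a grid that $\text{prior}\times\text{likelihood}$ and the claimed Gamma posterior differ only by a constant free of $\theta$.

```python
import math

# Illustrative values only: prior Gamma(alpha, beta), exposures m_i, counts y_i
alpha, beta = 2.0, 1.0
m = [1.0, 2.5, 0.5]
y = [3, 7, 1]

def log_prior(t):
    # Gamma(alpha, beta) in the rate parameterization used in the note
    return alpha * math.log(beta) - math.lgamma(alpha) + (alpha - 1) * math.log(t) - beta * t

def log_lik(t):
    # sum_i [ y_i*log(m_i*t) - m_i*t - log(y_i!) ]
    return sum(yi * math.log(mi * t) - mi * t - math.lgamma(yi + 1)
               for mi, yi in zip(m, y))

def log_post_claimed(t):
    # Gamma(alpha + sum(y), beta + sum(m))
    a, b = alpha + sum(y), beta + sum(m)
    return a * math.log(b) - math.lgamma(a) + (a - 1) * math.log(t) - b * t

# The difference should be the same at every theta (a normalizing constant)
grid = [0.3, 0.9, 1.7, 2.8]
diffs = [log_prior(t) + log_lik(t) - log_post_claimed(t) for t in grid]
print(max(diffs) - min(diffs))  # ~0 up to floating point
```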

Note that the Gamma PDF used here (rate parameterization) is
\[
\frac{\beta^\alpha t^{\alpha-1}}{\Gamma(\alpha)}\exp(-\beta t)
\]

For the general case,
if $\boldsymbol x\sim\mathscr N_p(\boldsymbol\mu, \mathbf\Sigma)$ and $\boldsymbol\mu\sim\mathscr N_p(\boldsymbol\theta, \mathbf\Omega)$, then $\boldsymbol\mu|\boldsymbol x\sim\mathscr N_p[(\mathbf\Sigma^{-1}+\mathbf\Omega^{-1})^{-1}(\mathbf\Sigma^{-1}\boldsymbol x+\mathbf\Omega^{-1}\boldsymbol\theta), (\mathbf\Sigma^{-1}+\mathbf\Omega^{-1})^{-1}]$.
It is just a basic quadratic form: complete the square in $\boldsymbol\mu$.

Update: marginally,
\[\boldsymbol x\sim\mathscr N_p(\boldsymbol\theta, \mathbf\Sigma + \mathbf\Omega)\]
which follows from
\[(\mathbf\Sigma + \mathbf\Omega)^{-1}= \mathbf\Sigma^{-1} - \mathbf\Sigma^{-1}(\mathbf\Sigma^{-1} +\mathbf\Omega^{-1})^{-1}\mathbf\Sigma^{-1} \]
\[
= \mathbf\Omega^{-1} - \mathbf\Omega^{-1}(\mathbf\Sigma^{-1} + \mathbf\Omega^{-1})^{-1}\mathbf\Omega^{-1}
\]
and
\[ \mathbf\Sigma^{-1}(\mathbf\Sigma^{-1} + \mathbf\Omega^{-1} )^{-1}\mathbf\Omega^{-1} = (\mathbf\Omega + \mathbf\Sigma )^{-1}\]
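A quick scalar ($p=1$) sanity check of these inverse identities in Python, with arbitrary positive stand-ins for $\mathbf\Sigma$ and $\mathbf\Omega$ (the values are purely illustrative):

```python
# Scalar stand-ins for Sigma and Omega (any positive values work)
s, w = 2.0, 5.0

lhs = 1.0 / (s + w)  # (Sigma + Omega)^{-1}

# Sigma^{-1} - Sigma^{-1} (Sigma^{-1} + Omega^{-1})^{-1} Sigma^{-1}
rhs1 = 1.0 / s - (1.0 / s) * (1.0 / (1.0 / s + 1.0 / w)) * (1.0 / s)
# Omega^{-1} - Omega^{-1} (Sigma^{-1} + Omega^{-1})^{-1} Omega^{-1}
rhs2 = 1.0 / w - (1.0 / w) * (1.0 / (1.0 / s + 1.0 / w)) * (1.0 / w)
# Sigma^{-1} (Sigma^{-1} + Omega^{-1})^{-1} Omega^{-1}
rhs3 = (1.0 / s) * (1.0 / (1.0 / s + 1.0 / w)) * (1.0 / w)

print(lhs, rhs1, rhs2, rhs3)  # all four agree
```

The matrix versions follow the same algebra, with the products kept in order.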

Consider the result:
For the conjugate prior scenario, if $X|\mu\sim\mathcal N(\mu,
\sigma^2)$ and $\mu\sim\mathcal N(\theta, \xi^2)$, where $\sigma^2,
\theta$ and $\xi^2$ are all known, then
\[
f(\mu|x) \propto
\exp\left[-\frac{1}{2}\left(\frac{1}{\sigma^2}+\frac{1}{\xi^2}\right)\left(\mu
- \frac{\xi^2 x+\sigma^2\theta}{\sigma^2+\xi^2}\right)^2\right]
\]
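The quadratic form works out by completing the square in $\mu$ (a step left implicit above): dropping terms free of $\mu$,
\[
-\frac{(x-\mu)^2}{2\sigma^2} - \frac{(\mu-\theta)^2}{2\xi^2}
= -\frac{1}{2}\left(\frac{1}{\sigma^2}+\frac{1}{\xi^2}\right)\mu^2 + \left(\frac{x}{\sigma^2}+\frac{\theta}{\xi^2}\right)\mu + \text{const}
\]
\[
= -\frac{1}{2}\left(\frac{1}{\sigma^2}+\frac{1}{\xi^2}\right)\left(\mu - \frac{\xi^2 x + \sigma^2\theta}{\sigma^2+\xi^2}\right)^2 + \text{const},
\]
so $\mu|x\sim\mathcal N\left(\frac{\xi^2 x+\sigma^2\theta}{\sigma^2+\xi^2}, \frac{\sigma^2\xi^2}{\sigma^2+\xi^2}\right)$.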