Multivariate Bayesian variable selection regression

Notes on fundamentals of EM and VEM

A refreshment of concepts based on notes from Matthew and David.

$\newcommand{\bs}[1]{\boldsymbol{#1}}$ $\DeclareMathOperator*{\diag}{diag}$ $\DeclareMathOperator*{\cov}{cov}$ $\DeclareMathOperator*{\rank}{rank}$ $\DeclareMathOperator*{\var}{var}$ $\DeclareMathOperator*{\tr}{tr}$ $\DeclareMathOperator*{\veco}{vec}$ $\DeclareMathOperator*{\uniform}{\mathcal{U}niform}$ $\DeclareMathOperator*{\argmin}{arg\ min}$ $\DeclareMathOperator*{\argmax}{arg\ max}$ $\DeclareMathOperator*{\N}{N}$ $\DeclareMathOperator*{\gm}{Gamma}$ $\DeclareMathOperator*{\dif}{d}$

Problem

We need to make MLE of $\theta$ for $P(Y|\theta)$. The marginal $$\log P(Y|\theta) = \log \int P(Y,Z|\theta) \dif Z$$ is difficult to compute. However $$\log P(Y|\theta) = \log \int P(Y|Z, \theta) P(Z|\theta) \dif Z$$ is easier. Note that $P(Y|Z, \theta)$ is likelihood, $P(Z|\theta)$ is posterior.

Marginal log-likelihood

For any normalized distribution $q(Z)$,

\begin{align} \log P(Y|\theta) & = \log P(Y|\theta) \int_Z q(Z) \dif Z \\ & = \int_Z \log P(Y|\theta) q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y,Z|\theta)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y,Z|\theta)q(Z)}{P(Z|Y,\theta)q(Z)} q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y|Z,\theta)P(Z|\theta)}{q(Z)} \frac{q(Z)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = \int_Z \log P(Y|Z,\theta) q(Z) \dif Z + \int_Z \frac{P(Z|\theta)}{q(Z)} q(Z) \dif Z + \int_Z \frac{q(Z)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{q(Z)}] - E_q[\frac{P(Z|Y,\theta)}{q(Z)}] \\ & \ge E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{q(Z)}] \\ & = \mathcal{L}(q, \theta) \end{align}

EM

Maximize $\mathcal{L}(q, \theta)$ iteratively between $q$ and $\theta$, in a 2-step procedure:

  1. $\underset{q}{max} \big( \log P(Y|\theta) \big ) $: simply set $q(Z) = P(Z|Y,\theta^{(current)})$
  2. $\underset{\theta}{max} \big( E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{P(Z|Y,\theta^{(current)})}] \big)$

VEM via mean-field

  1. $\underset{q \in \Omega}{max} \big( \log P(Y|\theta) \big ) $: set $q(Z) = \prod_i q(Z_i)$
  2. $\underset{\theta}{max} \big( E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{\prod_i q(Z_i)}] \big)$

Copyright © 2016-2020 Gao Wang et al at Stephens Lab, University of Chicago