A refreshment of concepts based on notes from Matthew and David.
$\newcommand{\bs}[1]{\boldsymbol{#1}}$ $\DeclareMathOperator*{\diag}{diag}$ $\DeclareMathOperator*{\cov}{cov}$ $\DeclareMathOperator*{\rank}{rank}$ $\DeclareMathOperator*{\var}{var}$ $\DeclareMathOperator*{\tr}{tr}$ $\DeclareMathOperator*{\veco}{vec}$ $\DeclareMathOperator*{\uniform}{\mathcal{U}niform}$ $\DeclareMathOperator*{\argmin}{arg\ min}$ $\DeclareMathOperator*{\argmax}{arg\ max}$ $\DeclareMathOperator*{\N}{N}$ $\DeclareMathOperator*{\gm}{Gamma}$ $\DeclareMathOperator*{\dif}{d}$
We need to make MLE of $\theta$ for $P(Y|\theta)$. The marginal $$\log P(Y|\theta) = \log \int P(Y,Z|\theta) \dif Z$$ is difficult to compute. However $$\log P(Y|\theta) = \log \int P(Y|Z, \theta) P(Z|\theta) \dif Z$$ is easier. Note that $P(Y|Z, \theta)$ is likelihood, $P(Z|\theta)$ is posterior.
For any normalized distribution $q(Z)$,
\begin{align} \log P(Y|\theta) & = \log P(Y|\theta) \int_Z q(Z) \dif Z \\ & = \int_Z \log P(Y|\theta) q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y,Z|\theta)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y,Z|\theta)q(Z)}{P(Z|Y,\theta)q(Z)} q(Z) \dif Z \\ & = \int_Z \log \frac{P(Y|Z,\theta)P(Z|\theta)}{q(Z)} \frac{q(Z)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = \int_Z \log P(Y|Z,\theta) q(Z) \dif Z + \int_Z \frac{P(Z|\theta)}{q(Z)} q(Z) \dif Z + \int_Z \frac{q(Z)}{P(Z|Y,\theta)} q(Z) \dif Z \\ & = E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{q(Z)}] - E_q[\frac{P(Z|Y,\theta)}{q(Z)}] \\ & \ge E_q[\log P(Y|Z,\theta)] + E_q[\frac{P(Z|\theta)}{q(Z)}] \\ & = \mathcal{L}(q, \theta) \end{align}Maximize $\mathcal{L}(q, \theta)$ iteratively between $q$ and $\theta$, in a 2-step procedure: