A smaller example of prior variance scalar estimate¶

This is continuation of previous notebook but using smaller example and more explicit code to show the problem. The data can be downloaded here.

The problems from previous notebook are:

EM converges slowly
The estimated prior variane scalar seems smaller than what we expect based on the coefficient estimate itself.

This notebook focuses on understanding 2. Therefore we use optim based result and manually check for a couple of iterations the inference for the 1st and 2nd moments, see if they are in agreement with the estimates.

attach(readRDS('em_optim_difference.rds'))

Here I use 15 conditions (this is enough conditions to show EM slow convergence issue),

R = 15
Y1 = Y[, c(1:R)]
resid_Y1 = resid_Y[c(1:R), c(1:R)]
V1 = matrix(1,R,R)
true_coef = meta$true_coef[,c(1:R)]

To simplify, I use codes from BayesianMultivariateRegression class from mvsusieR package which is a single component multivariate prior (not a mixture prior).

IBSS done manually¶

Here I run single effect models manually for L=2, using save_var to save posterior covariance matrix

devtools::load_all('~/GIT/software/mvsusieR')
d = DenseData$new(X,Y1)
d$standardize(TRUE,TRUE)
m1 = SingleEffectModel(BayesianMultivariateRegression)$new(ncol(X), resid_Y1, V1)
m2 = SingleEffectModel(BayesianMultivariateRegression)$new(ncol(X), resid_Y1, V1)

Loading mvsusieR

Loading required package: mashr

Loading required package: ashr

The first iteration¶

d$add_to_residual(m1$predict(d))
m1$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m1$predict(d))
d$add_to_residual(m2$predict(d))
m2$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m2$predict(d))

fitted = d$compute_Xb(m1$posterior_b1 + m2$posterior_b1)
d$compute_residual(fitted)

Now take a look at position 129, whose signal is captured by L2 with PIP > 0.9, to check the 2nd moment estimate and prior variance scalar,

m2$pip[129]

sum(diag(tcrossprod(m2$posterior_b1[129,])))/R

m2$prior_variance

The 2nd iteration¶

d$add_to_residual(m1$predict(d))
m1$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m1$predict(d))
d$add_to_residual(m2$predict(d))
m2$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m2$predict(d))

m2$pip[129]

sum(diag(tcrossprod(m2$posterior_b1[129,])))/R

m2$prior_variance

The 3rd iteration¶

d$add_to_residual(m1$predict(d))
m1$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m1$predict(d))
d$add_to_residual(m2$predict(d))
m2$fit(d, estimate_prior_variance_method = 'optim', save_var = TRUE)
d$remove_from_residual(m2$predict(d))

sum(diag(tcrossprod(m2$posterior_b1[129,])))/R

m2$prior_variance

It seems to be close to convergence. The estimate from this model is different from the simulated truth,

m2$posterior_b1[129,]

true_coef[129,]

I think the disagreement is understandable because the "all shared" prior model is very much mis-specified for these true coefficients.

Notice previous notebook I made a mistake using the coefficient based on rescaled data,

wrong_b1 = d$rescale_coef(m2$posterior_b1)[130,]

sum(diag(tcrossprod(wrong_b1)))/R

This is inconsistent with estimated prior scalar because of the scale issue.