Here I use multivariate prior (not mash mixture) for mvsusie
call, but allow for the scalar of prior to be estimated. That is, I have implemeted estimate_prior_variance
for this class.
library(mvsusieR)
set.seed(2)
dat = mvsusie_sim1(r=5,s=1)
dim(dat$X)
dim(dat$y)
true_pos = as.integer(apply(dat$b, 1, sum) != 0)
sum(true_pos)
As you can see in this simulated data-set there are 5 true effects. Now I set $L=10$. Here the prior used is not oracle prior but just 0.2 times covariance of $Y$.
L=10
start = Sys.time()
res = mvsusieR::mvsusie(dat$X,dat$y,L=L,prior_variance=dat$V,compute_objective=T,estimate_residual_variance=F,estimate_prior_variance=F)
Sys.time() - start
res$elbo
susieR::susie_plot(res,y='PIP', main = 'Default SuSiE plot for cross-condition PIP', xlab = 'SNP positions', add_legend = T, b=true_pos)
start = Sys.time()
res2 = mvsusieR::mvsusie(dat$X,dat$y,L=L,prior_variance=dat$V,compute_objective=T,estimate_residual_variance=F,estimate_prior_variance=T, estimate_prior_method='optim')
Sys.time() - start
res2$elbo
The ELBO is indeed better, but it takes much longer time -- per iteration time is about 50 times compared to when the scalar is not estimated.
The estimated scalars for prior variance are:
res2$V
As expected only the first 5 effects are non-zero.
susieR::susie_plot(res2,y='PIP', main = 'Default SuSiE plot for cross-condition PIP', xlab = 'SNP positions', add_legend = T, b=true_pos)
The PIP plot also appears a lot cleaner.
Here instead of estimating prior scalar I only compare it with setting it to zero:
start = Sys.time()
res2 = mvsusieR::mvsusie(dat$X,dat$y,L=L,prior_variance=dat$V,compute_objective=T,estimate_residual_variance=F,estimate_prior_variance=T, estimate_prior_method='simple')
Sys.time() - start
res2$elbo
res2$V
It's a lot faster but the ELBO is slighly smaller.
start = Sys.time()
res2 = mvsusie(dat$X,dat$y,L=L,prior_variance=dat$V,compute_objective=T,estimate_residual_variance=F,estimate_prior_variance=T, estimate_prior_method='EM')
Sys.time() - start
res2$elbo
res2$V
susieR::susie_plot(res2,y='PIP', main = 'Default SuSiE plot for cross-condition PIP', xlab = 'SNP positions', add_legend = T, b=true_pos)
Almost same result as using optim
method.
First, try out the simple
method in MASH to verify that it agrees with the multivariate prior (although it does, at least as far as our unit test can tell us),
start = Sys.time()
m_init = mvsusieR:::MashInitializer$new(list(dat$V), 1, 1, 0)
res2 = mvsusie(dat$X,dat$y,L=L,prior_variance=m_init,estimate_prior_variance=T,precompute_covariances=T,compute_objective=T,estimate_residual_variance=F, estimate_prior_method='simple')
Sys.time() - start
res2$elbo
res2$V
Now try EM
method and see that agrees with the multivariate prior (although it does, at least as far as our unit test can tell us),
start = Sys.time()
m_init = mvsusieR:::MashInitializer$new(list(dat$V), 1, 1, 0)
res2 = mvsusie(dat$X,dat$y,L=L,prior_variance=m_init,estimate_prior_variance=T,precompute_covariances=F,compute_objective=T,estimate_residual_variance=F, estimate_prior_method='EM')
Sys.time() - start
res2$elbo
res2$V