Multivariate Bayesian variable selection regression

Summary

We suspect that for most esitmates of the mixing proportions, we will get (1 - e, ..., e,...), where e is some small number. This write-up is to see if the Dirichlet can get something like this by tweaking the parameters.

Fiddle with parameters

If you have a very high concentration parameter, then most of the time only a few values will be non-zero (i.e. > 10 ^ -5). This is good since the true eps is 10^-5.

In [1]:
n <- 1000
eps <- 10^-2
conc <- 100
true_pi <- c(1 - eps, rep(eps / (n - 1), n - 1))
alpha <- true_pi * conc

nsamp <- 100
dout <- gtools::rdirichlet(n = nsamp, alpha = alpha)

plot(sort(dout[sample(1:nsamp, 1), ], decreasing = TRUE)[2:100], type = "h")
plot(sort(dout[sample(1:nsamp, 1), ], decreasing = TRUE)[2:100], type = "h")
plot(sort(dout[sample(1:nsamp, 1), ], decreasing = TRUE)[2:100], type = "h")
plot(sort(dout[sample(1:nsamp, 1), ], decreasing = TRUE)[2:100], type = "h")


library(sirt)
mle_out <- sirt::dirichlet.mle(x = as.data.frame(dout))

pi_est <- mle_out$alpha / mle_out$alpha0
plot(pi_est[-1], true_pi[-1])
- sirt 3.9-4 (2020-02-17 12:57:09)

Ad-hoc sampleing of alphas

Idea: sample one of the alphas to be the "eps" component and make this one larger than the other alphas. Then sample from a dirichlet.

In [2]:
rm(list = ls())
eps <- 10^-2
n <- 1000
conc <- 100
true_pi <- c(1 - eps, rep(eps / (n - 1), n - 1))

Copyright © 2016-2020 Gao Wang et al at Stephens Lab, University of Chicago