Multivariate Bayesian variable selection regression

Investigating the behavior of lfsr for condition-specific CS

Here I investigate the behavior of the local false sign rate (lfsr) of each credible set (CS) per condition, using the singleton simulation setting.

Load data

In [1]:
Y = dscrutils:::read_dsc('small_data_1_singleton_1.pkl')
In [2]:
names(Y)
  1. 'Y'
  2. 'R'
  3. 'J'
  4. 'meta'
  5. 'DSC_DEBUG'
In [3]:
X = readRDS('small_data_1.rds')
In [4]:
names(X)
  1. 'X'
  2. 'Y'
  3. 'N'
  4. 'meta'
  5. 'DSC_DEBUG'
In [5]:
prior = dscrutils:::read_dsc('oracle_generator_1.pkl')$configurations$mixture_1
names(prior)
  1. 'xUlist'
  2. 'pi'
  3. 'null_weight'

Set up the M&M model

In [6]:
m_init = mvsusieR:::MashInitializer$new(NA, NA, xUlist=prior$xUlist, prior_weights=prior$pi, null_weight=prior$null_weight, top_mixtures=0, alpha=0)
result = mvsusieR::mvsusie(X$X, Y$Y, L=10, prior_variance=m_init)
In [7]:
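# Variables with a nonzero effect in at least one condition; counting nonzero
# entries avoids missing effects of opposite sign that would sum to zero
true_idx = which(rowSums(Y$meta$true_coef != 0) > 0)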
Y$meta$true_coef[true_idx,]
  1. -0.452527792535091
  2. 0
  3. 0
  4. 0
  5. 0
In [8]:
true_idx
136
In [9]:
names(result)
  1. 'alpha'
  2. 'b1'
  3. 'b2'
  4. 'KL'
  5. 'lbf'
  6. 'V'
  7. 'sigma2'
  8. 'elbo'
  9. 'niter'
  10. 'convergence'
  11. 'coef'
  12. 'null_index'
  13. 'mixture_weights'
  14. 'lfsr'
  15. 'fitted'
  16. 'intercept'
  17. 'walltime'
  18. 'sets'
  19. 'pip'
In [10]:
dim(result$mixture_weights)
  1. 10
  2. 1001
  3. 41
In [11]:
result$pip[true_idx]
0.261959378011549
In [12]:
result$sets
$cs
$L1 =
  1. 63
  2. 119
  3. 128
  4. 130
  5. 136
$purity
A df[,3]: 1 × 3
    min.abs.corr  mean.abs.corr  median.abs.corr
    <dbl>         <dbl>          <dbl>
L1  0.9685025     0.9850541      0.9828454
$cs_index
1
$coverage
0.95
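The set above is reported at coverage 0.95 together with purity statistics. As a rough illustration of what these mean, the sketch below builds a credible set from one row of posterior inclusion probabilities by keeping the highest-probability variables until their cumulative probability reaches the coverage, then computes purity as the minimum absolute correlation within the set. The helper `cs_from_alpha` and all toy inputs are hypothetical; this follows the usual SuSiE-style construction and is not taken from the mvsusieR source.

```r
# Hypothetical helper: smallest set of variables whose inclusion
# probabilities (one row of alpha) add up to at least `coverage`.
cs_from_alpha <- function(alpha_row, coverage = 0.95) {
  ord <- order(alpha_row, decreasing = TRUE)               # most probable first
  n_keep <- which(cumsum(alpha_row[ord]) >= coverage)[1]   # first prefix reaching coverage
  sort(ord[seq_len(n_keep)])
}

# Toy inclusion probabilities for 5 variables (made up; they sum to 1)
alpha_row <- c(0.01, 0.40, 0.02, 0.35, 0.22)
cs_toy <- cs_from_alpha(alpha_row)
print(cs_toy)

# Purity on toy genotypes: minimum absolute pairwise correlation in the set
set.seed(1)
X_toy <- matrix(rnorm(50 * 5), nrow = 50, ncol = 5)
purity_toy <- min(abs(cor(X_toy[, cs_toy])))
print(purity_toy)
```

In the real output, the minimum absolute correlation of about 0.97 says the five variables in the set are nearly interchangeable, which is consistent with the true variable 136 having a PIP of only about 0.26.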
In [13]:
pip = mvsusieR:::mvsusie_get_pip_per_condition(result, m_init)
In [14]:
pip[true_idx,]
  1. 0.258636973677318
  2. 0.0117376627912459
  3. 0.0107837406802076
  4. 0.0107549548100089
  5. 0.0113016905399297

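This looks fine: the per-condition PIP of the true variable is about 0.26 in condition 1, the only condition with a nonzero effect, and about 0.01 in the other four conditions.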

In [15]:
result$alpha
A matrix: 10 × 1001 of type dbl[,1001]
7.067327e-16 7.067327e-16 7.214058e-16 7.283524e-16 7.294216e-16 7.067327e-16 7.278109e-16 7.403302e-16 7.163583e-16 7.117373e-16 1.114710e-15 1.054611e-15 5.451146e-16 8.089631e-16 5.909681e-16 6.660415e-16 5.902251e-16 7.841971e-16 2.079720e-15 1.147548e-15
8.456368e-04 8.456368e-04 9.080680e-04 9.187946e-04 8.788154e-04 8.456368e-04 9.195650e-04 9.326527e-04 8.517949e-04 8.945232e-04 1.084412e-03 1.046232e-03 6.943564e-04 7.913427e-04 7.265263e-04 7.943049e-04 7.235550e-04 9.664881e-04 1.975785e-03 1.267444e-03
8.456994e-04 8.456994e-04 9.081379e-04 9.188656e-04 8.788871e-04 8.456994e-04 9.196359e-04 9.327254e-04 8.518587e-04 8.945917e-04 1.084476e-03 1.046295e-03 6.944051e-04 7.913989e-04 7.265771e-04 7.943647e-04 7.236057e-04 9.665675e-04 1.975903e-03 1.267570e-03
8.457743e-04 8.457743e-04 9.082277e-04 9.189569e-04 8.789760e-04 8.457743e-04 9.197273e-04 9.328190e-04 8.519354e-04 8.946797e-04 1.084557e-03 1.046375e-03 6.944659e-04 7.914684e-04 7.266398e-04 7.944373e-04 7.236681e-04 9.666605e-04 1.976039e-03 1.267726e-03
8.458384e-04 8.458384e-04 9.083083e-04 9.190390e-04 8.790538e-04 8.458384e-04 9.198095e-04 9.329032e-04 8.520012e-04 8.947587e-04 1.084629e-03 1.046445e-03 6.945193e-04 7.915290e-04 7.266944e-04 7.945000e-04 7.237225e-04 9.667390e-04 1.976150e-03 1.267861e-03
8.458780e-04 8.458780e-04 9.083618e-04 9.190935e-04 8.791035e-04 8.458780e-04 9.198641e-04 9.329592e-04 8.520421e-04 8.948111e-04 1.084675e-03 1.046491e-03 6.945536e-04 7.915677e-04 7.267291e-04 7.945395e-04 7.237571e-04 9.667867e-04 1.976215e-03 1.267948e-03
8.458891e-04 8.458891e-04 9.083814e-04 9.191135e-04 8.791193e-04 8.458891e-04 9.198843e-04 9.329798e-04 8.520539e-04 8.948303e-04 1.084691e-03 1.046506e-03 6.945648e-04 7.915798e-04 7.267400e-04 7.945513e-04 7.237679e-04 9.667989e-04 1.976228e-03 1.267976e-03
8.458755e-04 8.458755e-04 9.083702e-04 9.191022e-04 8.791052e-04 8.458755e-04 9.198731e-04 9.329684e-04 8.520403e-04 8.948193e-04 1.084679e-03 1.046494e-03 6.945554e-04 7.915686e-04 7.267299e-04 7.945389e-04 7.237578e-04 9.667808e-04 1.976197e-03 1.267952e-03
8.458468e-04 8.458468e-04 9.083389e-04 9.190704e-04 8.790722e-04 8.458468e-04 9.198413e-04 9.329358e-04 8.520111e-04 8.947886e-04 1.084649e-03 1.046464e-03 6.945329e-04 7.915426e-04 7.267065e-04 7.945115e-04 7.237345e-04 9.667442e-04 1.976142e-03 1.267894e-03
8.458161e-04 8.458161e-04 9.083028e-04 9.190338e-04 8.790361e-04 8.458161e-04 9.198047e-04 9.328983e-04 8.519798e-04 8.947533e-04 1.084616e-03 1.046432e-03 6.945081e-04 7.915142e-04 7.266808e-04 7.944818e-04 7.237089e-04 9.667056e-04 1.976086e-03 1.267831e-03
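Each row of the `alpha` matrix above holds the inclusion probabilities of one single effect across all variables. In SuSiE-type fits these rows are commonly combined into a per-variable PIP as one minus the product, over effects, of one minus the inclusion probability. The toy sketch below shows that combination; `pip_from_alpha` is a hypothetical helper, and I am assuming (not confirming) that `result$pip` is derived along these lines, possibly after dropping effects whose estimated prior variance is zero.

```r
# Toy alpha: 3 single effects (rows) x 4 variables (columns).
# Values are made up for illustration; each row sums to 1.
alpha_toy <- rbind(
  c(0.90, 0.05, 0.03, 0.02),
  c(0.25, 0.25, 0.25, 0.25),
  c(0.10, 0.70, 0.10, 0.10)
)

# Hypothetical helper: PIP_j = 1 - prod over effects l of (1 - alpha[l, j])
pip_from_alpha <- function(alpha) {
  1 - apply(1 - alpha, 2, prod)
}

pip_toy <- pip_from_alpha(alpha_toy)
print(round(pip_toy, 4))
```

Note that a near-uniform row such as the second one still adds a little probability to every variable, which is one reason implementations may exclude effects that the fit has effectively switched off.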
In [16]:
lfsr = mvsusieR::mvsusie_get_cs_lfsr(result)
lfsr
A matrix: 10 × 5 of type dbl[,5]
4.97602e-13 0.9808308 0.9866015 0.9879855 0.9813265
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000
1.00000e+00 1.0000000 1.0000000 1.0000000 1.0000000

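This also looks good: only the first single effect (row 1) has a small lfsr, and only in condition 1 (about 5e-13), the one condition with a nonzero true effect. Its lfsr is above 0.98 in the other four conditions, and the remaining nine effects have lfsr 1 in every condition.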
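For reference, the lfsr of an effect is the posterior probability of getting its sign wrong, that is, the smaller of P(b >= 0) and P(b <= 0). The sketch below evaluates this general definition for a toy posterior made of a point mass at zero plus a single normal component. The function name and the numbers are hypothetical, and this is not how `mvsusie_get_cs_lfsr` computes its result internally; it only illustrates why a confidently signed effect gives an lfsr near 0 and a mostly-null posterior gives an lfsr near 1.

```r
# Hypothetical helper: lfsr for a posterior with weight p_zero on exactly zero
# and weight (1 - p_zero) on a normal with mean mu and standard deviation s.
lfsr_spike_normal <- function(p_zero, mu, s) {
  p_neg <- (1 - p_zero) * pnorm(0, mean = mu, sd = s)                      # P(b < 0)
  p_pos <- (1 - p_zero) * pnorm(0, mean = mu, sd = s, lower.tail = FALSE)  # P(b > 0)
  p_zero + pmin(p_neg, p_pos)  # zero counts against either sign
}

# Confidently negative effect (toy values): lfsr is essentially 0
lfsr_strong <- lfsr_spike_normal(p_zero = 0, mu = -0.45, s = 0.05)
# Posterior almost entirely at zero (toy values): lfsr is close to 1
lfsr_null <- lfsr_spike_normal(p_zero = 0.99, mu = 0, s = 0.05)
print(c(strong = lfsr_strong, null = lfsr_null))
```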

Another dataset

Now let me try another dataset.

In [17]:
Y = dscrutils:::read_dsc('small_data_100_singleton_1.pkl')
X = readRDS('small_data_100.rds')
prior = dscrutils:::read_dsc('oracle_generator_1.pkl')$configurations$mixture_1
m_init = mvsusieR:::MashInitializer$new(NA, NA, xUlist=prior$xUlist, prior_weights=prior$pi, null_weight=prior$null_weight, top_mixtures=0, alpha=0)
result = mvsusieR::mvsusie(X$X, Y$Y, L=10, prior_variance=m_init)
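# Variables with a nonzero effect in at least one condition; counting nonzero
# entries avoids missing effects of opposite sign that would sum to zero
true_idx = which(rowSums(Y$meta$true_coef != 0) > 0)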
Y$meta$true_coef[true_idx,]
  1. 0
  2. 0
  3. 0
  4. 0
  5. -0.200455126526174
In [18]:
pip = mvsusieR:::mvsusie_get_pip_per_condition(result, m_init)
pip[true_idx,]
  1. 0.00365364540002411
  2. 0.00375696607886711
  3. 0.00365410916258646
  4. 0.0036584095367701
  5. 0.178062092788515

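This looks good: the per-condition PIP of the true variable is about 0.18 in condition 5, the only condition with a nonzero effect, and about 0.004 in the other four conditions.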

lfsr extraction

In [19]:
lfsr = mvsusieR::mvsusie_get_cs_lfsr(result)
In [20]:
lfsr
A matrix: 10 × 5 of type dbl[,5]
0.993509 0.9916718 0.9940603 0.9936809 2.268206e-09
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00
1.000000 1.0000000 1.0000000 1.0000000 1.000000e+00

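This also looks good: the first single effect has an lfsr of about 2.3e-09 in condition 5, where the true effect is nonzero, and above 0.99 in conditions 1 to 4. The remaining nine effects have lfsr 1 in every condition.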
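Reading a matrix like this by eye gets tedious with more conditions, so a small sketch of how one might list the (effect, condition) pairs that pass a threshold follows. The first row copies the lfsr values printed above; the 0.05 cutoff and the column labels are my own choices for illustration, not something mvsusieR prescribes.

```r
# lfsr values copied from the output above: row 1 is the first single effect,
# rows 2 to 10 are all 1.
lfsr_shown <- rbind(
  c(0.993509, 0.9916718, 0.9940603, 0.9936809, 2.268206e-09),
  matrix(1, nrow = 9, ncol = 5)
)

# Assumed cutoff for calling an effect confidently signed in a condition
lfsr_cutoff <- 0.05

# Row/column positions of all entries below the cutoff
hits <- which(lfsr_shown < lfsr_cutoff, arr.ind = TRUE)
colnames(hits) <- c("effect", "condition")
print(hits)
```

With the values shown, the only pair returned is effect 1 in condition 5, matching the simulated truth.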


Copyright © 2016-2020 Gao Wang et al at Stephens Lab, University of Chicago