Multivariate Bayesian variable selection regression

Scaling the prior matrices from empirical Bayes multivariate analysis

Matrices provided from fitting empirical Bayes normal means problem (EBNM) using exchangable standard effects model ($\beta/s | s \sim g(\cdot)$) must be scaled by residual variance and sample size in msSuSiE analysis, when the variables matrix $X$ is standardized.

Example

Load data-set:

In [1]:
library(susieR)
data("N3finemapping")
attach(N3finemapping)
In [2]:
get_sumstat = function(X,Y, standardize=F) {
    if (standardize) X = scale(X)
    ss1 = susieR:::univariate_regression(X, Y[,1])
    ss2 = susieR:::univariate_regression(X, Y[,2])
    bhat = cbind(ss1$betahat, ss2$betahat)
    sbhat = cbind(ss1$sebetahat, ss2$sebetahat)
    return(list(b=bhat,s=sbhat))
}

First, we obtain effect size and their standard errors from univariate analysis,

In [3]:
out = get_sumstat(X,Y)

The prior for effect size, assuming it is obtained by the EBNM, should be at the scale of:

In [4]:
U_EE = cov(out$b)
U_EE
A matrix: 2 × 2 of type dbl
0.39203072-0.04737559
-0.04737559 0.09786247

However, we often perform EBNM under exchangable standardized effects model, resulting in matrices at the scale of:

In [5]:
U_EZ = cov(out$b/out$s)
U_EZ
A matrix: 2 × 2 of type dbl
3.937061-1.262135
-1.262135 3.968066
In [6]:
get_sigma = function(Y) {
    sigma = sapply(1:ncol(Y), function(i) sd(Y[,i], na.rm=T))
    N = sapply(1:ncol(Y), function(i) length(which(!is.na(Y[,1]))))
    sigma/sqrt(N)
}
In [7]:
sigma = get_sigma(Y)

We cannot convert U_EZ back to U_EE for use with mvSuSiE analysis because is no obvious connection between U_EE and U_EZ

In [8]:
t(U_EZ * sigma) * sigma
A matrix: 2 × 2 of type dbl
0.053791019-0.008989528
-0.008989528 0.014733425

When $X$ is standardized,

In [9]:
out.s = get_sumstat(X,Y,T)
In [10]:
U_EE = cov(out.s$b)
U_EE
A matrix: 2 × 2 of type dbl
0.052610656-0.008683745
-0.008683745 0.014293509
In [11]:
U_EZ = cov(out.s$b/out.s$s)
U_EZ
A matrix: 2 × 2 of type dbl
3.937061-1.262135
-1.262135 3.968066

We can recover U_EE from U_EZ via:

In [12]:
t(U_EZ * sigma) * sigma
A matrix: 2 × 2 of type dbl
0.053791019-0.008989528
-0.008989528 0.014733425

Notice that here U_EZ remains the same regardless of $X$ being standardized beforehand.

Conclusion

Our EBNM based prior workflow should be:

  1. Obtain U_EZ from external data using EBNM. It does not matter whether or not the external data is a result of scaled $X$ or not because we use exchangable standardized effect model anyways
  2. In mvSuSiE analysis we have to standardize $X$, then obtain U_EE from U_EZ as outlined above. The U_EZ thus generated will match the scale of the effect size for standardized $X$

Copyright © 2016-2020 Gao Wang et al at Stephens Lab, University of Chicago