This vignette shows multi-tissue fine mapping using the "Sum of Single Effects" (SuSIE) approach.
This version implements a modular version that relies heavily on mashr
. In order to keep track of various quantities of interest and try multiple alternatives, the prototype is done as a DSC problem in src/model_dsc
.
%cd ../src/model_dsc
./model.dsc -h
For the time being:
get_data
load gene specific data from GTExget_Y
either simulate expression data Y
, or use the originalinit_model
perform preliminary mash
analysis to obtain the mixture priorfit
fit the m&m model: maxL
is max number of single effects, maxI
is number of iterationsdiagnose
compute ELBO etcAs a first pass I extracted data for FMO2 on Thyroid and Lung. I use a maximum of 5 effects and 10 iterations of variational updates:
./model.dsc
Scripts for the DSC can be found here. Specifically look at the fit
module for m&m core computations.
res = readRDS('mnm_model/fit_mnm/get_data_1_original_Y_1_init_mnm_1_fit_mnm_1.rds')
dat = readRDS('mnm_model/init_mnm/get_data_1_original_Y_1_init_mnm_1.rds')
r2 = dat$data$r2
We focus on inferred posterior mean. We plot this quantity in both Thyroid and Lung tissues, annotated by local false sign rate (lfsr) and LD structure.
post_mean = res$posterior$PosteriorMean
post_cov = res$posterior$PosteriorCov
lfsr = res$posterior$lfsr
lfdr = res$posterior$lfdr
n_in_CI = res$posterior$n_in_CI
in_CI = res$posterior$in_CI
%get post_mean post_cov lfsr lfdr r2 n_in_CI in_CI --from R
import seaborn as sns
import matplotlib.pyplot as plt
def plot_beta(yaxis, zaxis, ld, ci, conf):
xaxis = [x+1 for x in range(len(yaxis))]
cmap = sns.cubehelix_palette(start=2.8, rot=.1, as_cmap=True)
f, ax = plt.subplots(figsize=(18,5))
points = ax.scatter(xaxis, yaxis, c=zaxis, cmap=cmap)
f.colorbar(points, label=conf['zlabel'])
if 'CIMax' in conf:
for idx in range(ci[1].shape[0]):
if ci[0][idx] < conf['CIMax']:
for ii, xx in enumerate(yaxis):
if ci[1][idx, ii] > 0:
ax.scatter(xaxis[ii], yaxis[ii], s=80,
facecolors='none', edgecolors='#D3D3D3')
if 'pip_cutoff' in conf:
for idx, item in enumerate(zaxis):
if item > conf['pip_cutoff']:
ax.scatter(xaxis[idx], yaxis[idx], s=80,
facecolors='none', edgecolors='r')
for ii, xx in enumerate(ld[idx,:]):
if xx > conf['ld_cutoff1'] and xx < 1.0:
ax.scatter(xaxis[ii], yaxis[ii],
color='y', marker='+')
for ii, xx in enumerate(ld[idx,:]):
if xx > conf['ld_cutoff2'] and xx < 1.0:
ax.scatter(xaxis[ii], yaxis[ii],
color='g', marker='x')
ax.set_title(conf['title'])
ax.set_ylabel(conf['ylabel'])
plt.gca()
plt.show()
Top signals ($lfsr < 0.05$) are circled in red, with SNPs in LD with it ($r^2>0.1$) colored in yellow, ($r^2>0.6$) colored in green.
conf = {'title': 'FMO2, Thyroid, M&M ASH',
'ylabel': 'betahat',
'zlabel': 'PIP (1 - lfsr)',
'pip_cutoff': 0.95,
'ld_cutoff1': 0.1,
'ld_cutoff2': 0.6,
'CIMax': 50}
plot_beta(post_mean[:,0], 1 - lfsr[:,0], r2, (n_in_CI, in_CI), conf)
conf = {'title': 'FMO2, Lung, M&M ASH',
'ylabel': 'betahat',
'zlabel': 'PIP (1 - lfsr)',
'pip_cutoff': 0.95,
'ld_cutoff1': 0.1,
'ld_cutoff2': 0.6,
'CIMax': 50}
plot_beta(post_mean[:,1], 1 - lfsr[:,1], r2, (n_in_CI, in_CI), conf)