Multivariate Bayesian variable selection regression

Investigating sQTL analysis results of interest: a deeper look

From this notebook we've got some examples, particularly for the 4 QTL case where the smallest p-value has small PIP, and it seems to be driven by adjunct signal clusters. We want to see if SuSiE's results makes more sense, specifically if it falls in splice sites.

In [1]:
%cd /project/compbio/jointLCLs/results/SuSiE/fastqtl_qqnorm_ASintron_RNAseqGeuvadis_YangVCF_100Kb
/project/compbio/jointLCLs/results/SuSiE/fastqtl_qqnorm_ASintron_RNAseqGeuvadis_YangVCF_100Kb

4 QTL case

Notice that below is the same intron cluster as the 5 QTL case above. It has perfectly correlated signals. The top z-score has PIP zero.

In [2]:
%preview SuSiE_CS_4/AS_chr12_27542229_27543032_clu_562.png
%preview SuSiE_CS_4/AS_chr12_27542229_27543032_clu_562.png
> SuSiE_CS_4/AS_chr12_27542229_27543032_clu_562.png (180.2 KiB):

Load data and get positions of interest

Load SuSiE results

In [3]:
dat = readRDS('SuSiE_CS_4/AS_chr12_27542229_27543032_clu_562.rds')
In [4]:
names(dat)
  1. 'input'
  2. 'idx'
  3. 'fitted'
  4. 'sets'
  5. 'pip'

Load input data

In [12]:
print(dat$input)
input = readRDS(dat$input)
[1] "/project/compbio/jointLCLs/results/SuSiE/fastqtl_qqnorm_ASintron_RNAseqGeuvadis_YangVCF_100Kb/chr12/chr12_27542229_27543041.rds"
In [14]:
sapply(1:length(input), function(i) input[[i]]$uname)
  1. 'AS_chr12_27542229_27543029_clu_562'
  2. 'AS_chr12_27542229_27543032_clu_562'
  3. 'AS_chr12_27542229_27543041_clu_562'

So it is the 2nd data-set in the input intron cluster that we are interested in.

In [15]:
input = input[[2]]
input$uname
'AS_chr12_27542229_27543032_clu_562'

Now let's identify the SNP having the largest z-score,

In [18]:
which.max(abs(input$z_score))
rs55880072_chr12_27612517_A_C: 346

and the ones in SuSiE CS:

In [25]:
dat$sets$cs
$L2
166
$L4
344
$L5
  1. 202
  2. 207
  3. 272
  4. 284
  5. 288
  6. 302
  7. 316
  8. 317
$L3
  1. 159
  2. 163
  3. 165
  4. 203
  5. 204
  6. 205
  7. 212
In [23]:
interested = as.vector(unlist(dat$sets$cs))
interested
  1. 166
  2. 344
  3. 202
  4. 207
  5. 272
  6. 284
  7. 288
  8. 302
  9. 316
  10. 317
  11. 159
  12. 163
  13. 165
  14. 203
  15. 204
  16. 205
  17. 212
In [24]:
names(input$z_score)[interested]
  1. 'rs76733902_chr12_27542629_T_A'
  2. 'rs11049018_chr12_27612107_A_G'
  3. 'rs149761989_chr12_27559962_A_G'
  4. 'rs145210967_chr12_27561492_A_G'
  5. 'rs10842915_chr12_27591756_T_C'
  6. 'rs12306687_chr12_27597822_G_A'
  7. 'rs150330389_chr12_27598897_C_T'
  8. 'rs11049014_chr12_27602813_G_A'
  9. 'rs113091867_chr12_27606290_G_A'
  10. 'rs112473472_chr12_27606682_G_A'
  11. 'rs76926063_chr12_27536489_C_G'
  12. 'rs77927877_chr12_27540821_T_C'
  13. 'rs113228767_chr12_27542493_A_G'
  14. 'rs147878198_chr12_27560041_C_G'
  15. 'rs145679811_chr12_27560707_G_T'
  16. 'rs147715803_chr12_27561169_A_C'
  17. 'rs111384326_chr12_27564097_A_G'

UCSC query

Now let's focus on the one SuSiE SNP (rs11049018_chr12_27612107_A_G) nearest to the top z-score (rs55880072_chr12_27612517_A_C), . PIP of this SNP is 1. We check in UCSC browser chr12:27,612,000-27,612,600 that includes both this and the top SNP.

In [26]:
%preview ~/tmp/arntl2.png
%preview ~/tmp/arntl2.png
> /home/gaow/tmp/arntl2.png (31.7 KiB):

Hard to argue about anything interesting.

5 QTL case

In [27]:
%preview SuSiE_CS_5/AS_chr12_27542229_27543041_clu_562.png
%preview SuSiE_CS_5/AS_chr12_27542229_27543041_clu_562.png
> SuSiE_CS_5/AS_chr12_27542229_27543041_clu_562.png (196.3 KiB):

Load data and get positions of interest

In [28]:
dat = readRDS('SuSiE_CS_5/AS_chr12_27542229_27543041_clu_562.rds')
input = readRDS(dat$input)
sapply(1:length(input), function(i) input[[i]]$uname)
  1. 'AS_chr12_27542229_27543029_clu_562'
  2. 'AS_chr12_27542229_27543032_clu_562'
  3. 'AS_chr12_27542229_27543041_clu_562'
In [29]:
input = input[[3]]
which.max(abs(input$z_score))
rs55880072_chr12_27612517_A_C: 346
In [31]:
dat$sets$cs
$L2
166
$L4
357
$L3
  1. 159
  2. 163
  3. 165
  4. 203
  5. 204
  6. 205
  7. 212
$L5
  1. 273
  2. 287
$L1
  1. 346
  2. 364
  3. 371
  4. 372
In [30]:
interested = as.vector(unlist(dat$sets$cs))
interested
  1. 166
  2. 357
  3. 159
  4. 163
  5. 165
  6. 203
  7. 204
  8. 205
  9. 212
  10. 273
  11. 287
  12. 346
  13. 364
  14. 371
  15. 372
In [32]:
names(input$z_score)[interested]
  1. 'rs76733902_chr12_27542629_T_A'
  2. 'rs67786386_chr12_27616181_C_A'
  3. 'rs76926063_chr12_27536489_C_G'
  4. 'rs77927877_chr12_27540821_T_C'
  5. 'rs113228767_chr12_27542493_A_G'
  6. 'rs147878198_chr12_27560041_C_G'
  7. 'rs145679811_chr12_27560707_G_T'
  8. 'rs147715803_chr12_27561169_A_C'
  9. 'rs111384326_chr12_27564097_A_G'
  10. 'rs73298732_chr12_27592252_G_A'
  11. 'rs113424189_chr12_27598865_C_T'
  12. 'rs55880072_chr12_27612517_A_C'
  13. 'rs73082065_chr12_27619827_A_G'
  14. 'rs10506020_chr12_27621942_C_T'
  15. 'rs10506021_chr12_27622549_C_G'

We focus on rs67786386_chr12_27616181_C_A vs rs55880072_chr12_27612517_A_C (same as before). We query chr12:27612517-27616381

USCS query

In [33]:
%preview ~/tmp/arntl2_right.png
%preview ~/tmp/arntl2_right.png
> /home/gaow/tmp/arntl2_right.png (26.7 KiB):

It does fall into DNase I cluster, but nothing interesting beyond this point.

In [ ]:
[global]
parameter: fn=path()

[default]
input: fn
R: expand = True
	dat = readRDS({_input:r})
	input = readRDS(dat$input)
	idx = which(sapply(1:length(input), function(i) input[[i]]$uname) == {_input:bnr})
	input = input[[idx]]
	znames = names(input$z_score)
    zidx = which.max(abs(input$z_score))
    print(zidx)
	print(znames[zidx])
	interested = as.vector(unlist(dat$sets$cs))
	print(dat$sets$cs)
	print(znames[interested])

2 QTL case

In [34]:
%preview SuSiE_CS_2/AS_chr19_50837358_50838086_clu_8171.png
%preview SuSiE_CS_2/AS_chr19_50837358_50838086_clu_8171.png
> SuSiE_CS_2/AS_chr19_50837358_50838086_clu_8171.png (202.1 KiB):
In [46]:
%sosrun --fn SuSiE_CS_2/AS_chr19_50837358_50838086_clu_8171.rds
rs3219404_chr19_50911461_G_A 
                         737 
[1] "rs3219404_chr19_50911461_G_A"
$L2
[1] 426 434 492 530

$L1
 [1] 372 373 374 375 376 377 380 381 382 386 398 400 414

 [1] "rs10408648_chr19_50856940_G_A"  "rs11881647_chr19_50857469_A_G" 
 [3] "rs115115058_chr19_50874717_G_A" "rs2695120_chr19_50880536_T_C"  
 [5] "rs140533621_chr19_50844210_C_T" "rs144378176_chr19_50844211_A_G"
 [7] "rs147907106_chr19_50844214_C_T" "rs613694_chr19_50844219_C_G"   
 [9] "rs614567_chr19_50844397_C_A"    "rs607307_chr19_50845021_G_A"   
[11] "rs137953380_chr19_50845812_C_A" "rs591161_chr19_50846281_A_G"   
[13] "rs590653_chr19_50846431_A_G"    "rs2546567_chr19_50848093_G_A"  
[15] "rs627400_chr19_50850761_A_G"    "rs625672_chr19_50851155_C_T"   
[17] "rs679288_chr19_50854158_C_T"   
INFO: Workflow default (ID=0d184be397c60cb8) is executed successfully with 1 completed step.
In [35]:
%preview SuSiE_CS_2/AS_chr19_54704756_54705028_clu_8215.png
%preview SuSiE_CS_2/AS_chr19_54704756_54705028_clu_8215.png
> SuSiE_CS_2/AS_chr19_54704756_54705028_clu_8215.png (174.2 KiB):
In [47]:
%sosrun --fn SuSiE_CS_2/AS_chr19_54704756_54705028_clu_8215.rds
rs3852890_chr19_54709944_G_A 
                         326 
[1] "rs3852890_chr19_54709944_G_A"
$L2
[1] 312

$L1
[1] 320 322 326

[1] "rs60811961_chr19_54704775_C_G" "rs3852889_chr19_54708430_A_G" 
[3] "rs10424181_chr19_54709277_C_A" "rs3852890_chr19_54709944_G_A" 
INFO: Workflow default (ID=8828483c669a7f80) is executed successfully with 1 completed step.
In [48]:
%preview ~/tmp/rps9.png
%preview ~/tmp/rps9.png
> /home/gaow/tmp/rps9.png (42.6 KiB):

SuSiE signal falls in the "Layered H3K27AC" peak.

In [36]:
%preview SuSiE_CS_2/AS_chr20_20007563_20013746_clu_9197.png
%preview SuSiE_CS_2/AS_chr20_20007563_20013746_clu_9197.png
> SuSiE_CS_2/AS_chr20_20007563_20013746_clu_9197.png (189.0 KiB):
In [37]:
%preview SuSiE_CS_2/AS_chr20_37055833_37059684_clu_9365.png
%preview SuSiE_CS_2/AS_chr20_37055833_37059684_clu_9365.png
> SuSiE_CS_2/AS_chr20_37055833_37059684_clu_9365.png (171.0 KiB):
In [50]:
%sosrun --fn SuSiE_CS_2/AS_chr20_37055833_37059684_clu_9365.rds
rs62203382_chr20_37071007_A_G 
                          386 
[1] "rs62203382_chr20_37071007_A_G"
$L1
[1] 401

$L2
[1] 504 518

[1] "rs73098016_chr20_37083100_T_A"  "rs59033379_chr20_37133858_G_A" 
[3] "rs113883801_chr20_37143769_C_T"
INFO: Workflow default (ID=c5661f95d9c07b4a) is executed successfully with 1 completed step.
In [38]:
%preview SuSiE_CS_2/AS_chr20_57570814_57572658_clu_9568.png
%preview SuSiE_CS_2/AS_chr20_57570814_57572658_clu_9568.png
> SuSiE_CS_2/AS_chr20_57570814_57572658_clu_9568.png (183.6 KiB):
In [51]:
%sosrun --fn SuSiE_CS_2/AS_chr20_57570814_57572658_clu_9568.rds
rs151362_chr20_57554311_C_A 
                        171 
[1] "rs151362_chr20_57554311_C_A"
$L3
[1] 233 236 243

$L2
[1] 189 219 227 252

[1] "rs2295357_chr20_57572839_G_T"   "rs139145965_chr20_57573596_C_T"
[3] "rs6070690_chr20_57574816_G_C"   "rs163779_chr20_57558274_T_C"   
[5] "rs184673_chr20_57567543_A_G"    "rs13720_chr20_57570568_G_A"    
[7] "rs191490_chr20_57576180_A_G"   
INFO: Workflow default (ID=622a58859ee229ac) is executed successfully with 1 completed step.
In [52]:
%preview ~/tmp/ctsz.png
%preview ~/tmp/ctsz.png
> /home/gaow/tmp/ctsz.png (56.1 KiB):
In [53]:
%preview ~/tmp/ak310046.png
%preview ~/tmp/ak310046.png
> /home/gaow/tmp/ak310046.png (34.7 KiB):
In [39]:
%preview SuSiE_CS_2/AS_chr22_30822832_30823227_clu_8715.png
%preview SuSiE_CS_2/AS_chr22_30822832_30823227_clu_8715.png
> SuSiE_CS_2/AS_chr22_30822832_30823227_clu_8715.png (224.4 KiB):
In [54]:
%sosrun --fn SuSiE_CS_2/AS_chr22_30822832_30823227_clu_8715.rds
rs7292432_chr22_30796835_C_G 
                         161 
[1] "rs7292432_chr22_30796835_C_G"
$L2
 [1]  29  30  33  57 118 177 196 276 280 390 394

$L1
[1] 156 168 175 182 197 201

 [1] "rs75265757_chr22_30739174_G_A"  "rs114214629_chr22_30740914_C_G"
 [3] "rs116485157_chr22_30742894_A_G" "rs114429377_chr22_30753931_G_A"
 [5] "rs114302396_chr22_30775694_G_A" "rs116438221_chr22_30808588_G_A"
 [7] "rs78474653_chr22_30823133_G_A"  "rs116255130_chr22_30854639_G_A"
 [9] "rs78467477_chr22_30855806_C_T"  "rs11702947_chr22_30899264_T_G" 
[11] "rs11704977_chr22_30900128_C_T"  "rs2412985_chr22_30792505_C_T"  
[13] "rs4820845_chr22_30800338_T_G"   "rs2299824_chr22_30807524_T_C"  
[15] "rs4820011_chr22_30810954_A_C"   "rs5753130_chr22_30823196_T_C"  
[17] "rs1860215_chr22_30826800_G_A"  
INFO: Workflow default (ID=d782cd38111eb59e) is executed successfully with 1 completed step.

30823133 and 30823196 both suggests one exon, in two different CS. There are also a number of other seemly interesting hits.

In [55]:
%preview ~/tmp/22-1.png
%preview ~/tmp/22-1.png
> /home/gaow/tmp/22-1.png (9.9 KiB):
In [57]:
%preview ~/tmp/22-2.png
%preview ~/tmp/22-2.png
> /home/gaow/tmp/22-2.png (10.9 KiB):
In [58]:
%preview ~/tmp/22-3.png
%preview ~/tmp/22-3.png
> /home/gaow/tmp/22-3.png (16.3 KiB):
In [59]:
%preview ~/tmp/22-4.png
%preview ~/tmp/22-4.png
> /home/gaow/tmp/22-4.png (18.4 KiB):
In [40]:
%preview SuSiE_CS_2/AS_chr5_130662593_130694902_clu_11986.png
%preview SuSiE_CS_2/AS_chr5_130662593_130694902_clu_11986.png
> SuSiE_CS_2/AS_chr5_130662593_130694902_clu_11986.png (180.2 KiB):
In [60]:
%sosrun --fn SuSiE_CS_2/AS_chr5_130662593_130694902_clu_11986.rds
rs6876350_chr5_130604109_G_T 
                          73 
[1] "rs6876350_chr5_130604109_G_T"
$L1
[1] 115

$L2
[1] 140 155 157 158 197 221 233

[1] "rs6596010_chr5_130625411_G_A" "rs6864145_chr5_130637861_C_T"
[3] "rs6421901_chr5_130648484_A_G" "rs2551036_chr5_130648994_G_A"
[5] "rs9327608_chr5_130650134_G_A" "rs193531_chr5_130674706_C_T" 
[7] "rs1088581_chr5_130682622_G_C" "rs798410_chr5_130684761_A_G" 
INFO: Workflow default (ID=5c33259ef70ecd5d) is executed successfully with 1 completed step.

Just introns, nothing exciting.

In [41]:
%preview SuSiE_CS_2/AS_chr6_29912868_29913011_clu_10776.png
%preview SuSiE_CS_2/AS_chr6_29912868_29913011_clu_10776.png
> SuSiE_CS_2/AS_chr6_29912868_29913011_clu_10776.png (240.1 KiB):
In [61]:
%sosrun --fn SuSiE_CS_2/AS_chr6_29912868_29913011_clu_10776.rds
rs148724924_chr6_29921312_C_G 
                         2607 
[1] "rs148724924_chr6_29921312_C_G"
$L1
 [1]  59  65 121 133 192 196 207 236 253 278 279 283 286 287 288 291 293 294 296
[20] 322 330 333 356 427 437 441 447 449 456 491 511 517 518 544 555 558 566

$L2
 [1] 1747 1765 1770 1794 1813 1816 1833 1834 1847 1849

 [1] "rs116527390_chr6_29817617_A_G" "rs115136886_chr6_29817938_G_A"
 [3] "rs116726743_chr6_29820655_G_T" "rs114182494_chr6_29821370_A_G"
 [5] "rs116357313_chr6_29823998_C_A" "rs114934384_chr6_29824108_T_C"
 [7] "rs116671664_chr6_29824515_G_T" "rs115482696_chr6_29825883_G_A"
 [9] "rs141143417_chr6_29827079_T_A" "rs111597677_chr6_29829251_A_C"
[11] "rs114827325_chr6_29829253_T_C" "rs115133277_chr6_29829507_T_A"
[13] "rs115550313_chr6_29829636_G_T" "rs116799141_chr6_29829813_C_T"
[15] "rs114238229_chr6_29829851_C_T" "rs116629017_chr6_29830045_A_C"
[17] "rs116311746_chr6_29830130_T_C" "rs115735410_chr6_29830213_C_T"
[19] "rs115663981_chr6_29830551_A_G" "rs114018395_chr6_29832119_T_C"
[21] "rs148128960_chr6_29832282_G_A" "rs114732676_chr6_29832407_A_T"
[23] "rs114065755_chr6_29833642_G_A" "rs113452471_chr6_29837763_A_T"
[25] "rs112740109_chr6_29838179_C_A" "rs113143254_chr6_29838335_A_G"
[27] "rs113261027_chr6_29838429_G_C" "rs78610452_chr6_29838461_G_A" 
[29] "rs111555693_chr6_29838650_A_G" "rs140081128_chr6_29839597_G_T"
[31] "rs113479709_chr6_29840810_T_C" "rs113865304_chr6_29841097_T_A"
[33] "rs113581598_chr6_29841103_A_G" "rs113428950_chr6_29842259_A_G"
[35] "rs74584731_chr6_29842573_T_C"  "rs113279442_chr6_29842697_A_C"
[37] "rs111336993_chr6_29843076_G_A" "rs114779893_chr6_29902373_A_G"
[39] "rs113496033_chr6_29902875_T_C" "rs78398725_chr6_29902949_A_C" 
[41] "rs1611338_chr6_29903675_A_C"   "rs112890311_chr6_29904131_A_G"
[43] "rs142543116_chr6_29904180_G_A" "rs114134708_chr6_29904505_T_C"
[45] "rs115656229_chr6_29904528_C_A" "rs114428469_chr6_29905101_G_T"
[47] "rs112863323_chr6_29905121_G_A"
INFO: Workflow default (ID=b29b53e39c494ec4) is executed successfully with 1 completed step.
In [43]:
%preview SuSiE_CS_2/AS_chr7_43690311_43695632_clu_10066.png
%preview SuSiE_CS_2/AS_chr7_43690311_43695632_clu_10066.png
> SuSiE_CS_2/AS_chr7_43690311_43695632_clu_10066.png (198.9 KiB):
In [62]:
%sosrun --fn SuSiE_CS_2/AS_chr7_43690311_43695632_clu_10066.rds
rs73108662_chr7_43710433_C_A 
                         241 
[1] "rs73108662_chr7_43710433_C_A"
$L2
[1] 113 122 125 135 137 138 141 144 147

$L3
 [1]  83  84  85  86  90  95 115 116 123 126 127 128 129 130 131 132 136

 [1] "rs11982135_chr7_43636793_G_A"  "rs113544863_chr7_43640420_C_T"
 [3] "rs75881624_chr7_43641750_C_T"  "rs66487644_chr7_43645292_C_A" 
 [5] "rs36113818_chr7_43648723_G_A"  "rs10282065_chr7_43649248_G_A" 
 [7] "rs10242308_chr7_43649889_T_A"  "rs10256980_chr7_43650744_A_G" 
 [9] "rs10231343_chr7_43651155_G_C"  "rs10263267_chr7_43625024_T_G" 
[11] "rs2330875_chr7_43625112_T_A"   "rs10277819_chr7_43625507_A_G" 
[13] "rs10248703_chr7_43625715_C_T"  "rs2330877_chr7_43626911_G_A"  
[15] "rs55673935_chr7_43629282_T_C"  "rs10232453_chr7_43637139_T_C" 
[17] "rs10232580_chr7_43637228_T_C"  "rs11771022_chr7_43641111_C_T" 
[19] "rs12531841_chr7_43642005_T_C"  "rs4724231_chr7_43642214_T_C"  
[21] "rs4724232_chr7_43642414_C_G"   "rs12532021_chr7_43642777_T_C" 
[23] "rs10951735_chr7_43642924_C_T"  "rs10951736_chr7_43643015_A_G" 
[25] "rs10236108_chr7_43643606_T_C"  "rs9655444_chr7_43646273_T_A"  
INFO: Workflow default (ID=c38f1b69cdd4a2ca) is executed successfully with 1 completed step.

SuSiE signal certainly closer to exons. rs73108662_chr7_43710433_C_A has nothing.

In [63]:
%preview ~/tmp/7-1.png
%preview ~/tmp/7-1.png
> /home/gaow/tmp/7-1.png (38.6 KiB):
In [64]:
%preview ~/tmp/7-2.png
%preview ~/tmp/7-2.png
> /home/gaow/tmp/7-2.png (23.5 KiB):
In [65]:
%preview ~/tmp/7-3.png
%preview ~/tmp/7-3.png
> /home/gaow/tmp/7-3.png (19.9 KiB):

Copyright © 2016-2020 Gao Wang et al at Stephens Lab, University of Chicago