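Conditional analysis of VCF files can be performed using GCTA’s COJO routine. The procedure implemented here is to fine-map each region with susieR and then run a conditional analysis on each region given the fine-mapped variants.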
Ultimately, a list of results will be returned where every fine-mapped variant has a regional set of summary data that is conditionally independent of all neighbouring fine-mapped variants.
Setup:
vcffile <- "ieu-a-300.vcf.gz"
ldref <- "/Users/gh13047/repo/mr-base-api/app/ld_files/EUR"
gwasvcf::set_bcftools()
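Before running the pipeline it can help to confirm that the LD reference prefix points at a complete PLINK binary fileset (`.bed`, `.bim`, `.fam`). The following is a minimal sketch in base R; `bfile_complete` is a hypothetical helper, not part of any package used here, and the files it checks are empty placeholders created in a temporary directory.

```r
# Hypothetical helper: TRUE if all three PLINK binary files exist for a prefix.
bfile_complete <- function(prefix) {
  all(file.exists(paste0(prefix, c(".bed", ".bim", ".fam"))))
}

# Demonstrate with empty placeholder files under a unique temporary prefix.
prefix <- tempfile("toy_ref")
file.create(paste0(prefix, c(".bed", ".bim")))
incomplete <- bfile_complete(prefix)  # the .fam file is still missing
file.create(paste0(prefix, ".fam"))
complete <- bfile_complete(prefix)    # all three files are now present
```

In practice the same check could be applied to `ldref` before it is passed as `bfile`.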
Perform susieR pipeline:
out <- susieR_pipeline(
  vcffile=vcffile,
  bfile=ldref,
  plink_bin=genetics.binaRies::get_plink_binary(),
  pop="EUR",
  threads=1,
  L=10,
  estimate_residual_variance=TRUE,
  estimate_prior_variance=TRUE,
  check_R=FALSE,
  z_ld_weight=1/500
)
Each detected region now has a fine-mapped object stored against it. You can inspect them, for example, like this:
summary(out$res[[1]]$susieR)
susieR::susie_plot(out$res[[1]]$susieR, y="PIP")
For each region we can extract the variants with the highest posterior inclusion probability per credible set, e.g.:
out$res[[1]]$susieR$fmset
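To make the idea of "highest posterior inclusion probability per credible set" concrete, here is a small sketch in base R. It assumes a fitted object that exposes `pip` (a named vector) and `sets$cs` (a list of index vectors), which is how susieR stores them; the values below are invented, and `top_variant_per_cs` is a hypothetical helper, not necessarily how `fmset` is computed by the pipeline.

```r
# Toy stand-in for a fitted susieR object; the numbers are made up.
toy_fit <- list(
  pip = c(rs1 = 0.02, rs2 = 0.91, rs3 = 0.40, rs4 = 0.55, rs5 = 0.01),
  sets = list(cs = list(L1 = c(1, 2), L2 = c(3, 4, 5)))
)

# Hypothetical helper: name of the highest-PIP variant in each credible set.
top_variant_per_cs <- function(fit) {
  vapply(fit$sets$cs, function(idx) {
    names(fit$pip)[idx[which.max(fit$pip[idx])]]
  }, character(1))
}

top <- top_variant_per_cs(toy_fit)
print(top)
```

With these toy values the helper picks `rs2` for the first credible set and `rs4` for the second.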
Now we can perform conditional analysis at each region using knowledge of the fine-mapped variants, by passing them to the cojo_cond function.
The result is a list of regions, with a set of conditional summary stats for every fine-mapped variant in that region.
out2 <- cojo_cond(
  vcffile=vcffile,
  bfile=ldref,
  pop="EUR",
  snplist=unlist(sapply(out$res, function(x) x$susieR$fmset))
)
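The `snplist` argument is built by pooling the fine-mapped variants from every region into one vector. The sketch below illustrates that flattening step on a toy list shaped like `out$res` (the variant names are invented). It uses `lapply` rather than `sapply` as a design choice: `sapply` may return a matrix when every region happens to have the same number of variants, whereas `unlist(lapply(...))` always gives a plain vector and silently drops regions with no fine-mapped variants.

```r
# Toy stand-in for out$res: three regions, one with no fine-mapped variants.
toy_res <- list(
  list(susieR = list(fmset = c("rs2", "rs4"))),
  list(susieR = list(fmset = "rs9")),
  list(susieR = list(fmset = NULL))
)

# Pool the per-region fmset entries into a single character vector.
snplist <- unlist(lapply(toy_res, function(x) x$susieR$fmset))
print(snplist)
```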
TODO