Comparison of rsid vs chr:pos lookup in ElasticSearch
Gibran Hemani
Source: vignettes/timings.Rmd
Get a set of SNPs to query:
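The code that builds the query set is not shown here, so the following is only a sketch of the shape the later chunks appear to rely on: a data frame `a` with `rsid`, `chr` and `position` columns, from which an `rsid` vector and a `chrpos` vector of `chr:position` strings are derived. The names `a_demo`, `rsid_demo` and `chrpos_demo` and all values are hypothetical stand-ins, not real variants. The chunks below additionally assume that `ieugwasr`, `microbenchmark` and `knitr` (for `kable`) are attached.

```r
# Hypothetical stand-in for the query result used later in this vignette.
# Only the column names (rsid, chr, position) are taken from the code below;
# the values are made up for illustration.
a_demo <- data.frame(
  rsid = c("rs0000001", "rs0000002", "rs0000003"),
  chr = c("1", "2", "X"),
  position = c(1500000L, 2500000L, 3500000L),
  stringsAsFactors = FALSE
)

# One identifier per SNP, in each of the two lookup formats being compared
rsid_demo <- a_demo$rsid
chrpos_demo <- paste0(a_demo$chr, ":", a_demo$position)

print(chrpos_demo)
```

Positions are stored as integers here so that `paste0()` never renders them in scientific notation.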
1. Comparison of rsid lookup vs chromosome:position
# Time the same lookup by rsID and by chromosome:position, 10 runs each
mbm <- microbenchmark(
  "rsid" = {
    b <- associations(rsid, "ieu-a-7", proxies = 0)
  },
  "chrpos" = {
    b <- associations(chrpos, "ieu-a-7", proxies = 0)
  },
  times = 10
)
kable(summary(mbm))
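To compare the two lookup styles with a single number, one option is to take the ratio of median timings from the summary table. The sketch below uses toy local expressions in place of the API calls so that it runs offline; `mbm_demo`, `s` and `ratio` are illustrative names, and the timings it produces say nothing about the real queries.

```r
library(microbenchmark)

# Toy expressions stand in for the two associations() calls (assumption:
# they are only here to produce a benchmark object of the same shape).
mbm_demo <- microbenchmark(
  "rsid" = sum(sqrt(1:1000)),
  "chrpos" = sum(sqrt(1:10000)),
  times = 10
)

# Fix the unit explicitly so that both rows are reported on the same scale
s <- summary(mbm_demo, unit = "ms")

# How many times slower (or faster, if < 1) the chrpos expression is,
# judged by median run time
ratio <- s$median[s$expr == "chrpos"] / s$median[s$expr == "rsid"]
ratio
```

The median is used rather than the mean because a single slow run, such as a network hiccup in the real queries, would otherwise dominate a benchmark with only 10 repetitions.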
2. Single range query
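Specify a range, then get the rsIDs of the variants that fall within it.
# Half-width of the window, in base pairs
radius <- 100000
# Range string of the form "chr:start-end", centred on the first SNP
chrpos <- paste0(a$chr[1], ":", a$position[1] - radius, "-", a$position[1] + radius)
# Query by range, then keep the rsIDs returned for that range
b <- associations(chrpos, "ieu-a-7", proxies = 0)
rsid <- b$rsid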
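Building range strings with `paste0()` on numeric positions has one pitfall: under R's default settings a double such as 1000000 is converted to `"1e+06"`, which is unlikely to be a valid range boundary. A minimal sketch of a more defensive helper follows. `make_range` is a hypothetical function, not part of `ieugwasr`, and clamping the start at 1 assumes 1-based positions.

```r
# Hypothetical helper (not part of ieugwasr): build "chr:start-end" strings
# for one or more SNPs, given a window half-width in base pairs.
make_range <- function(chr, position, radius) {
  # Clamp so a window near the chromosome start never goes below 1
  # (assumes 1-based coordinates)
  start <- pmax(1, position - radius)
  end <- position + radius
  # %d with as.integer() prints plain digits, never scientific notation
  sprintf("%s:%d-%d", chr, as.integer(start), as.integer(end))
}

ranges_demo <- make_range(c("1", "2"), c(1100000, 50000), 100000)
print(ranges_demo)
```

The function is vectorised, so the same call covers both the single range used here and a set of ranges built from every row of a data frame.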
Window size: `r format(radius * 2, scientific = FALSE, big.mark = ",")` bp, containing `r length(rsid)` variants.
# Time a lookup of every rsID in the window against the single range query
mbm <- microbenchmark(
  "rsid" = {
    b <- associations(rsid, "ieu-a-7", proxies = 0)
  },
  "chrpos" = {
    b <- associations(chrpos, "ieu-a-7", proxies = 0)
  },
  times = 10
)
kable(summary(mbm))
3. Multiple range queries
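Specify multiple ranges, then get the rsIDs of the variants that fall within any of them.
# One range of 10,000 bp either side of each SNP in the query set
chrpos <- paste0(a$chr, ":", a$position - 10000, "-", a$position + 10000)
# Query all ranges at once, then keep the rsIDs returned across them
b <- associations(chrpos, "ieu-a-7", proxies = 0)
rsid <- b$rsid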
This gives `r length(chrpos)` ranges, each with a window size of `r format(20000, scientific = FALSE, big.mark = ",")` bp, covering `r length(rsid)` variants to look up.
# Time a lookup of every rsID across the windows against the multi-range query, 5 runs each
mbm <- microbenchmark(
  "rsid" = {
    b <- associations(rsid, "ieu-a-7", proxies = 0)
  },
  "chrpos" = {
    b <- associations(chrpos, "ieu-a-7", proxies = 0)
  },
  times = 5
)
kable(summary(mbm))