This RMarkdown document demonstrates how key elements of the case study 2 notebook from the EpiGraphDB paper can be reproduced using the epigraphdb R package. For detailed explanations of the case study, please refer to the paper or the case study notebook.

Context

Systematic MR of molecular phenotypes, such as protein levels and transcript expression, offers enormous potential to prioritise drug targets for further investigation. However, many genes and gene products are not easily druggable, so some potentially important causal genes may not offer an obvious route to intervention.

A parallel problem is that current GWASes of molecular phenotypes have limited sample sizes and limited protein coverage. A potential way to address both problems is to use protein-protein interaction (PPI) information to identify druggable targets that are linked to a non-druggable but robustly causal target. Their relationship to the causal target increases our confidence in their potential causal role, even if the initial evidence of effect is below our multiple-testing threshold.

Here in case study 2, we demonstrate an approach that uses data in EpiGraphDB to prioritise potential alternative drug targets in the same PPI network, as follows:

  • For an existing drug target of interest, we use PPI networks to search for its directly interacting genes that are evidenced to be druggable.
  • We then examine the causal evidence of these candidate genes on the disease.
  • We also examine the literature evidence of these candidate genes on the disease.

Triangulating the MR evidence and the literature evidence available from EpiGraphDB for these candidate genes can increase our confidence when identifying potentially viable drug targets.

library("magrittr")
library("dplyr")
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library("purrr")
#> 
#> Attaching package: 'purrr'
#> The following object is masked from 'package:magrittr':
#> 
#>     set_names
library("glue")
#> 
#> Attaching package: 'glue'
#> The following object is masked from 'package:dplyr':
#> 
#>     collapse
library("epigraphdb")
#> 
#>     EpiGraphDB v0.3 (API: https://api.epigraphdb.org)
#> 

Here we configure the parameters used in the case study example. We illustrate this approach using IL23R, an established drug target for inflammatory bowel disease (IBD) (Duerr et al., 2006; Momozawa et al., 2011).

While specific IL23R interventions are still undergoing trials, there is a possibility that these therapies may not be effective for all or even the majority of patients. This case study therefore explores potential alternative drug targets.

GENE_NAME <- "IL23R"
OUTCOME_TRAIT <- "Inflammatory bowel disease"
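
The queries later in this document iterate over a character vector named `gene_list`, which holds the primary gene together with its druggable PPI partners. As a minimal offline sketch of that selection step, the snippet below filters a small hand-made PPI table. The table contents, the placeholder gene names, the column names `g1.name`, `g2.name` and `g2.druggability_tier`, and the helper `get_gene_list` are all illustrative assumptions, not output from EpiGraphDB.

```r
# Toy PPI table: values and column names are assumptions for illustration,
# not results returned by EpiGraphDB.
toy_ppi_df <- data.frame(
  g1.name = c("IL23R", "IL23R", "IL23R", "IL23R"),
  g2.name = c("GENE_A", "GENE_B", "GENE_C", "GENE_A"),
  g2.druggability_tier = c("Tier 1", "Tier 3A", "Tier 1", "Tier 1"),
  stringsAsFactors = FALSE
)

# Keep interacting genes in the chosen druggability tier and, optionally,
# put the primary gene at the front of the list. Duplicates are dropped.
get_gene_list <- function(ppi_df, tier = "Tier 1",
                          include_primary_gene = TRUE) {
  partners <- ppi_df$g2.name[ppi_df$g2.druggability_tier == tier]
  genes <- if (include_primary_gene) c(ppi_df$g1.name, partners) else partners
  unique(genes)
}

# Named toy_gene_list so it does not overwrite the real gene_list.
toy_gene_list <- get_gene_list(toy_ppi_df)
toy_gene_list
```

Including the primary gene in the list keeps the established target in the later result tables, so the candidates can be read against it.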

Using Mendelian randomization results for causal effect estimation

The next step is to find out whether any of the candidate genes identified from the PPI network (held in `gene_list`) have a comparable and statistically plausible effect on IBD.

Here we search EpiGraphDB for the Mendelian randomization (MR) results for these genes and IBD from the recent study by Zheng et al., 2019 (https://epigraphdb.org/xqtl/) via the GET /xqtl/single-snp-mr endpoint.

# Query single-SNP MR results for every gene in `gene_list` against one
# outcome trait and one QTL type, then row-bind the per-gene tables.
extract_mr <- function(outcome_trait, gene_list, qtl_type) {
  endpoint <- "/xqtl/single-snp-mr"
  per_gene <- function(gene_name) {
    params <- list(
      exposure_gene = gene_name,
      outcome_trait = outcome_trait,
      qtl_type = qtl_type,
      # p-value threshold passed to the API
      pval_threshold = 1e-5
    )
    query_epigraphdb(route = endpoint, params = params, mode = "table")
  }
  gene_list %>% map_df(per_gene)
}

xqtl_df <- c("pQTL", "eQTL") %>% map_df(function(qtl_type) {
  extract_mr(
    outcome_trait = OUTCOME_TRAIT,
    gene_list = gene_list,
    qtl_type = qtl_type
  ) %>%
    mutate(qtl_type = qtl_type)
})
xqtl_df
#> # A tibble: 9 x 9
#>   gene.ensembl_id gene.name gwas.id gwas.trait r.beta   r.se      r.p r.rsid
#>   <chr>           <chr>     <chr>   <chr>       <dbl>  <dbl>    <dbl> <chr> 
#> 1 ENSG00000162594 IL23R     ieu-a-… Inflammat…  1.50  0.0546 0.       rs115…
#> 2 ENSG00000113302 IL12B     ieu-a-… Inflammat…  0.418 0.0345 9.59e-34 rs492…
#> 3 ENSG00000162594 IL23R     ieu-a-… Inflammat…  0.887 0.0644 4.16e-43 rs206…
#> 4 ENSG00000164136 IL15      ieu-a-… Inflammat… -1.42  0.197  5.53e-13 rs753…
#> 5 ENSG00000113520 IL4       ieu-a-… Inflammat…  0.460 0.0840 4.47e- 8 rs207…
#> 6 ENSG00000096968 JAK2      ieu-a-… Inflammat… -1.90  0.204  1.32e-20 rs478…
#> 7 ENSG00000109320 NFKB1     ieu-a-… Inflammat…  0.974 0.174  2.16e- 8 rs476…
#> 8 ENSG00000143365 RORC      ieu-a-… Inflammat… -0.995 0.116  1.21e-17 rs484…
#> 9 ENSG00000168610 STAT3     ieu-a-… Inflammat…  0.597 0.0757 2.96e-15 rs105…
#> # … with 1 more variable: qtl_type <chr>
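
To compare effect sizes across genes, it can help to put an interval around each estimate. The sketch below copies `r.beta` and `r.se` for four rows of the table printed above and computes a z statistic and a normal-approximation (Wald) 95% confidence interval. The data frame `mr_subset` and the added column names are illustrative assumptions. The interval is a standard approximation calculated locally, not a field returned by the API.

```r
# Values copied (as rounded in the printed tibble) for four of the genes.
mr_subset <- data.frame(
  gene.name = c("IL12B", "IL15", "JAK2", "STAT3"),
  r.beta = c(0.418, -1.42, -1.90, 0.597),
  r.se = c(0.0345, 0.197, 0.204, 0.0757),
  stringsAsFactors = FALSE
)

# Wald interval: estimate +/- z_crit * standard error.
z_crit <- qnorm(0.975)
mr_subset$z <- mr_subset$r.beta / mr_subset$r.se
mr_subset$ci_lower <- mr_subset$r.beta - z_crit * mr_subset$r.se
mr_subset$ci_upper <- mr_subset$r.beta + z_crit * mr_subset$r.se

# Sign of the estimated effect on the outcome.
mr_subset$direction <- ifelse(mr_subset$r.beta > 0, "positive", "negative")
mr_subset
```

Because the inputs are the rounded values shown in the printout, the intervals are approximate. Recomputing them from the full `xqtl_df` would be preferable in practice.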

Using literature evidence for results enrichment and triangulation

Can we find evidence in the literature where these genes are found to be associated with IBD, either to increase our level of confidence in the MR results or to provide alternative evidence where MR results do not exist?

We can use the GET /gene/literature endpoint to get data on the literature evidence for the set of genes.

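# Query literature evidence linking each gene in `gene_list` to the outcome
# trait, then count the PubMed IDs behind each gene-predicate row.
extract_literature <- function(outcome_trait, gene_list) {
  per_gene <- function(gene_name) {
    endpoint <- "/gene/literature"
    params <- list(
      gene_name = gene_name,
      # the trait name is lower-cased before it is sent as `object_name`
      object_name = outcome_trait %>% stringr::str_to_lower()
    )
    query_epigraphdb(route = endpoint, params = params, mode = "table")
  }
  res_df <- gene_list %>% map_df(per_gene)
  # `pubmed_id` is a list column; its element lengths give the paper counts
  res_df %>%
    mutate(literature_count = map_int(pubmed_id, length))
}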

literature_df <- extract_literature(
  outcome_trait = OUTCOME_TRAIT,
  gene_list = gene_list
)
literature_df
#> # A tibble: 45 x 5
#>    pubmed_id  gene.name st.predicate     st.object_name         literature_count
#>    <list>     <chr>     <chr>            <chr>                             <int>
#>  1 <chr [2]>  IL23R     NEG_ASSOCIATED_… Inflammatory Bowel Di…                2
#>  2 <chr [1]>  IL23R     AFFECTS          Inflammatory Bowel Di…                1
#>  3 <chr [21]> IL23R     ASSOCIATED_WITH  Inflammatory Bowel Di…               21
#>  4 <chr [1]>  IL23R     PREDISPOSES      Inflammatory Bowel Di…                1
#>  5 <chr [2]>  CSF2      ASSOCIATED_WITH  Inflammatory Bowel Di…                2
#>  6 <chr [1]>  CSF2      AFFECTS          Inflammatory Bowel Di…                1
#>  7 <chr [3]>  IFNA1     ASSOCIATED_WITH  Inflammatory Bowel Di…                3
#>  8 <chr [1]>  IFNA1     PREVENTS         Inflammatory Bowel Di…                1
#>  9 <chr [2]>  IFNG      ASSOCIATED_WITH  Inflammatory Bowel Di…                2
#> 10 <chr [1]>  IFNG      AFFECTS          Inflammatory Bowel Di…                1
#> # … with 35 more rows
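
One way to triangulate the two evidence sources is to total the literature counts per gene and line them up against the genes that returned MR results. The sketch below does this with base R, using only the ten literature rows printed above and the gene names from the MR table. The object names `lit_subset`, `lit_totals` and `evidence`, and the `has_mr` and `both` columns, are illustrative assumptions. Because 35 literature rows are not shown, a zero count here does not imply that the full `literature_df` has no evidence for that gene.

```r
# First ten rows of the printed literature table (gene and count only).
lit_subset <- data.frame(
  gene.name = c("IL23R", "IL23R", "IL23R", "IL23R", "CSF2",
                "CSF2", "IFNA1", "IFNA1", "IFNG", "IFNG"),
  literature_count = c(2L, 1L, 21L, 1L, 2L, 1L, 3L, 1L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Total number of supporting papers per gene, across all predicates.
lit_totals <- aggregate(literature_count ~ gene.name,
                        data = lit_subset, FUN = sum)

# Genes that appear in the printed MR results table.
mr_genes <- c("IL23R", "IL12B", "IL15", "IL4",
              "JAK2", "NFKB1", "RORC", "STAT3")
mr_flags <- data.frame(gene.name = mr_genes, has_mr = TRUE,
                       stringsAsFactors = FALSE)

# Full outer join, then replace missing values with "no evidence".
evidence <- merge(mr_flags, lit_totals, by = "gene.name", all = TRUE)
evidence$has_mr[is.na(evidence$has_mr)] <- FALSE
evidence$literature_count[is.na(evidence$literature_count)] <- 0L
evidence$both <- evidence$has_mr & evidence$literature_count > 0

# Show the genes with the most literature support first.
evidence[order(-evidence$literature_count), ]
```

Summing across predicates treats every semantic relation (for example `ASSOCIATED_WITH` and `NEG_ASSOCIATED_…`) as supporting evidence. A stricter analysis might filter on `st.predicate` first.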

sessionInfo

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 16.04.6 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/openblas-base/libblas.so.3
#> LAPACK: /usr/lib/libopenblasp-r0.2.18.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] epigraphdb_0.2.1 glue_1.4.2       purrr_0.3.4      dplyr_1.0.2     
#> [5] magrittr_1.5    
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.30         tidyselect_1.1.0   R6_2.4.1           ragg_0.3.1        
#>  [5] rlang_0.4.7        fansi_0.4.1        httr_1.4.2         stringr_1.4.0     
#>  [9] tools_4.0.2        xfun_0.18          utf8_1.1.4         cli_2.0.2         
#> [13] htmltools_0.5.0    systemfonts_0.3.2  ellipsis_0.3.1     yaml_2.2.1        
#> [17] assertthat_0.2.1   rprojroot_1.3-2    digest_0.6.25      tibble_3.0.3      
#> [21] lifecycle_0.2.0    pkgdown_1.6.1.9000 crayon_1.3.4       vctrs_0.3.4       
#> [25] fs_1.5.0           curl_4.3           memoise_1.1.0      evaluate_0.14     
#> [29] rmarkdown_2.4      stringi_1.5.3      pillar_1.4.6       compiler_4.0.2    
#> [33] desc_1.2.0         generics_0.0.2     backports_1.1.10   jsonlite_1.7.1    
#> [37] pkgconfig_2.0.3