class: title-slide

.header[
<img src="bristol-logo.png" style="width: 200px"></img>
<img src="ieu-logo.png" style="width: 200px"></img>
]

# Epigenetic Epidemiology Update

.large[**Matthew Suderman**]

.large[**Oct 24, 2022**]

---
layout: true

.footer[MRC Integrative Epidemiology Unit]

---

## EWAS of exposure in adults

.striped[
| pmid|journal |variable |tissue |population |results |
|--------:|:-----------------|:-------------------|:------|:----------------------------|:-------|
| 36246163|Epigenet Insights |psychosocial stress |blood |228 African American mothers |null |
]

---

## EWAS of phenotype in adults

.striped[
| pmid|journal |variable |tissue |population |results |
|--------:|:----------------|:------------------------|:----------------------------------|:-----------------------|:-----------------------------------|
| 36243740|Clin Epigenetics |South Asian vs European |blood |939 (SABRE) + 916 (BiB) |16433 (76% due to cell composition) |
| 36201768|Psychosom Med |optimism |blood |3816 women; 667 men |13 using 'PC-correction' |
| 36180927|Clin Epigenetics |stroke outcome |blood at baseline and at discharge |643 |2 |
| 36179964|Gene |age-related hearing loss |blood |57 Chinese MZ twins |18-42 |
]

---

## EWAS of exposure in children

.striped[
| pmid|journal |variable |tissue |population |results |
|--------:|:----------|:-------------------------------------------|:------|:-------------------------------------|:-------|
| 36206092|Hum Reprod |ART embryo culture |saliva |120 children age 9 |null |
| 36196007|J Nutr |periconceptional folic acid supplementation |saliva |89 mothers; 179 adolescents (Chinese) |null |
]

---

## EWAS of phenotype in children

.striped[
| pmid|journal |variable |tissue |population |results |
|--------:|:---------------|:----------------------|:------|:---------------------|:-------|
| 36241462|Biol Psychiatry |externalizing behavior |blood |506 teens from IMAGEN |1 |
]

---

## EWAS of exposure in infants

.striped[
| pmid|journal |variable |tissue |population |results |
|--------:|:----------------|:----------------------------------|:--------------------|:------------------------------|:-------|
| 36217170|Clin Epigenetics |fetal phthalates and bisphenols |cord blood |306 |null |
| 36178055|Am J Clin Nutr |periconceptional folic acid intake |blood spots at birth |189 ALL cases and 205 controls |2 |
]

---

<img src="watkins.png"></img>

---

.running[DeepMR]

## Deep Mendelian randomization

Malina S, Cizin D, Knowles DA. **Deep mendelian randomization: Investigating the causal knowledge of genomic deep learning models.** *PLoS Comput Biol*. doi: [10.1371/journal.pcbi.1009880](http://doi.org/10.1371/journal.pcbi.1009880)

**Background** Machine learning methods have been used to predict regulatory 'marks' on DNA and RNA, e.g. transcription factor binding sites, DNA methylation levels, histone modifications, RNA splicing, etc.

--

**Question**: Can we use these models to evaluate causal relationships between marks?

--

**DeepMR**

--

Input:

- *locus* - genomic region of interest
- *exposure* - a regulatory mark, e.g. binding of a transcription factor (TF)
- *outcome* - another regulatory mark, e.g. binding of another TF
- *model* - calculates probability of a mark appearing at a given DNA sequence, `P(mark | sequence)`

--

Output: causal effects estimated using MR

---

.running[DeepMR]

**Steps:**

1. Prepare instruments

--

1a. Select random sequences from the *locus* ("reference sequences")

--

1b. Mutate references with all possible single point mutations (e.g. ACGA has mutations **T**CGA, **G**CGA, **C**CGA, A**T**GA, A**G**GA, A**A**GA, ...)

--

1c. Calculate binding probabilities `P(exposure | sequence)` and `P(outcome | sequence)`

--

Each mutant/reference pair defines an instrument.

--

1d. Effect of instrument on the exposure = `P(exposure | mutant) - P(exposure | reference)`

--

1e. Effect of instrument on the outcome = `P(outcome | mutant) - P(outcome | reference)`

--

1f. Omit instruments with small effects.

--

2. For each reference sequence, estimate the effect of exposure on the outcome by applying MR to its instruments

--

3. Estimate the overall effect of exposure on the outcome by applying random effects meta-analysis across reference sequences.

---

.running[DeepMR]

The three steps visually:

1. Prepare instruments
2. For each reference sequence, estimate the effect of exposure on the outcome by applying MR to its instruments
3. Estimate the overall effect of exposure on the outcome by applying random effects meta-analysis across reference sequences.

<img src="journal.pcbi.1009880.g001.PNG" style="width: 100%"></img>

---

.running[DeepMR]

Using TF cooperativity analysis, [others](https://dx.doi.org/10.1038/s41588-021-00782-6) have provided evidence that *Oct4* and *Sox2* binding influences the binding of *Nanog* and *Klf4*, and that *Oct4* and *Sox2* "act on each other via a composite motif".

DeepMR estimates largely agree:

.center[
<img src="journal.pcbi.1009880.g003.PNG" style="width: 50%"></img>
]

---

.running[regulation]

## How gene programs evolve

Ringel AR ... Robson MI. **Repression and 3D-restructuring resolves regulatory conflicts in evolutionarily rearranged genomes.** *Cell*. doi: [10.1016/j.cell.2022.09.006](http://doi.org/10.1016/j.cell.2022.09.006)

**Questions**

1. How are gene expression programs maintained when new genes emerge in evolution?
2. How is it possible for most topologically associating domains (TADs) to contain multiple independently expressed genes?

--

**Example** *Zfp42* is expressed specifically in the placenta of mammals. However, it is located in the *Fat1* TAD which is expressed in multiple mammalian tissues. It first emerged in vertebrates.

---

.running[regulation]

.pull-left-30[
**Results**

1) In ESCs, *Zfp42* is regulated by different enhancers than *Fat1* due to chromatin activity.

2) In embryonic limbs, *Zfp42* is inactive and does not respond to *Fat1* enhancers due to DNA methylation.
]

.pull-right-70[
<img src="1-s2.0-S009286742201128X-fx1.jpg" style="width: 100%"></img>
]

---

.running[cell type]

Jagadeesh KA ... Regev A.
**Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics.** *Nat Genet*. doi: [10.1038/s41588-022-01187-9](http://doi.org/10.1038/s41588-022-01187-9)

.pull-left-40[
**Question** Given a disease-associated genetic variant, in which cell type does it confer disease?

**Solution** `sc-linker`

**Results**

- γ-aminobutyric acid-ergic neurons linked to major depressive disorder
- a disease-dependent M-cell program to ulcerative colitis
- a disease-specific complement cascade process to multiple sclerosis
- disease-dependent immune cell-type programs in autoimmune disease
]

.pull-right-60[
**`sc-linker`** infers cell type by integrating

- single-cell RNA-sequencing (11 tissues),
- epigenomic SNP-to-gene maps and
- genome-wide association study summary statistics (60 diseases)

1. *Construct gene programs* from scRNA-seq. Gene programs are latent factors that differentiate cell types from one another.
2. *Link SNPs to programs* by linking them to program genes using Roadmap Enhancer-Gene Linking and Activity-by-Contact (CRISPR perturbations)
3. *Evaluate enrichment* of GWAS SNPs in a program using stratified LD score regression (S-LDSC)
]

---

.running[cell type]

<img src="41588_2022_1187_Fig1.png" style="width: 100%"></img>

---

.running[prediction]

Cappozzo A ... Fiorito G. **A blood DNA methylation biomarker for predicting short-term risk of cardiovascular events.** *Clin Epigenetics*.
doi: [10.1186/s13148-022-01341-4](http://doi.org/10.1186/s13148-022-01341-4)

**Dataset** EPIC Italy cohort (n=1803, 295 CVD events)

**DNAmCVDscore** derived to predict time-to-CVD event from 60 DNAm surrogates (including for BMI, blood pressure, fasting glucose and insulin, cholesterol, triglycerides, coagulation biomarkers, Gadd EpiScores, DNAm clocks, GrimAge components, cell counts, lead)

<img src="13148_2022_1341_Fig2.png" style="width: 60%"></img>

---

## Announcements

* [Epigenetic Epidemiology short course](https://www.bristol.ac.uk/medical-school/study/short-courses/2021-22-courses/epigenetic-epidemiology/) 8 - 10 May 2023
* [Advanced Epigenetic Epidemiology short course](https://www.bristol.ac.uk/medical-school/study/short-courses/2021-22-courses/epigenetic-epidemiology/) 18 - 19 May 2023
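
---

.running[DeepMR]

A minimal Python sketch of steps 1 and 2 of DeepMR as summarised in the earlier slides. Everything here is illustrative: `toy_model` is a hypothetical stand-in for a trained sequence model, the `min_effect` threshold is arbitrary, and a regression through the origin stands in for the MR estimator, which may differ from the one used in the paper. Step 3 (random effects meta-analysis) is not shown.

```python
import random

BASES = "ACGT"

def point_mutants(ref):
    """Step 1b: all single point mutations of a reference sequence."""
    return [ref[:i] + b + ref[i + 1:]
            for i in range(len(ref)) for b in BASES if b != ref[i]]

def toy_model(mark, seq):
    """Hypothetical stand-in for P(mark | sequence); not a trained model."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return gc if mark == "exposure" else 0.5 * gc

def instruments(ref, model, min_effect=0.01):
    """Steps 1c-1f: effect of each mutant on exposure and outcome."""
    kept = []
    for mut in point_mutants(ref):
        bx = model("exposure", mut) - model("exposure", ref)  # 1d
        by = model("outcome", mut) - model("outcome", ref)    # 1e
        if abs(bx) >= min_effect:  # 1f: omit instruments with small effects
            kept.append((bx, by))
    return kept

def ratio_estimate(inst):
    """Step 2 stand-in: regress outcome effects on exposure effects
    through the origin (an assumption, not necessarily the paper's estimator)."""
    sxx = sum(bx * bx for bx, _ in inst)
    return sum(bx * by for bx, by in inst) / sxx

random.seed(0)
locus = "".join(random.choice(BASES) for _ in range(200))        # made-up locus
refs = [locus[s:s + 20] for s in random.sample(range(180), 5)]   # step 1a
estimates = [ratio_estimate(instruments(r, toy_model)) for r in refs]
```

With this toy model the outcome probability is half the exposure probability by construction, so each per-reference estimate should be close to 0.5.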