Harmonise the alleles and effects between the exposure and outcome
Source: R/harmonise.R
harmonise_data.Rd
To perform MR, the effect of a SNP on the exposure and its effect on the outcome must be harmonised so that both are expressed relative to the same allele.
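As a rough illustration of what "relative to the same allele" means, the base R sketch below flips the sign of the outcome effect when the outcome's effect allele is the exposure's other allele. The toy data and the variable names `exposure`, `outcome`, `dat` and `swapped` are assumptions made for this example. It is not the package's implementation, and it ignores strand issues, palindromic SNPs and INDELs.

```r
# Illustrative sketch only: toy data, not the logic used inside the package.
exposure <- data.frame(
  SNP = c("rs1", "rs2"),
  effect_allele.exposure = c("A", "C"),
  other_allele.exposure = c("G", "T"),
  beta.exposure = c(0.10, 0.20),
  stringsAsFactors = FALSE
)
outcome <- data.frame(
  SNP = c("rs1", "rs2"),
  effect_allele.outcome = c("A", "T"),  # rs2 is reported for the opposite allele
  other_allele.outcome = c("G", "C"),
  beta.outcome = c(0.05, -0.03),
  stringsAsFactors = FALSE
)

dat <- merge(exposure, outcome, by = "SNP")

# A SNP is "swapped" when the two datasets report the same allele pair
# but with the effect and other alleles exchanged.
swapped <- dat$effect_allele.exposure == dat$other_allele.outcome &
  dat$other_allele.exposure == dat$effect_allele.outcome

# Flip the sign of the outcome effect and exchange the outcome allele labels.
dat$beta.outcome[swapped] <- -dat$beta.outcome[swapped]
tmp <- dat$effect_allele.outcome[swapped]
dat$effect_allele.outcome[swapped] <- dat$other_allele.outcome[swapped]
dat$other_allele.outcome[swapped] <- tmp

print(dat[, c("SNP", "effect_allele.exposure", "effect_allele.outcome",
              "beta.exposure", "beta.outcome")])
```

After this step both effects for rs2 refer to allele C, so their ratio is meaningful. In a fuller treatment the outcome effect allele frequency would also become one minus its reported value for swapped SNPs.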
Arguments
- exposure_dat
Output from read_exposure_data().
- outcome_dat
Output from extract_outcome_data().
- action
Level of strictness in dealing with SNPs.
action = 1: Assume all alleles are coded on the forward strand, i.e. do not attempt to flip alleles.
action = 2: Try to infer positive strand alleles, using allele frequencies for palindromes (default, conservative).
action = 3: Correct strand for non-palindromic SNPs, and drop all palindromic SNPs from the analysis (more conservative).
If a single value is passed, that action is applied to all outcomes. Multiple values can also be supplied as a vector, with each element relating to a different outcome.
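The distinction between the actions rests on palindromic SNPs, whose two alleles are complementary bases (A/T or C/G). For these, reading the opposite strand gives the same pair of letters, so a strand flip cannot be told apart from an allele swap using the alleles alone. The sketch below shows one way to flag such SNPs in base R. The helper name `is_palindromic` is hypothetical and is not claimed to be a function exported by the package.

```r
# Hypothetical helper for illustration: flags SNPs whose alleles are
# complementary bases, i.e. A/T, T/A, C/G or G/C.
is_palindromic <- function(a1, a2) {
  pair <- paste0(toupper(a1), toupper(a2))
  pair %in% c("AT", "TA", "CG", "GC")
}

effect_allele <- c("A", "A", "G", "c")
other_allele  <- c("T", "G", "C", "t")

palindromic <- is_palindromic(effect_allele, other_allele)
print(palindromic)
```

Under the descriptions above, SNPs flagged this way are the ones that action = 2 tries to resolve using allele frequencies and that action = 3 drops.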
Details
Expects data in the format generated by read_exposure_data() and extract_outcome_data().
This means the inputs must be dataframes with the following columns:
outcome_dat:
- SNP
- beta.outcome
- se.outcome
- effect_allele.outcome
- other_allele.outcome
- eaf.outcome
- outcome
exposure_dat:
- SNP
- beta.exposure
- se.exposure
- effect_allele.exposure
- other_allele.exposure
- eaf.exposure
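When the inputs are built by hand rather than by the two functions named above, it can help to check them against these column lists before harmonising. The following is a minimal base R sketch. The helper `missing_columns` and the toy data frames are assumptions made for illustration, and the check covers only the columns listed here, so passing it does not guarantee that the data are otherwise valid.

```r
# Column names copied from the lists above.
required_exposure <- c("SNP", "beta.exposure", "se.exposure",
                       "effect_allele.exposure", "other_allele.exposure",
                       "eaf.exposure")
required_outcome <- c("SNP", "beta.outcome", "se.outcome",
                      "effect_allele.outcome", "other_allele.outcome",
                      "eaf.outcome", "outcome")

# Hypothetical helper: returns the required columns absent from `dat`.
missing_columns <- function(dat, required) {
  setdiff(required, names(dat))
}

# Toy inputs; the outcome data deliberately omits the "outcome" column.
exposure_dat <- data.frame(
  SNP = "rs1", beta.exposure = 0.10, se.exposure = 0.02,
  effect_allele.exposure = "A", other_allele.exposure = "G",
  eaf.exposure = 0.30, stringsAsFactors = FALSE
)
outcome_dat <- data.frame(
  SNP = "rs1", beta.outcome = 0.05, se.outcome = 0.01,
  effect_allele.outcome = "A", other_allele.outcome = "G",
  eaf.outcome = 0.31, stringsAsFactors = FALSE
)

print(missing_columns(exposure_dat, required_exposure))
print(missing_columns(outcome_dat, required_outcome))
```

An empty result means every listed column is present; any names returned are columns that still need to be added.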
The function tries to harmonise INDELs. This works most smoothly when they are coded as sequence strings. If they are coded as D/I in one dataset, the function tries to convert them to sequences, provided the other dataset has adequate information. If a variant is coded as D/I in one dataset and as a variant with equal-length INDEL alleles in the other, it is dropped. If one or both of the datasets have only one allele (i.e. the effect allele), harmonisation is necessarily more ambiguous and more variants will be dropped.