Reads in exposure data (or outcome data, when type = "outcome"). Checks and organises the columns for use with MR or enrichment tests. Infers p-values from beta and se when possible. If the data is the exposure, looks up SNPs in biomaRt to get basic info.
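
The description says p-values are inferred from beta and se when possible, but does not give the formula. The following is a minimal base R sketch, assuming the usual two-sided normal (Wald) test on beta / se. The data frame and its values are hypothetical, and this illustrates the idea rather than reproducing the package's exact code.

```r
# Hypothetical summary statistics; column names match the format_data() defaults.
dat <- data.frame(
  SNP  = c("rs1", "rs2", "rs3"),
  beta = c(0.10, -0.25, 0.00),
  se   = c(0.02, 0.05, 0.04)
)

# Two-sided p-value for the z-score beta / se under a normal approximation.
dat$pval <- 2 * stats::pnorm(abs(dat$beta / dat$se), lower.tail = FALSE)

print(dat)
```

A row with beta equal to zero gets a p-value of 1, and larger values of |beta / se| give smaller p-values.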

format_data(
  dat,
  type = "exposure",
  snps = NULL,
  phenotype_col = "Phenotype",
  snp_col = "SNP",
  beta_col = "beta",
  se_col = "se",
  eaf_col = "eaf",
  effect_allele_col = "effect_allele",
  other_allele_col = "other_allele",
  pval_col = "pval",
  units_col = "units",
  ncase_col = "ncase",
  ncontrol_col = "ncontrol",
  samplesize_col = "samplesize",
  gene_col = "gene",
  id_col = "id",
  min_pval = 1e-200,
  z_col = "z",
  info_col = "info",
  chr_col = "chr",
  pos_col = "pos",
  log_pval = FALSE
)
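
As an illustration of the signature above, here is a hedged sketch of a call where the input uses non-default column names. The `gwas` data frame, its column names, and its values are all hypothetical. The call is guarded so that it only runs when the TwoSampleMR package is installed.

```r
# Hypothetical GWAS summary statistics with non-default column names.
gwas <- data.frame(
  rsid   = c("rs0000001", "rs0000002"),
  b      = c(0.12, -0.08),
  stderr = c(0.03, 0.02),
  freq   = c(0.31, 0.58),
  a1     = c("A", "G"),
  a2     = c("C", "T"),
  p      = c(6.3e-05, 6.3e-05),
  stringsAsFactors = FALSE
)

# Map each input column onto the argument that names it.
if (requireNamespace("TwoSampleMR", quietly = TRUE)) {
  exposure <- TwoSampleMR::format_data(
    gwas,
    type = "exposure",
    snp_col = "rsid",
    beta_col = "b",
    se_col = "stderr",
    eaf_col = "freq",
    effect_allele_col = "a1",
    other_allele_col = "a2",
    pval_col = "p"
  )
  str(exposure)
}
```

Arguments left at their defaults, such as `units_col` or `samplesize_col`, refer to columns that do not exist in this hypothetical input.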

## Arguments

- dat: Data frame. Must have a header with at least the SNP column present.
- type: Is this the exposure or the outcome data that is being read in? The default is "exposure".
- snps: SNPs to extract. If NULL, no SNPs are extracted and all are kept. The default is NULL.
- header: The default is TRUE.
- phenotype_col: Optional column name for the column with the phenotype name corresponding to the SNP. If not present, it will be created with the value "Outcome". The default is "Phenotype".
- snp_col: Required name of the column with SNP rs IDs. The default is "SNP".
- beta_col: Required for MR. Name of the column with effect sizes. The default is "beta".
- se_col: Required for MR. Name of the column with standard errors. The default is "se".
- eaf_col: Required for MR. Name of the column with effect allele frequency. The default is "eaf".
- effect_allele_col: Required for MR. Name of the column with the effect allele. Must contain only the characters "A", "C", "T" or "G". The default is "effect_allele".
- other_allele_col: Required for MR. Name of the column with the non-effect allele. Must contain only the characters "A", "C", "T" or "G". The default is "other_allele".
- pval_col: Required for enrichment tests. Name of the column with the p-value. The default is "pval".
- units_col: Optional column name for units. The default is "units".
- ncase_col: Optional column name for the number of cases. The default is "ncase".
- ncontrol_col: Optional column name for the number of controls. The default is "ncontrol".
- samplesize_col: Optional column name for the sample size. The default is "samplesize".
- gene_col: Optional column name for the gene name. The default is "gene".
- id_col: The default is "id".
- min_pval: Minimum allowed p-value. The default is 1e-200.
- z_col: The default is "z".
- info_col: The default is "info".
- chr_col: The default is "chr".
- pos_col: The default is "pos".
- log_pval: Set to TRUE if the p-value column holds -log10(P). The default is FALSE.
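
The interaction between the -log10(P) option and the minimum allowed p-value can be sketched in base R. This assumes that `log_pval = TRUE` means the column is converted back with 10^(-x), and that `min_pval` acts as a floor. Both are readings of the argument descriptions, not confirmed package internals. The input values are hypothetical.

```r
# Hypothetical input where the p-value column holds -log10(P).
neg_log10_p <- c(3, 8, 350)
min_pval <- 1e-200

# Convert back to the p-value scale. 10^(-350) underflows to 0 in double precision.
pval <- 10^(-neg_log10_p)

# Raise anything below the floor to min_pval, so no p-value is reported as 0.
pval[pval < min_pval] <- min_pval

print(pval)
```

A floor of this kind keeps extremely small p-values representable, which matters if later steps take logarithms of them.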

## Value

A data frame.