Reads in outcome data. Checks and organises columns for use with MR or enrichment tests. Infers p-values when possible from beta and se.
Usage
read_outcome_data(
filename,
snps = NULL,
sep = " ",
phenotype_col = "Phenotype",
snp_col = "SNP",
beta_col = "beta",
se_col = "se",
eaf_col = "eaf",
effect_allele_col = "effect_allele",
other_allele_col = "other_allele",
pval_col = "pval",
units_col = "units",
ncase_col = "ncase",
ncontrol_col = "ncontrol",
samplesize_col = "samplesize",
gene_col = "gene",
id_col = "id",
min_pval = 1e-200,
log_pval = FALSE,
chr_col = "chr",
pos_col = "pos"
)Arguments
- filename
Filename. Must have header with at least SNP column present.
- snps
SNPs to extract. If
NULL, which the default, then doesn't extract any and keeps all.- sep
Specify delimiter in file. The default is space, i.e.
sep=" ".- phenotype_col
Optional column name for the column with phenotype name corresponding the the SNP. If not present then will be created with the value
"Outcome". Default is"Phenotype".- snp_col
Required name of column with SNP rs IDs. The default is
"SNP".- beta_col
Required for MR. Name of column with effect sizes. The default is
"beta".- se_col
Required for MR. Name of column with standard errors. The default is
"se".- eaf_col
Required for MR. Name of column with effect allele frequency. The default is
"eaf".- effect_allele_col
Required for MR. Name of column with effect allele. Must be "A", "C", "T" or "G". The default is
"effect_allele".- other_allele_col
Required for MR. Name of column with non effect allele. Must be "A", "C", "T" or "G". The default is
"other_allele".- pval_col
Required for enrichment tests. Name of column with p-value. The default is
"pval".- units_col
Optional column name for units. The default is
"units".- ncase_col
Optional column name for number of cases. The default is
"ncase".- ncontrol_col
Optional column name for number of controls. The default is
"ncontrol".- samplesize_col
Optional column name for sample size. The default is
"samplesize".- gene_col
Optional column name for gene name. The default is
"gene".- id_col
Optional column name to give the dataset an ID. Will be generated automatically if not provided for every trait / unit combination. The default is
"id".- min_pval
Minimum allowed p-value. The default is
1e-200.- log_pval
The pval is -log10(P). The default is
FALSE.- chr_col
Optional column name for chromosome. Default is
"chr".- pos_col
Optional column name for genetic position Default is
"pos".