
Metabolomics & proteomics data preparation and quality control pipeline for R

Overview

omiprep supports the full data-preparation workflow for untargeted and targeted omics data:

  1. Import raw data from Metabolon, Nightingale Health, Olink, and SomaLogic platforms (Excel / flat-text)
  2. Summarise sample- and feature-level statistics
  3. Filter using a standard QC pipeline with user-defined thresholds
  4. Report results as an interactive HTML or PDF document
  5. Export cleaned data for downstream analysis

Installation

# install.packages("pak")
pak::pak("MRCIEU/omiprep")

Quick start

library(omiprep)

# 1. Read data
mydata <- read_metabolon(
  system.file("extdata", "metabolon_v1.1_example.xlsx", package = "omiprep"),
  sheet          = "OrigScale",
  return_Omiprep = TRUE
)

# 2. Run QC pipeline
mydata <- mydata |> quality_control(
  source_layer        = "input",
  sample_missingness  = 0.2,
  feature_missingness = 0.2,
  total_peak_area_sd  = 5,
  outlier_udist       = 5,
  outlier_treatment   = "leave_be"
)

# 3. Summarise
summary(mydata)

# 4. Generate HTML report
generate_report(mydata, output_dir = ".")

Articles

Importing Data

Metabolon

Import untargeted metabolomics data from Metabolon Excel sheets.

Nightingale Health

Import NMR-based metabolomic data from Nightingale Health.

Olink

Import proximity extension assay proteomic data from Olink.

SomaLogic

Import aptamer-based proteomic data from SomaLogic SomaScan.

Summaries & QC

Sample Summary

Compute per-sample statistics: missingness, total peak area, and PCA-based outlier detection.
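
As a rough illustration of what these per-sample statistics involve, the sketch below computes them in base R on a simulated matrix. It is not the package's implementation: the toy data, the median imputation before PCA, the use of PC1 only, and the `sd_threshold` name are all assumptions made for the example.

```r
# Toy abundance matrix: 6 samples (rows) x 5 features (columns), simulated.
set.seed(1)
x <- matrix(rlnorm(30, meanlog = 5), nrow = 6,
            dimnames = list(paste0("S", 1:6), paste0("F", 1:5)))
x["S2", c("F1", "F4")] <- NA   # give one sample two missing features

# Per-sample missingness: fraction of features with no value.
sample_missingness <- rowMeans(is.na(x))

# Total peak area: sum of the observed abundances in each sample.
total_peak_area <- rowSums(x, na.rm = TRUE)

# PCA needs complete data, so log-transform and fill gaps with the
# feature median (an assumption for this sketch only).
x_log <- log(x)
x_imp <- apply(x_log, 2, function(v) {
  v[is.na(v)] <- median(v, na.rm = TRUE)
  v
})
pca <- prcomp(x_imp, center = TRUE, scale. = TRUE)

# Flag samples whose PC1 score lies more than `sd_threshold` standard
# deviations from the mean (hypothetical threshold name).
sd_threshold <- 5
pc1_z <- abs(scale(pca$x[, 1])[, 1])
outlier <- pc1_z > sd_threshold

sample_summary <- data.frame(sample_missingness, total_peak_area, pc1_z, outlier)
print(sample_summary)
```

A real data set would usually look at more than one principal component; PC1 alone keeps the example short.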

Feature Summary

Compute per-feature statistics: missingness, variance, and independent feature trees.
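
One common way to build an "independent feature" tree is to cluster features on a correlation-based distance and keep one representative per cluster. The base R sketch below follows that idea on simulated data. The Spearman distance, the complete linkage, the cut height of 0.5, and the tie-breaking rule are assumptions for illustration and may differ from what omiprep does.

```r
# Simulated data: 50 samples x 4 features, where F2 is nearly a copy of F1.
set.seed(42)
n <- 50
base <- rnorm(n)
x <- cbind(
  F1 = base,
  F2 = base + rnorm(n, sd = 0.05),
  F3 = rnorm(n),
  F4 = rnorm(n)
)
x[1:5, "F3"] <- NA

# Per-feature missingness and variance.
feature_missingness <- colMeans(is.na(x))
feature_variance <- apply(x, 2, var, na.rm = TRUE)

# Distance between features: 1 - |Spearman rho|, using pairwise complete cases.
rho <- cor(x, method = "spearman", use = "pairwise.complete.obs")
tree <- hclust(as.dist(1 - abs(rho)), method = "complete")

# Cut the tree at an assumed height of 0.5: features joined below that
# height are treated as one correlated group.
cluster <- cutree(tree, h = 0.5)

# Keep one representative per group: lowest missingness, first on ties.
representative <- tapply(names(cluster), cluster, function(f) {
  f[which.min(feature_missingness[f])]
})
independent <- setNames(names(cluster) %in% representative, names(cluster))

print(data.frame(feature_missingness, feature_variance, cluster, independent))
```

Here F1 and F2 fall into the same group, so only one of them is marked as independent.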

QC Pipeline

Run the full quality control pipeline with configurable thresholds for missingness, outliers, and more.
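
To make the thresholds concrete, here is a minimal base R sketch of threshold-based filtering on a toy matrix. The threshold values mirror those in the Quick start, but the variable names, the order of the steps (features first, then samples, then total peak area), and the z-score rule are assumptions; `quality_control()` may apply them differently.

```r
# Toy data: 10 samples x 6 features with hand-placed missing values.
set.seed(7)
x <- matrix(rlnorm(60, meanlog = 5), nrow = 10,
            dimnames = list(paste0("S", 1:10), paste0("F", 1:6)))
x[1:4, "F6"] <- NA                     # F6 is 40% missing
x["S10", c("F1", "F2", "F3")] <- NA    # S10 is missing half its features

# Hypothetical threshold names, values as in the Quick start.
max_feature_missing <- 0.2
max_sample_missing  <- 0.2
tpa_sd              <- 5

# Step 1 (assumed order): drop features with too much missingness.
keep_features <- colMeans(is.na(x)) <= max_feature_missing
x_qc <- x[, keep_features, drop = FALSE]

# Step 2: drop samples with too much missingness among remaining features.
keep_samples <- rowMeans(is.na(x_qc)) <= max_sample_missing
x_qc <- x_qc[keep_samples, , drop = FALSE]

# Step 3: drop samples whose total peak area is more than `tpa_sd`
# standard deviations from the mean.
tpa <- rowSums(x_qc, na.rm = TRUE)
tpa_z <- (tpa - mean(tpa)) / sd(tpa)
x_qc <- x_qc[abs(tpa_z) <= tpa_sd, , drop = FALSE]

dim(x_qc)
```

In this toy case F6 and S10 are removed by the missingness filters, and no sample is extreme enough to fail the total peak area check.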

Reports & Export

Generate HTML / PDF Report

Produce a fully annotated, interactive QC report in HTML or PDF format.

Export Data

Export processed data and summary tables to Excel or tab-delimited flat files.
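
For readers who only need a tab-delimited file, the following base R sketch shows the general idea with `write.table()`. It does not use omiprep's export functions; the `qc_data` table and the file name are made up for the example, and Excel output would need an additional package that is not shown here.

```r
# Hypothetical cleaned data table.
qc_data <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  F1 = c(10.2, 11.5, NA),
  F2 = c(3.1, 2.9, 3.4),
  stringsAsFactors = FALSE
)

# Write a tab-delimited flat file to a temporary directory.
out_file <- file.path(tempdir(), "qc_data.tsv")
write.table(qc_data, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, na = "NA")

# Read it back to confirm the round trip preserves the table.
round_trip <- read.delim(out_file, stringsAsFactors = FALSE)
all.equal(qc_data, round_trip)
```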

Batch Normalisation

Correct for run-order and batch effects using quantile or rank-based normalisation.
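
As an illustration of the rank-based option, the sketch below applies a rank-based inverse normal transform within each batch using base R. The `rank_int()` helper, the `(r - 0.5) / n` offset, and the simulated batch shift are assumptions for the example rather than the package's own method.

```r
# Simulated feature measured in two batches; batch B is shifted upwards.
set.seed(3)
batch <- rep(c("A", "B"), each = 20)
value <- rnorm(40, mean = ifelse(batch == "A", 10, 13))

# Hypothetical helper: rank-based inverse normal transform.
# Ranks are mapped to normal quantiles; missing values stay missing.
rank_int <- function(v) {
  r <- rank(v, na.last = "keep", ties.method = "average")
  qnorm((r - 0.5) / sum(!is.na(v)))
}

# Apply the transform separately within each batch.
normalised <- ave(value, batch, FUN = rank_int)

# Batch means before and after the transform.
tapply(value, batch, mean)
tapply(normalised, batch, mean)
```

Because each batch is mapped onto the same set of normal quantiles, the between-batch shift disappears while the ordering of samples within a batch is kept.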