RobNorm: Model-Based Robust Normalization Method for Labeled Quantitative Mass Spectrometry Proteomics Data

Meng Wang; Lihua Jiang; Ruiqi Jian; Joanne Chan; Qing Liu; Michael P. Snyder; Hua Tang

doi:10.1101/770115

Meng Wang (Stanford University), Lihua Jiang (Stanford University), Ruiqi Jian (Stanford University), Joanne Chan (Stanford University), Qing Liu (Stanford University), Michael P. Snyder (Stanford University), Hua Tang (Stanford University)

bioRxiv (Cold Spring Harbor Laboratory)

September 19, 2019

Abstract

Motivation: Data normalization is an important step in processing proteomics data generated in mass spectrometry (MS) experiments. It aims to reduce sample-level variation and to facilitate comparisons between samples. Previously published normalization methods primarily depend on the assumption that the distribution of protein expression is similar across all samples. However, this assumption fails when the protein expression data are generated from heterogeneous samples, such as samples from various tissue types. This led us to develop a novel data-driven normalization method that corrects the systematic bias while maintaining the underlying biological heterogeneity.

Methods: To robustly correct the systematic bias, we used the density-power-weight method to down-weight outliers and extended the one-dimensional robust fitting method of Windham (1995) and Fujisawa and Eguchi (2008) to our structured data. We then constructed a robustness criterion and developed a new normalization algorithm, called RobNorm.

Results: In simulation studies and in an analysis of real data from the Genotype-Tissue Expression (GTEx) project, we compared and evaluated the performance of RobNorm against other normalization methods. We found that the RobNorm approach exhibits the greatest reduction in systematic bias while maintaining across-tissue variation, especially for datasets from highly heterogeneous samples.

Availability: https://github.com/mwgrassgreen/RobNorm

Contact: huatang@stanford.edu and mpsnyder@stanford.edu
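As an illustration of the density-power-weight idea mentioned in the Methods, the following is a minimal sketch of the one-dimensional robust Gaussian fit that the abstract says RobNorm extends. It is not the RobNorm algorithm itself, and it does not reproduce the code in the linked repository. The function name `robust_gaussian_fit`, the exponent `gamma`, the starting values, and the simulated data are all assumptions made for this example. Each observation is weighted by the current Gaussian density raised to the power `gamma`, so points far from the bulk receive almost no weight.

```python
import numpy as np


def robust_gaussian_fit(x, gamma=0.5, n_iter=200, tol=1e-10):
    """Density-power-weighted fit of a 1-D Gaussian (illustrative sketch).

    gamma > 0 controls robustness: larger values down-weight outliers more.
    Returns the estimated mean, the estimated variance, and the final weights.
    """
    x = np.asarray(x, dtype=float)
    # Robust starting values (an assumption of this sketch): median and scaled MAD.
    mu = np.median(x)
    sigma2 = max((1.4826 * np.median(np.abs(x - mu))) ** 2, 1e-12)
    for _ in range(n_iter):
        # Weights proportional to the Gaussian density raised to the power gamma.
        # Normalizing constants cancel once the weights are rescaled to sum to 1.
        w = np.exp(-0.5 * gamma * (x - mu) ** 2 / sigma2)
        w /= w.sum()
        mu_new = np.sum(w * x)
        # The factor (1 + gamma) corrects the shrinkage caused by the weighting.
        sigma2_new = (1.0 + gamma) * np.sum(w * (x - mu_new) ** 2)
        converged = abs(mu_new - mu) < tol and abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    return mu, sigma2, w


# Simulated example: 95% of values from N(0, 1), 5% shifted to N(8, 1).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 950), rng.normal(8.0, 1.0, 50)])
mu, sigma2, w = robust_gaussian_fit(x, gamma=0.5)
print("plain mean:", x.mean(), "robust mean:", mu, "robust variance:", sigma2)
```

In this simulated setting the plain mean is pulled toward the shifted group, while the weighted estimate stays close to the center of the main group. How RobNorm carries this weighting over to structured protein-by-sample data is described in the paper, not in this sketch.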
