Date of Award

8-11-2015

Degree Type

Thesis

Degree Name

Master of Public Health (MPH)

Department

Public Health

First Advisor

Ruiyan Luo, Ph.D.

Second Advisor

Shuzhao Li, Ph.D.

Abstract

INTRODUCTION: Transcriptomics and metabolomics are high-throughput technologies that are critical to contemporary biomedical sciences, measuring gene expression levels and metabolite concentrations, respectively. Effective methods of integrating metabolomics and transcriptomics data are highly desired. Gene and metabolic pathways represent accumulated expert knowledge in particular domains. LASSO regression is widely used for feature selection, and group LASSO incorporates prior knowledge of groups of variables.

AIM: To address the current need to integrate the two data types, a novel approach within the group LASSO framework was developed and tested on a metabolomics and transcriptomics data set from a study of intermittent preventive treatment of malaria with pyrimethamine in rhesus macaques (Macaca mulatta).

METHODS: Groups were predefined using biological pathways, and the variables in each group were standardized separately. The leading principal components were obtained for each pathway in each of the two data types and then combined into an integrated matrix, which, together with the group information, served as input for a group LASSO regression model.
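The steps described in METHODS can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the thesis implementation. The function names, the number of principal components kept per pathway, the choice to place the gene and metabolite components of a pathway in one group, the sqrt(group size) penalty weight, and the proximal gradient solver are all assumptions made for the example.

```python
import numpy as np

def leading_pcs(X, n_components):
    """Standardize the columns of X and return the leading PC scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def build_integrated_matrix(pathways, n_components=2):
    """pathways maps a pathway name to (gene_matrix, metabolite_matrix).

    Assumption: the PCs from both data types of one pathway share a group.
    """
    scores, groups, names = [], [], []
    for g, (name, (genes, metabolites)) in enumerate(pathways.items()):
        block = np.hstack([leading_pcs(genes, n_components),
                           leading_pcs(metabolites, n_components)])
        scores.append(block)
        groups.extend([g] * block.shape[1])
        names.append(name)
    return np.hstack(scores), np.array(groups), names

def group_lasso(X, y, groups, lam, n_iter=2000):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2
    by proximal gradient descent (group-wise soft thresholding)."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y) / n)
        for g in np.unique(groups):
            idx = groups == g
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            # Shrink the whole group toward zero, or drop it entirely.
            z[idx] = 0.0 if norm <= thresh else z[idx] * (1.0 - thresh / norm)
        beta = z
    return beta

# Synthetic toy data (hypothetical pathway names, random values).
rng = np.random.default_rng(0)
n = 30
pathways = {
    "pathway_A": (rng.normal(size=(n, 8)), rng.normal(size=(n, 5))),
    "pathway_B": (rng.normal(size=(n, 6)), rng.normal(size=(n, 4))),
}
X, groups, names = build_integrated_matrix(pathways, n_components=2)
y = X[:, 0] + 0.1 * rng.normal(size=n)  # toy response driven by pathway_A
y = y - y.mean()                        # center, since no intercept is fitted
beta = group_lasso(X, y, groups, lam=0.1)
selected = [names[g] for g in np.unique(groups)
            if np.any(beta[groups == g] != 0)]
```

Because the penalty acts on the Euclidean norm of each group's coefficients, a pathway is either dropped as a whole or kept with all of its components, which is what makes the selection pathway-level rather than variable-level.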

RESULTS: We identified multiple pathways that were top contributors to the differences due to pyrimethamine exposure in the macaques, and jointly predicted the association of their member genes and metabolites with plasma hemoglobin levels.

DISCUSSION: The pathways identified by this group LASSO integration approach, and the predicted associations of their member genes and metabolites with plasma hemoglobin levels, are consistent with the current literature and provide high-quality mechanistic hypotheses. Pathway group LASSO is thus a novel and effective method of integrating metabolomics and transcriptomics data.
