Abstract

Renal cell carcinoma (RCC) is one of the deadliest urinary cancers today. Detecting and staging RCC involves expensive imaging tests and biopsy, which is invasive and prone to sampling errors. Alternative diagnostic methods that are non-invasive and cost-effective would significantly reduce the global burden of RCC. Given that metabolic rewiring is required for the onset and progression of RCC, and given the proximity of urine to the kidney, I set out to discover a urinary metabolic biomarker for RCC using advances in machine learning. Metabolomics is the study of small molecules in biological samples; as the apogee of the omics trilogy, it is the closest to an organism's phenotype. Untargeted metabolomics affords the unfiltered detection of metabolites. In this dissertation, liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy were used for untargeted metabolite profiling to achieve broad analyte coverage. I used machine learning to mine the resulting metabolomics data. Machine learning (ML) is a type of artificial intelligence that comprises computational techniques for learning patterns in complex datasets. I conducted three categories of ML tasks in the dissertation: binary classification, regression, and ML model interpretation. All data modalities are tabular. Using untargeted metabolomics and ML, I analyzed human urine samples to discriminate between healthy controls and RCC patients and to identify biomarkers that can be used for RCC detection. In addition, I predicted RCC primary tumor sizes from selected urinary metabolites and discriminated early-stage RCC from advanced-stage RCC. Furthermore, I introduced a state-of-the-art interpretable machine learning (IML) technique called Shapley Additive Explanations (SHAP). SHAP was used to explain ML models developed for a publicly available clinical metabolomics dataset and to explain ML models for RCC detection.
These studies led to the accurate detection and staging of RCC in the study cohort and the identification of some novel metabolic markers. In addition, the ML methods presented in the thesis can be used to advance biomarker discoveries in other omics fields.
