Abstract
Living organisms have a biochemical network that responds to perturbations, including food intake, environmental changes, and interactions with other organisms. Knowledge of regulation and dynamics in biological networks is crucial for understanding disease mechanisms and optimizing industrial fermentation. Metabolomics techniques can profile the metabolome and the biochemical network. However, most current studies sample a single time point and therefore do not capture the complete dynamics, and they often miss a considerable proportion of the curated network. This is mainly because time-series data are difficult to collect and computational approaches for high-dimensional dynamics are lacking. This leaves us with an incomplete understanding of the network and its role in disease.

Here, I present a workflow to extract biological and chemical knowledge from the dynamic in vivo metabolome. The workflow is composed of experimental profiling of metabolic dynamics, feature extraction from complex data, and knowledge discovery from the time series. First, continuous in vivo metabolism by nuclear magnetic resonance (CIVM-NMR) was built to record the in vivo metabolome through time in multiple organisms, particularly Neurospora crassa. After perturbations in oxygen levels and carbon sources, different response profiles were recorded and analyzed. CIVM-NMR experiments often yield multiple perturbation datasets, each composed of multiple spectra, and each spectrum at one time point has hundreds of peaks, producing a complex data structure. Second, I designed several computational approaches to extract features from these complex datasets. I built Ridge Tracking-based Extract (RTExtract) to extract NMR features from the time-series spectra, even in cases of highly overlapped and crossing peaks. To improve accuracy in overlapping regions and promote automation, I also built spectral automatic NMR decomposition (SAND), which automates preprocessing and decomposes NMR spectra. With RTExtract and SAND, a complex NMR dataset can be reduced to a table of peaks at different time points. Third, I extracted biochemical knowledge regarding the in vivo metabolome from this high-dimensional time-series dataset by dimensionality reduction and network construction. Together, these steps form an end-to-end workflow for extracting knowledge from perturbed in vivo systems, and the workflow will be applied to broader time-series studies, particularly in precision medicine.