Considerations of automated machine learning in clinical metabolic profiling: Altered homocysteine plasma concentration associated with metformin exposure

arXiv.org Machine Learning

With the maturation of metabolomics science and the proliferation of biobanks, clinical metabolic profiling is an increasingly opportunistic frontier for advancing translational clinical research. Automated Machine Learning (AutoML) approaches provide an exciting opportunity to guide feature selection in agnostic metabolic profiling endeavors, where potentially thousands of independent data points must be evaluated. In previous research, AutoML using high-dimensional data of varying types has been demonstrably robust, outperforming traditional approaches. However, considerations for application in clinical metabolic profiling remain to be evaluated, particularly regarding the robustness of AutoML in identifying and adjusting for common clinical confounders. In this study, we present a focused case study regarding AutoML considerations for using the Tree-based Pipeline Optimization Tool (TPOT) in metabolic profiling of exposure to metformin in a biobank cohort. First, we propose a tandem rank-accuracy measure to guide agnostic feature selection and corresponding threshold determination in clinical metabolic profiling endeavors. Second, while AutoML with default parameters showed a potential lack of sensitivity to low-effect confounding clinical covariates, we demonstrate residual training and adjustment of metabolite features as an easily applicable approach to ensure that AutoML adjusts for potential confounding characteristics. Finally, we present increased homocysteine with long-term exposure to metformin as a potentially novel, non-replicated metabolite association suggested by TPOT, an association not identified in parallel clinical metabolic profiling endeavors. While considerations are recommended, including adjustment approaches for clinical confounders, AutoML presents an exciting tool to enhance clinical metabolic profiling and advance translational research endeavors.
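The residual adjustment of metabolite features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the regression model, so this assumes an ordinary least-squares fit of one metabolite on the clinical covariates, with the residuals then used as the adjusted feature. The function names `solve_linear_system` and `residualize`, the covariates (age, BMI), and the toy values are all hypothetical. TPOT itself is not called here.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation is applied to both sides.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def residualize(feature: Sequence[float],
                covariates: Sequence[Sequence[float]]) -> List[float]:
    """Return the feature minus its least-squares fit on the covariates.

    An intercept column is added automatically. This is an assumed,
    illustrative adjustment, not the procedure from the paper.
    """
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, feature)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(design, feature)]


if __name__ == "__main__":
    # Hypothetical toy cohort: (age, BMI) per participant and one metabolite.
    covs = [(30.0, 22.0), (40.0, 30.0), (50.0, 25.0), (60.0, 28.0), (70.0, 24.0)]
    metabolite = [1.0, 3.0, 2.0, 5.0, 4.0]
    print(residualize(metabolite, covs))
```

Under these assumptions, each metabolite column would be replaced by its residuals before being passed to the AutoML search, so that variation linearly explained by the covariates cannot drive feature selection.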


The long reach of the gut

Science

How does the gut microbiota shape the composition and function of distal host organs, despite being segregated in the gut? Uchimura et al. used stable isotope tracing to show that microbial metabolites penetrate host tissues and fluids to influence host immunological and metabolic signaling networks. However, metabolite impact is modulated by a high rate of urinary excretion of microbial products. Furthermore, secretory immunoglobulin A antibodies limit bacterial dwell times in the small intestine, which also ameliorates host exposure to microbial metabolites. The joint effect contributes to resolving gut function as both nutrient gateway and barrier.


Plant nutrient acquisition entices herbivore

Science

To survive in highly complex environments, plants universally rely on specialized, or secondary, metabolites to withstand abiotic challenges (for example, wax to limit transpiration) and biotic challenges (for example, glucosinolates to deter herbivores). These metabolites are lineage-specific, and functional studies usually consider them to have a singular function. However, the complexity of the environment is much larger than the number of secondary metabolites within a plant, indicating that individual specialized metabolites may need to have multiple roles. As the number of functions of a single metabolite increases, so does the number of proteins and processes affected, and there is no guarantee that all of these interactions are positive. On page 694 of this issue, Hu et al. (1) show that benzoxazinoids, a textbook example of specialized metabolites in maize, have a functional duality centered around iron acquisition.


Lysosomal metabolomics reveals V-ATPase- and mTOR-dependent regulation of amino acid efflux from lysosomes

Science

The lysosome degrades and recycles macromolecules, signals to the cytosol and nucleus, and is implicated in many diseases. Here, we describe a method for the rapid isolation of mammalian lysosomes and use it to quantitatively profile lysosomal metabolites under various cell states. Under nutrient-replete conditions, many lysosomal amino acids are in rapid exchange with those in the cytosol. Loss of lysosomal acidification through inhibition of the vacuolar H⁺–adenosine triphosphatase (V-ATPase) increased the luminal concentrations of most metabolites but had no effect on those of the majority of essential amino acids. Instead, nutrient starvation regulates the lysosomal concentrations of these amino acids, an effect we traced to regulation of the mechanistic target of rapamycin (mTOR) pathway.


Accurate, fully-automated NMR spectral profiling for metabolomics

arXiv.org Artificial Intelligence

Many diseases cause significant changes to the concentrations of small molecules (aka metabolites) that appear in a person's biofluids, which means such diseases can often be readily detected from a person's "metabolic profile". This information can be extracted from a biofluid's NMR spectrum. Today, this is often done manually by trained human experts, which means this process is relatively slow, expensive and error-prone. This paper presents a tool, Bayesil, that can quickly, accurately and autonomously produce a complex biofluid's (e.g., serum or CSF) metabolic profile from a 1D ¹H NMR spectrum. This requires first performing several spectral processing steps, then matching the resulting spectrum against a reference compound library, which contains the "signatures" of each relevant metabolite. Many of these steps are novel algorithms, and our matching step views spectral matching as an inference problem within a probabilistic graphical model that rapidly approximates the most probable metabolic profile. Our extensive studies on a diverse set of complex mixtures show that Bayesil can autonomously find the concentration of all NMR-detectable metabolites accurately (~90% correct identification and ~10% quantification error), in <5 minutes on a single CPU. These results demonstrate that Bayesil is the first fully automatic, publicly accessible system that provides quantitative NMR spectral profiling effectively, with an accuracy that meets or exceeds the performance of trained experts. We anticipate this tool will usher in high-throughput metabolomics and enable a wealth of new applications of NMR in clinical settings. Available at http://www.bayesil.ca.
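The idea of matching a spectrum against a library of reference signatures can be illustrated with a deliberately simplified sketch. This is not Bayesil's method: the abstract describes inference in a probabilistic graphical model, whereas the code below swaps in non-negative least squares solved by projected gradient descent. It also assumes perfectly aligned, noise-free signatures. The function name `fit_concentrations` and the toy signatures are hypothetical.

```python
from typing import List, Sequence


def fit_concentrations(spectrum: Sequence[float],
                       library: Sequence[Sequence[float]],
                       iterations: int = 500) -> List[float]:
    """Estimate non-negative weights w so that sum_k w[k] * library[k]
    approximates the spectrum in the least-squares sense.

    Simplified stand-in for spectral matching, not Bayesil's inference.
    """
    k = len(library)
    # Gram matrix G[i][j] = <library[i], library[j]> and b[i] = <library[i], spectrum>.
    gram = [[sum(x * y for x, y in zip(library[i], library[j])) for j in range(k)]
            for i in range(k)]
    b = [sum(x * s for x, s in zip(library[i], spectrum)) for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    if trace == 0.0:
        raise ValueError("reference library contains only zero signatures")
    # The trace bounds the largest eigenvalue of G, so 1/trace is a safe step size.
    step = 1.0 / trace
    w = [0.0] * k
    for _ in range(iterations):
        # Gradient of 0.5 * ||spectrum - sum_k w[k] * library[k]||^2 is G w - b.
        grad = [sum(gram[i][j] * w[j] for j in range(k)) - b[i] for i in range(k)]
        # Gradient step followed by projection onto w >= 0.
        w = [max(0.0, wi - step * gi) for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Two hypothetical reference signatures sampled on the same six-point axis.
    library = [
        [0.0, 1.0, 2.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0, 2.0, 1.0],
    ]
    # Synthetic mixture built from known weights, to check they are recovered.
    mixture = [2.0 * a + 0.5 * c for a, c in zip(library[0], library[1])]
    print(fit_concentrations(mixture, library))
```

The non-negativity constraint reflects that concentrations cannot be negative. Real spectra would additionally require the processing steps the abstract mentions before any such fit is meaningful.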