Materials
A new Reinforcement Learning framework to discover natural flavor molecules
Queiroz, Luana P., Rebello, Carine M., Costa, Erbet A., Santana, Vinícius V., Rodrigues, Bruno C. L., Rodrigues, Alírio E., Ribeiro, Ana M., Nogueira, Idelfonso B. R.
Flavor is the focal point of the flavor industry, which follows social trends and behaviors. Research and development of new flavoring agents and molecules is essential in this field, and the development of natural flavors plays a critical role in modern society. In light of this, the present work proposes a novel framework based on Scientific Machine Learning to address an emerging problem in flavor engineering and industry: the design of new natural flavor molecules. The generated molecules are evaluated with respect to synthetic accessibility, number of atoms, and likeness to a natural or pseudo-natural product.
Spectroscopy and Chemometrics + Machine-Learning News Weekly #36, 2022
Near-Infrared Spectroscopy (NIRS): "Comparing Calibration Algorithms for the Rapid Characterization of Pretreated Corn Stover Using Near-Infrared Spectroscopy"; "Indirect Measurement of β-Glucan Content in Barley Grain with Near-Infrared Reflectance Spectroscopy"; "Foods: Markov Transition Field Combined with Convolutional Neural Network Improved the Predictive Performance of Near-Infrared Spectroscopy Models for Determination of Aflatoxin B1 in Maize"; "Determination of Fruit Freshness Using Near-Infrared Spectroscopy and Machine Learning Techniques"; "Extensive evaluation of prediction ...
The US doesn't know where its critical minerals are. AI could help find them.
The energy transition requires critical minerals. Though the U.S. has plentiful resources of its own, the country has largely relied on foreign sources. That's in part because one major roadblock to accessing American critical mineral deposits is that they remain largely unmapped. That may be about to change, though. The Department of Defense and the U.S. Geological Survey have issued two separate challenges to explore using artificial intelligence and machine learning to expedite USGS' task of assessing the availability and mining potential of 50 critical minerals.
Explaining Results of Multi-Criteria Decision Making
Erwig, Martin, Kumar, Prashant
We introduce a method for explaining the results of various linear and hierarchical multi-criteria decision-making (MCDM) techniques such as WSM and AHP. The two key ideas are (A) to maintain a fine-grained representation of the values manipulated by these techniques and (B) to derive explanations from these representations through merging, filtering, and aggregating operations. An explanation in our model presents a high-level comparison of two alternatives in an MCDM problem, presumably an optimal and a non-optimal one, illuminating why one alternative was preferred over the other one. We show the usefulness of our techniques by generating explanations for two well-known examples from the MCDM literature. Finally, we show their efficacy by performing computational experiments.
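The two key ideas can be illustrated for the weighted sum model (WSM) with a minimal sketch. It is not the authors' implementation: the function names, the criteria, and the car data are hypothetical. It keeps the per-criterion weighted contributions instead of collapsing them into one score, then filters and ranks the differences between two alternatives.

```python
def wsm_contributions(weights, scores):
    """Fine-grained representation: one weighted value per criterion."""
    return {c: weights[c] * scores[c] for c in weights}


def explain(weights, preferred, other):
    """Compare two alternatives criterion by criterion.

    Returns the overall score difference and the criteria ranked by how
    strongly they drive that difference (criteria with no effect are dropped).
    """
    a = wsm_contributions(weights, preferred)
    b = wsm_contributions(weights, other)
    diff = {c: a[c] - b[c] for c in weights}
    ranked = sorted(
        ((c, d) for c, d in diff.items() if d != 0),
        key=lambda cd: -abs(cd[1]),
    )
    return {"total_difference": sum(diff.values()), "ranked": ranked}


# Hypothetical example data (higher score = better on that criterion).
weights = {"price": 0.5, "comfort": 0.3, "safety": 0.2}
car_a = {"price": 6, "comfort": 8, "safety": 9}
car_b = {"price": 8, "comfort": 5, "safety": 6}

result = explain(weights, car_a, car_b)
for criterion, delta in result["ranked"]:
    side = "favors A" if delta > 0 else "favors B"
    print(f"{criterion}: {delta:+.2f} ({side})")
```

In this toy case the single aggregated score would only say that A wins; the retained contributions show that A wins on comfort and safety despite losing on price.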
Predict+Optimize for Packing and Covering LPs with Unknown Parameters in Constraints
Hu, Xinyi, Lee, Jasper C. H., Lee, Jimmy H. M.
Predict+Optimize is a recently proposed framework which combines machine learning and constrained optimization, tackling optimization problems that contain parameters that are unknown at solving time. The goal is to predict the unknown parameters and use the estimates to solve for an estimated optimal solution to the optimization problem. However, all prior works have focused on the case where unknown parameters appear only in the optimization objective and not the constraints, for the simple reason that if the constraints were not known exactly, the estimated optimal solution might not even be feasible under the true parameters. The contributions of this paper are two-fold. First, we propose a novel and practically relevant framework for the Predict+Optimize setting, but with unknown parameters in both the objective and the constraints. We introduce the notion of a correction function, and an additional penalty term in the loss function, modelling practical scenarios where an estimated optimal solution can be modified into a feasible solution after the true parameters are revealed, but at an additional cost. Second, we propose a corresponding algorithmic approach for our framework, which handles all packing and covering linear programs. Our approach is inspired by the prior work of Mandi and Guns, though with crucial modifications and re-derivations for our very different setting. Experimentation demonstrates the superior empirical performance of our method over classical approaches.
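The role of a correction function and penalty can be sketched for a packing LP (maximize c·x subject to A x ≤ b, x ≥ 0, with non-negative A, b, c). The code below is an illustration under stated assumptions, not the paper's definitions: the correction used here (uniformly scaling the estimated solution down until it is feasible) and the penalty (a rate times the objective value lost to the correction) are hypothetical choices, and the true optimum is passed in rather than solved for.

```python
def objective(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))


def correct_packing(x_est, A_true, b):
    """Hypothetical correction: shrink x_est by the largest factor in [0, 1]
    that makes it satisfy every true constraint A_true x <= b."""
    scale = 1.0
    for row, rhs in zip(A_true, b):
        load = sum(a * x for a, x in zip(row, x_est))
        if load > rhs:  # constraint violated under the true parameters
            scale = min(scale, rhs / load)
    return [scale * x for x in x_est]


def post_hoc_regret(c, x_est, x_true_opt, A_true, b, penalty_rate):
    """Loss of the corrected solution relative to the true optimum,
    plus an assumed penalty for having to correct after the fact."""
    x_corr = correct_packing(x_est, A_true, b)
    penalty = penalty_rate * (objective(c, x_est) - objective(c, x_corr))
    return objective(c, x_true_opt) - objective(c, x_corr) + penalty


# Toy instance: one constraint 2*x1 + x2 <= 4, objective 3*x1 + 2*x2.
c = [3.0, 2.0]
A_true = [[2.0, 1.0]]
b = [4.0]
x_est = [2.0, 2.0]       # solution computed from mispredicted constraints
x_true_opt = [0.0, 4.0]  # optimum under the true constraints

regret = post_hoc_regret(c, x_est, x_true_opt, A_true, b, penalty_rate=0.5)
print(round(regret, 6))
```

If the estimated solution is already feasible, the scale stays at 1, the penalty vanishes, and the expression reduces to ordinary regret.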
Analysis and Evaluation of Synchronous and Asynchronous FLchain
Wilhelmi, Francesc, Giupponi, Lorenza, Dini, Paolo
Motivated by the heterogeneous nature of devices participating in large-scale Federated Learning (FL) optimization, we focus on an asynchronous server-less FL solution empowered by blockchain (BC) technology. In contrast to the most widely adopted FL approaches, which assume synchronous operation, we advocate an asynchronous method whereby model aggregation is done as clients submit their local updates. The asynchronous setting fits well with the federated optimization idea in practical large-scale settings with heterogeneous clients. Thus, it potentially leads to higher efficiency in terms of communication overhead and idle periods. To evaluate the learning completion delay of BC-enabled FL, we provide an analytical model based on batch service queue theory. Furthermore, we provide simulation results to assess the performance of both synchronous and asynchronous mechanisms. Important aspects involved in the BC-enabled FL optimization, such as the network size, link capacity, or user requirements, are put together and analyzed. As our results show, the synchronous setting leads to higher prediction accuracy than the asynchronous case. Nevertheless, asynchronous federated optimization provides much lower latency in many cases, thus becoming an appealing solution for FL when dealing with large datasets, tight timing constraints (e.g., near-real-time applications), or highly varying training data.
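The latency difference between the two modes can be seen in a toy sketch. It does not model the blockchain or the batch-service queue analyzed in the paper. Model weights are reduced to a single number, and the asynchronous mixing rule with factor `mix` is an assumption made for illustration.

```python
def sync_round(updates):
    """Synchronous aggregation: wait for every client, then average.

    updates: list of (arrival_time, local_weight) pairs.
    Returns (time at which the new model is available, new weight).
    """
    finish = max(t for t, _ in updates)
    new_w = sum(w for _, w in updates) / len(updates)
    return finish, new_w


def async_round(global_w, updates, mix=0.5):
    """Asynchronous aggregation: fold in each update as soon as it arrives.

    Returns the list of (time, weight) states of the global model.
    The mixing rule w <- (1 - mix) * w + mix * local is an assumed choice.
    """
    history = []
    w = global_w
    for t, local in sorted(updates):
        w = (1 - mix) * w + mix * local
        history.append((t, w))
    return history


# Three heterogeneous clients; the last one is a straggler.
updates = [(1.0, 2.0), (3.0, 4.0), (10.0, 6.0)]

sync_time, sync_w = sync_round(updates)
async_history = async_round(0.0, updates)

print("sync model ready at t =", sync_time)
print("async model first updated at t =", async_history[0][0])
```

The synchronous model is only available once the straggler reports, while the asynchronous model improves from the first arrival onward, at the cost of weighting clients unevenly.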
Deep autoencoders for physics-constrained data-driven nonlinear materials modeling
He, Xiaolong, He, Qizhi, Chen, Jiun-Shyan
Physics-constrained data-driven computing is an emerging computational paradigm that enables simulation of complex materials directly from a material database, bypassing classical constitutive model construction. However, it remains difficult to deal with high-dimensional applications and extrapolative generalization. This paper introduces deep learning techniques under the data-driven framework to address these fundamental issues in nonlinear materials modeling. To this end, an autoencoder neural network architecture is introduced to learn the underlying low-dimensional representation (embedding) of the given material database. The offline-trained autoencoder and the discovered embedding space are then incorporated in the online data-driven computation such that the search for the optimal material state in the database can be performed in a low-dimensional space, aiming to enhance the robustness and predictability with projected material data. To ensure numerical stability and a representative constitutive manifold, a convexity-preserving interpolation scheme tailored to the proposed autoencoder-based data-driven solver is proposed for constructing the material state. In this study, the applicability of the proposed approach is demonstrated by modeling nonlinear biological tissues. A parametric study on data noise, data size and sparsity, training initialization, and model architectures is also conducted to examine the robustness and convergence property of the proposed approach.
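What "convexity-preserving" means for an interpolation scheme can be shown with a small sketch. Inverse-distance (Shepard) weighting is used here as one example of such a scheme, which is an assumption rather than a statement of the paper's exact formulation. The autoencoder is omitted, and the points stand in for embedded material data.

```python
import math


def shepard_weights(query, neighbors, power=2, eps=1e-12):
    """Inverse-distance weights: non-negative and summing to one."""
    dists = [math.dist(query, p) for p in neighbors]
    for i, d in enumerate(dists):
        if d < eps:  # query coincides with a data point
            return [1.0 if j == i else 0.0 for j in range(len(neighbors))]
    inv = [1.0 / d ** power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def interpolate(query, neighbors):
    """Convex combination of the neighbors, so the result always lies
    inside their convex hull and cannot drift away from the data."""
    w = shepard_weights(query, neighbors)
    dim = len(neighbors[0])
    return [sum(wi * p[k] for wi, p in zip(w, neighbors)) for k in range(dim)]


# Hypothetical embedded data points near a query state.
neighbors = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]
state = interpolate((0.0, 0.5), neighbors)
print(state)
```

Because the weights are non-negative and sum to one, the reconstructed state is bounded by the neighboring data, which is the property that supports numerical stability in this kind of solver.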
Inference and dynamic decision-making for deteriorating systems with probabilistic dependencies through Bayesian networks and deep reinforcement learning
Morato, Pablo G., Andriotis, Charalampos P., Papakonstantinou, Konstantinos G., Rigo, Philippe
In the context of modern environmental and societal concerns, there is an increasing demand for methods able to identify management strategies for civil engineering systems, minimizing structural failure risks while optimally planning inspection and maintenance (I&M) processes. Most available methods simplify the I&M decision problem to the component level due to the computational complexity associated with global optimization methodologies under joint system-level state descriptions. In this paper, we propose an efficient algorithmic framework for inference and decision-making under uncertainty for engineering systems exposed to deteriorating environments, providing optimal management strategies directly at the system level. In our approach, the decision problem is formulated as a factored partially observable Markov decision process, whose dynamics are encoded in Bayesian network conditional structures. The methodology can handle environments under equal or general, unequal deterioration correlations among components, through Gaussian hierarchical structures and dynamic Bayesian networks. In terms of policy optimization, we adopt a deep decentralized multi-agent actor-critic (DDMAC) reinforcement learning approach, in which the policies are approximated by actor neural networks guided by a critic network. By including deterioration dependence in the simulated environment, and by formulating the cost model at the system level, DDMAC policies intrinsically consider the underlying system-effects. This is demonstrated through numerical experiments conducted for both a 9-out-of-10 system and a steel frame under fatigue deterioration. Results demonstrate that DDMAC policies offer substantial benefits when compared to state-of-the-art heuristic approaches. The inherent consideration of system-effects by DDMAC strategies is also interpreted based on the learned policies.
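For intuition about the 9-out-of-10 system, a minimal sketch of the failure probability of a k-out-of-n system follows. It assumes the usual reading that the system works when at least k of its n components work, and it assumes independent, identical components, which is a simplification: the paper specifically treats correlated deterioration, which this closed form ignores.

```python
from math import comb


def k_out_of_n_failure(k, n, p_fail):
    """Failure probability of a system that needs at least k of n
    components working, for independent components that each fail
    with probability p_fail."""
    p_ok = 1.0 - p_fail
    p_system_ok = sum(
        comb(n, m) * p_ok ** m * p_fail ** (n - m) for m in range(k, n + 1)
    )
    return 1.0 - p_system_ok


# 9-out-of-10 system with a hypothetical 10% component failure probability.
print(k_out_of_n_failure(9, 10, 0.1))
```

Even this simplified case shows why component-level decisions can be misleading: with a 10% component failure probability the system fails far more often than any single component does.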
A Framework for Extracting and Encoding Features from Object-Centric Event Data
Adams, Jan Niklas, Park, Gyunam, Levich, Sergej, Schuster, Daniel, van der Aalst, Wil M. P.
Traditional process mining techniques take event data as input where each event is associated with exactly one object. An object represents the instantiation of a process. Object-centric event data contain events associated with multiple objects expressing the interaction of multiple processes. As traditional process mining techniques assume events associated with exactly one object, these techniques cannot be applied to object-centric event data. To use traditional process mining techniques, the object-centric event data are flattened by removing all object references but one. The flattening process is lossy, leading to inaccurate features extracted from flattened data. Furthermore, the graph-like structure of object-centric event data is lost when flattening. In this paper, we introduce a general framework for extracting and encoding features from object-centric event data. We calculate features natively on the object-centric event data, leading to accurate measures. Furthermore, we provide three encodings for these features: tabular, sequential, and graph-based. While tabular and sequential encodings have been heavily used in process mining, the graph-based encoding is a new technique preserving the structure of the object-centric event data. We provide six use cases: a visualization and a prediction use case for each of the three encodings. We use explainable AI in the prediction use cases to show the utility of both the object-centric features and the structure of the sequential and graph-based encoding for a predictive model.
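Why flattening is lossy can be seen on a small hypothetical event log. The event structure and the `flatten` helper below are illustrative assumptions, not the framework's data model: each event references objects of several types, and flattening keeps only the references of one chosen type.

```python
# Hypothetical object-centric event log: one order containing two items.
events = [
    {"activity": "place order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"activity": "pick item", "objects": {"item": ["i1"]}},
    {"activity": "pick item", "objects": {"item": ["i2"]}},
    {"activity": "ship order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
]


def flatten(events, object_type):
    """Build one trace per object of the chosen type.

    Events without an object of that type disappear, and events with
    several such objects are copied into several traces.
    """
    traces = {}
    for event in events:
        for obj in event["objects"].get(object_type, []):
            traces.setdefault(obj, []).append(event["activity"])
    return traces


by_order = flatten(events, "order")
by_item = flatten(events, "item")

print(by_order)  # the two "pick item" events are gone
print(by_item)   # "place order" and "ship order" now appear twice
```

Counting activities on either flattened view gives a wrong picture (zero picks per order, or two order placements), which is the kind of inaccuracy that computing features natively on the object-centric data avoids.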
Efficient Chemical Space Exploration Using Active Learning Based on Marginalized Graph Kernel: an Application for Predicting the Thermodynamic Properties of Alkanes with Molecular Simulation
Xiang, Yan, Tang, Yu-Hang, Gong, Zheng, Liu, Hongyi, Wu, Liang, Lin, Guang, Sun, Huai
We introduce an explorative active learning (AL) algorithm based on Gaussian process regression and marginalized graph kernel (GPR-MGK) to explore chemical space with minimum cost. Using high-throughput molecular dynamics simulation to generate data and a graph neural network (GNN) for prediction, we constructed an active learning molecular simulation framework for thermodynamic property prediction. Specifically, targeting 251,728 alkane molecules consisting of 4 to 19 carbon atoms and their liquid physical properties (densities, heat capacities, and vaporization enthalpies), we use the AL algorithm to select the most informative molecules to represent the chemical space. Validation on computational and experimental test sets shows that only 313 molecules (0.124% of the total) were sufficient to train an accurate GNN model with $R^2 > 0.99$ for computational test sets and $R^2 > 0.94$ for experimental test sets. We highlight two advantages of the presented AL algorithm: compatibility with high-throughput data generation and reliable uncertainty quantification.
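The selection step of uncertainty-driven active learning can be sketched with a toy Gaussian process. This is an illustration under strong assumptions, not the GPR-MGK method: molecules are replaced by scalar features, the marginalized graph kernel by a hypothetical RBF kernel, and the acquisition rule is simply "pick the candidate with the largest posterior variance".

```python
import math


def rbf(a, b, length=1.0):
    """Stand-in similarity kernel (the paper uses a graph kernel)."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def posterior_variance(x, labeled, noise=1e-6):
    """GP predictive variance at x: k(x,x) - k^T (K + noise*I)^-1 k."""
    K = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(labeled)]
        for i, a in enumerate(labeled)
    ]
    k = [rbf(x, a) for a in labeled]
    alpha = solve(K, k)
    return rbf(x, x) - sum(ki * ai for ki, ai in zip(k, alpha))


def select_next(pool, labeled):
    """Pick the unlabeled candidate the model is least certain about."""
    return max(pool, key=lambda x: posterior_variance(x, labeled))


labeled = [0.0, 1.0]    # candidates that already have simulated data
pool = [0.5, 1.1, 3.0]  # candidates not yet simulated
print(select_next(pool, labeled))
```

Note that the variance depends only on which candidates are labeled, not on their property values, so selection can run ahead of the simulations, which is consistent with the compatibility with high-throughput data generation that the abstract highlights.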