 Materials


Spectroscopy and Chemometrics/Machine-Learning News Weekly #47, 2022 – NIR Calibration Model

#artificialintelligence

"Testing two NIRs instruments to predict chicken breast meat quality and exploiting machine learning approaches to discriminate among genotypes and presence of …"
"Discrimination of Minced Mutton Adulteration Based on Sized-Adaptive Online NIRS Information and 2D Conventional Neural Network"
"Sequential data-fusion of near-infrared and mid-infrared spectroscopy data for improved prediction of quality traits in tuber flours"
"End-point determination of the extraction processes for Stevia rebaudiana Bertoni leaves by near-infrared spectroscopy"
"Rapid detection of adulteration in powder of ginger (Zingiber officinale Roscoe) by FT-NIR spectroscopy combined with chemometrics"
"Extended molar absorption analysis of confined states of water in reverse micelles using near-infrared spectroscopy"
"Online quantitative substrate, product, and cell concentration in citric acid fermentation using near-infrared spectroscopy combined with chemometrics"
"Discrimination of Chemical Oxygen Demand Pollution in Surface Water Based on Visible Near-Infrared Spectroscopy"
"Handheld NIR Spectral Sensor Module Based on a Fully-Integrated Detector Array"
"Spectra Transfer based Learning for Predicting and Classifying Soil Texture with Short-Ranged Vis-NIRS Sensor"
"AS-polRI: Analysis of apparent spectral polarization radiant intensity in the midwave infrared band for man-made object detection"
"Sensors: Single Seed Identification in Three Medicago Species via Multispectral Imaging Combined with Stacking Ensemble Learning"
"Transfer Strategy for Near Infrared Analysis Model of Holocellulose and Lignin Based on Improved Slope/Bias Algorithm"
"Flipped detection of psychoactive substances in complex mixtures using handheld Raman spectroscopy coupled to chemometrics"
"Adaptive Spectral Model for abnormality detection based on physiological status monitoring of dairy cows"
"Foods: A Rapid Prediction Method of Moisture Content for Green Tea Fixation Based on WOA-Elman"
"Fibers: Numerical Study of Mid-IR Ultrashort Pulse Reconstruction Based on Processing of Spectra Converted in Chalcogenide Fibers with High Kerr Nonlinearity"
"Evaluation of a digital micro-mirror device based near-infrared spectrometer for rapid and accurate prediction of quality attributes in poultry feed"
"Agriculture: Grazing Intensity Has More Effect on the Potential Nitrification Activity Than the Potential Denitrification Activity in An Alpine Meadow"


Performance Evaluation, Optimization and Dynamic Decision in Blockchain Systems: A Recent Overview

arXiv.org Artificial Intelligence

With the rapid development of blockchain technology and its integration into various application areas, performance evaluation, performance optimization, and dynamic decision in blockchain systems are playing an increasingly important role in developing new blockchain technology. This paper provides a recent systematic overview of this class of research, with particular attention to the development of mathematical modeling and basic theory for blockchain systems. Important examples include (a) performance evaluation: Markov processes, queuing theory, Markov reward processes, random walks, fluid and diffusion approximations, and martingale theory; (b) performance optimization: linear programming, nonlinear programming, integer programming, and multi-objective programming; (c) optimal control and dynamic decision: Markov decision processes and stochastic optimal control; and (d) artificial intelligence: machine learning, deep reinforcement learning, and federated learning. So far, little research has focused on these research lines. We believe that the basic theory, together with the mathematical methods, algorithms, and simulations of blockchain systems discussed in this paper, will strongly support the future development and continuous innovation of blockchain technology.
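As a small illustration of the queuing-theory style of performance evaluation listed above, the sketch below treats a transaction pool as a textbook M/M/1 queue. This model choice is an assumption made here for illustration only; the abstract does not say which queue the surveyed papers use, and real blockchain queues are often served in batches. The function name and the example rates are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (illustrative model, not from the paper).

    arrival_rate: mean transactions arriving per unit time (lambda)
    service_rate: mean transactions confirmed per unit time (mu)
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable unless arrival_rate < service_rate")
    utilization = arrival_rate / service_rate            # rho = lambda / mu
    mean_in_system = utilization / (1.0 - utilization)   # L = rho / (1 - rho)
    mean_sojourn = 1.0 / (service_rate - arrival_rate)   # W = 1 / (mu - lambda)
    return {
        "utilization": utilization,
        "mean_in_system": mean_in_system,
        "mean_sojourn": mean_sojourn,
    }


# Hypothetical rates: 2 transactions arrive and up to 5 are confirmed per second.
metrics = mm1_metrics(arrival_rate=2.0, service_rate=5.0)
print(metrics)
```

The two outputs are tied together by Little's law (mean number in system equals arrival rate times mean sojourn time), which gives a quick sanity check on any such model.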


Discretized Linear Regression and Multiclass Support Vector Based Air Pollution Forecasting Technique

arXiv.org Artificial Intelligence

Air pollution is a critical issue in developing countries, arising from the uncontrolled use of traditional energy sources. Hence, ingenious air pollution forecasting methods are indispensable to minimize the risk. To that end, this paper proposes an Internet of Things (IoT) enabled system for monitoring and controlling air pollution in the cloud computing environment. A method called Linear Regression and Multiclass Support Vector (LR-MSV) IoT-based Air Pollution Forecast is proposed to monitor the air quality data and the air quality index measurement, paving the way for effective control. Extensive experiments carried out on the air quality data in the India dataset have revealed the outstanding performance of the proposed LR-MSV method when benchmarked against well-established state-of-the-art methods. The LR-MSV method achieves significantly higher air pollution forecasting accuracy, with lower forecasting time and error rate, than the other state-of-the-art methods.


Composition based oxidation state prediction of materials using deep learning

arXiv.org Artificial Intelligence

Oxidation states are the charges on atoms after ionic approximation of their bonds, and they have been widely used in charge-neutrality verification, crystal structure determination, and reaction estimation. Currently, only heuristic rules, which have many exceptions, exist for guessing the oxidation states of a given compound. Recent work has developed machine learning models based on heuristic structural features for predicting the oxidation states of metal ions. However, composition-based oxidation state prediction remains elusive, even though it is more important in new material discovery, where the structures are not even available. This work proposes BERTOS, a novel deep learning-based BERT transformer language model, for predicting the oxidation states of all elements of inorganic compounds given only their chemical composition. Oxidation states (OS) are fundamental attributes of elements that help to explain redox reactions, reactivity, chemical bonding, and chemical properties of different elements and compounds. In electrochemistry, oxidation states are used to represent relevant compounds and ions in Latimer and Frost diagrams, and they can also be used to calculate the charge neutrality of chemical compounds to screen potential hypothetical materials generated by computational design algorithms. Oxidation states have also been used to study the complexes of transition metals.
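The charge-neutrality screening mentioned above can be sketched in a few lines. This is a minimal illustration, not the BERTOS model: it brute-forces candidate oxidation states and keeps the assignments whose total charge is zero. It assumes one oxidation state per element (no mixed valence), and the function names and candidate lists are hypothetical inputs chosen for the example.

```python
from itertools import product


def total_charge(composition, states):
    """Sum of (atom count x oxidation state) over all elements."""
    return sum(composition[el] * states[el] for el in composition)


def neutral_assignments(composition, candidates):
    """Enumerate oxidation-state assignments that make the formula neutral.

    composition: element -> number of atoms in the formula unit
    candidates:  element -> list of oxidation states to try (assumed input)
    Simplification: every atom of an element gets the same oxidation state.
    """
    elements = sorted(composition)
    results = []
    for combo in product(*(candidates[el] for el in elements)):
        states = dict(zip(elements, combo))
        if total_charge(composition, states) == 0:
            results.append(states)
    return results


# Fe2O3 with candidate states Fe(+2, +3) and O(-2).
fe2o3 = {"Fe": 2, "O": 3}
print(neutral_assignments(fe2o3, {"Fe": [2, 3], "O": [-2]}))
# [{'Fe': 3, 'O': -2}]
```

A composition-based predictor would replace the hand-written candidate lists with model output; the neutrality check itself stays the same.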


Identifying Chemicals Through Dimensionality Reduction

arXiv.org Artificial Intelligence

Civilizations have tried to make drinking water safe to consume for thousands of years. The process of determining water contaminants has evolved with the complexity of the contaminants due to pesticides and heavy metals. The routine procedure to determine water safety is to use targeted analysis, which searches for specific substances from some known list; however, we do not explicitly know which substances should be on this list. Before experimentally determining which substances are contaminants, how do we answer the sampling problem of identifying all the substances in the water? Here, we present an approach that builds on the work of Jaanus Liigand et al., who used non-targeted analysis, which conducts a broader search on the sample, to develop a random-forest regression model that predicts the names of all the substances in a sample as well as their respective concentrations [1]. This work utilizes techniques from dimensionality reduction and linear decompositions to present a more accurate model, using data from the European Massbank Metabolome Library to produce a global list of chemicals that researchers can then identify and test for when purifying water.


Neural Graph Databases

arXiv.org Artificial Intelligence

Graph databases (GDBs) enable processing and analysis of unstructured, complex, rich, and usually vast graph datasets. Despite the large significance of GDBs in both academia and industry, little effort has been made to integrate them with the predictive power of graph neural networks (GNNs). In this work, we show how to seamlessly combine nearly any GNN model with the computational capabilities of GDBs. For this, we observe that the majority of these systems are based on, or support, a graph data model called the Labeled Property Graph (LPG), where vertices and edges can have arbitrarily complex sets of labels and properties. We then develop LPG2vec, an encoder that transforms an arbitrary LPG dataset into a representation that can be directly used with a broad class of GNNs, including convolutional, attentional, message-passing, and even higher-order or spectral models. In our evaluation, we show that the rich information represented as LPG labels and properties is properly preserved by LPG2vec, and that it increases the accuracy of predictions, regardless of the targeted learning task or the GNN model used, by up to 34% compared to graphs with no LPG labels/properties. In general, LPG2vec enables combining the predictive power of the most powerful GNNs with the full scope of information encoded in the LPG model, paving the way for neural graph databases, a class of systems where the vast complexity of maintained data will benefit from modern and future graph machine learning methods.
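To make the idea of turning LPG labels and properties into GNN input features concrete, here is a minimal sketch. It is not the LPG2vec implementation; the abstract does not describe its encoding in detail. The sketch assumes a multi-hot encoding of vertex labels followed by selected numeric properties, and the function name, the vertex dictionary layout, and the default value for missing properties are all hypothetical.

```python
def encode_vertices(vertices, numeric_props, missing=0.0):
    """Build one feature row per vertex from its labels and numeric properties.

    vertices:      list of dicts with 'labels' (set of str) and 'props' (dict)
    numeric_props: property names to append as numeric features
    missing:       value used when a vertex lacks a property (an assumption)
    Returns (label_vocab, rows) where each row is a list of floats.
    """
    # Fixed, sorted vocabulary so every vertex uses the same column order.
    vocab = sorted({label for v in vertices for label in v["labels"]})
    rows = []
    for v in vertices:
        label_part = [1.0 if label in v["labels"] else 0.0 for label in vocab]
        prop_part = [float(v["props"].get(name, missing)) for name in numeric_props]
        rows.append(label_part + prop_part)
    return vocab, rows


vertices = [
    {"labels": {"Person"}, "props": {"age": 30}},
    {"labels": {"Person", "Author"}, "props": {"age": 45}},
    {"labels": {"Paper"}, "props": {}},
]
vocab, features = encode_vertices(vertices, ["age"])
print(vocab)     # ['Author', 'Paper', 'Person']
print(features)  # [[0.0, 0.0, 1.0, 30.0], [1.0, 0.0, 1.0, 45.0], [0.0, 1.0, 0.0, 0.0]]
```

The resulting rows could serve as the initial node feature matrix of a GNN; edges would need an analogous encoding of their own labels and properties.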


Spherical Message Passing for 3D Graph Networks

arXiv.org Artificial Intelligence

We consider representation learning of 3D molecular graphs in which each atom is associated with a spatial position in 3D. This is an under-explored area of research, and a principled message passing framework is currently lacking. In this work, we conduct analyses in the spherical coordinate system (SCS) for the complete identification of 3D graph structures. Based on such observations, we propose the spherical message passing (SMP) as a novel and powerful scheme for 3D molecular learning. SMP dramatically reduces training complexity, enabling it to perform efficiently on large-scale molecules. In addition, SMP is capable of distinguishing almost all molecular structures, and the uncovered cases may not exist in practice. Based on meaningful physically-based representations of 3D information, we further propose the SphereNet for 3D molecular learning. Experimental results demonstrate that the use of meaningful 3D information in SphereNet leads to significant performance improvements in prediction tasks. Our results also demonstrate the advantages of SphereNet in terms of capability, efficiency, and scalability. Our code is publicly available as part of the DIG library (https://github.com/divelab/DIG).
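Working in a spherical coordinate system means describing an atom's relative position by a distance, an angle, and a torsion rather than by raw Cartesian coordinates. The sketch below computes these three quantities from 3D positions. It is only a geometric illustration: which reference atoms SMP uses to define each quantity is not stated in the abstract, so the choice of points here is an assumption, and the sign of the torsion depends on the convention adopted.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(a):
    return math.sqrt(dot(a, a))


def distance(p, q):
    """Euclidean distance between two atoms."""
    return norm(sub(q, p))


def angle(p, q, r):
    """Angle at q between the directions q->p and q->r, in radians."""
    u, v = sub(p, q), sub(r, q)
    return math.atan2(norm(cross(u, v)), dot(u, v))


def torsion(p0, p1, p2, p3):
    """Signed dihedral angle about the p1->p2 axis, in radians."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)   # normals of the two planes
    axis = tuple(x / norm(b1) for x in b1)  # unit vector along the shared bond
    m1 = cross(n1, axis)
    return math.atan2(dot(m1, n2), dot(n1, n2))


# Four hypothetical atom positions forming a right-angled chain.
a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
print(distance(b, c), angle(a, b, c), torsion(a, b, c, d))
```

All three quantities are unchanged by rigid translation and rotation of the molecule, which is why they are attractive inputs for 3D molecular models.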


The Design Space of E(3)-Equivariant Atom-Centered Interatomic Potentials

arXiv.org Artificial Intelligence

The rapid progress of machine learning interatomic potentials over the past couple of years has produced a number of new architectures. Particularly notable among these are the Atomic Cluster Expansion (ACE), which unified many of the earlier ideas around atom density-based descriptors, and Neural Equivariant Interatomic Potentials (NequIP), a message passing neural network with equivariant features that showed state-of-the-art accuracy. In this work, we construct a mathematical framework that unifies these models: ACE is generalised so that it can be recast as one layer of a multi-layer architecture. From another point of view, the linearised version of NequIP is understood as a particular sparsification of a much larger polynomial model. Our framework also provides a practical tool for systematically probing different choices in the unified design space. We demonstrate this by an ablation study of NequIP via a set of experiments looking at in- and out-of-domain accuracy and smooth extrapolation very far from the training data, and shed some light on which design choices are critical for achieving high accuracy. Finally, we present BOTNet (Body-Ordered-Tensor-Network), a much-simplified version of NequIP, which has an interpretable architecture and maintains accuracy on benchmark datasets.


Learning Regularized Positional Encoding for Molecular Prediction

arXiv.org Artificial Intelligence

Machine learning has become a promising approach for molecular modeling. Positional quantities, such as interatomic distances and bond angles, play a crucial role in molecular physics. Existing works rely on careful manual design of their representation. To model the complex nonlinearity in predicting molecular properties in a more end-to-end approach, we propose to encode the positional quantities with a learnable embedding that is continuous and differentiable. A regularization technique is employed to encourage embedding smoothness along the physical dimension. We experiment with a variety of molecular property and force field prediction tasks. Improved performance is observed for three different model architectures after plugging in the proposed positional encoding method. In addition, the learned positional encoding allows easier physics-based interpretation. We observe that tasks with similar physics have similar learned positional encodings.
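One simple way to picture a continuous learnable embedding with a smoothness regularizer is a table of vectors placed on a grid along the physical axis (for example, distance), read out by linear interpolation and penalized for large jumps between neighbouring grid points. The sketch below follows that picture. It is an assumption made for illustration, not the paper's formulation: the abstract does not specify the embedding or the regularizer, and linear interpolation is continuous but only piecewise differentiable. All names are hypothetical.

```python
def embed(value, table, v_min, v_max):
    """Continuous embedding of a scalar by interpolating between grid vectors.

    table: list of equal-length vectors placed evenly on [v_min, v_max];
           in a real model these entries would be trainable parameters.
    """
    n = len(table)
    # Map the value to a fractional grid index, clamped to the grid range.
    t = (value - v_min) / (v_max - v_min) * (n - 1)
    t = max(0.0, min(float(n - 1), t))
    i = min(int(t), n - 2)  # left grid point
    w = t - i               # weight of the right grid point
    return [(1.0 - w) * a + w * b for a, b in zip(table[i], table[i + 1])]


def smoothness_penalty(table):
    """Sum of squared differences between embeddings of adjacent grid points."""
    return sum(
        (a - b) ** 2
        for row, nxt in zip(table, table[1:])
        for a, b in zip(row, nxt)
    )


# Three grid points on the distance range [0, 2] with 2-dimensional embeddings.
table = [[0.0, 0.0], [1.0, 2.0], [1.0, 2.0]]
print(embed(0.5, table, 0.0, 2.0))  # [0.5, 1.0]
print(smoothness_penalty(table))    # 5.0
```

During training, the penalty would be added to the task loss with a weighting coefficient, so that the table stays smooth along the physical dimension while still fitting the data.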


Automatic extraction of materials and properties from superconductors scientific literature

arXiv.org Artificial Intelligence

The automatic extraction of materials and related properties from the scientific literature is gaining attention in data-driven materials science (Materials Informatics). In this paper, we discuss Grobid-superconductors, our solution for automatically extracting superconductor material names and respective properties from text. Built as a Grobid module, it combines machine learning and heuristic approaches in a multi-step architecture that supports input data as raw text or PDF documents. Using Grobid-superconductors, we built SuperCon2, a database of 40324 materials and properties records from 37700 papers. The material (or sample) information is represented by name, chemical formula, and material class, and is characterized by shape, doping, substitution variables for components, and substrate as adjoined information. The properties include the superconducting critical temperature (Tc) and, when available, the applied pressure, together with the Tc measurement method.