 Materials


Machine learning for the prediction of safe and biologically active organophosphorus molecules

arXiv.org Artificial Intelligence

Drug discovery is a complex process with a large molecular space to be considered. By constraining the search space, fragment-based drug design can effectively sample the chemical space of interest. Here we propose a framework of Recurrent Neural Networks (RNN) with an attention model to sample the chemical space of organophosphorus molecules using the fragment-based approach. The framework is trained with a ZINC dataset that is screened for high druglikeness scores. The goal is to predict molecules with biological action modes similar to those of organophosphorus pesticides or chemical warfare agents, yet less toxic to humans. The generated molecules contain a starting fragment of PO2F but have a bulky hydrocarbon side chain limiting their binding effectiveness to the targeted protein.


A Statistically-Based Approach to Feedforward Neural Network Model Selection

arXiv.org Artificial Intelligence

Feedforward neural networks (FNNs) can be viewed as non-linear regression models, where covariates enter the model through a combination of weighted summations and non-linear functions. Although these models have some similarities to the models typically used in statistical modelling, the majority of neural network research has been conducted outside of the field of statistics. This has resulted in a lack of statistically-based methodology, and, in particular, there has been little emphasis on model parsimony. Determining the input layer structure is analogous to variable selection, while the structure for the hidden layer relates to model complexity. In practice, neural network model selection is often carried out by comparing models using out-of-sample performance. In contrast, the construction of an associated likelihood function opens the door to information-criteria-based variable and architecture selection. A novel model selection method, which performs both input- and hidden-node selection, is proposed using the Bayesian information criterion (BIC) for FNNs. The choice of BIC over out-of-sample performance as the model selection objective function leads to an increased probability of recovering the true model, while parsimoniously achieving favourable out-of-sample performance. Simulation studies are used to evaluate and justify the proposed method, and applications on real data are investigated.
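To make the idea of likelihood-based architecture selection concrete, here is a minimal sketch of how a BIC could be computed for candidate FNNs. It is not the authors' procedure: it assumes a single hidden layer, one output, and Gaussian errors with the variance estimated by maximum likelihood. The function names `fnn_param_count` and `gaussian_bic`, and the residual sums of squares in the usage lines, are hypothetical.

```python
import math


def fnn_param_count(n_inputs: int, n_hidden: int) -> int:
    """Weights and biases of a single-hidden-layer FNN with one output (assumed architecture)."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1)


def gaussian_bic(rss: float, n_obs: int, n_params: int) -> float:
    """BIC = k * ln(n) - 2 * ln(L_hat), assuming Gaussian errors.

    rss is the residual sum of squares of the fitted network; the error
    variance is set to its maximum-likelihood estimate rss / n_obs.
    """
    sigma2 = rss / n_obs
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi * sigma2) + 1)
    k = n_params + 1  # network parameters plus the error variance
    return k * math.log(n_obs) - 2 * log_lik


# Hypothetical comparison: the larger network fits slightly better (lower RSS),
# but pays a larger complexity penalty. The model with the lower BIC is preferred.
n_obs = 200
bic_small = gaussian_bic(rss=52.0, n_obs=n_obs, n_params=fnn_param_count(3, 2))
bic_large = gaussian_bic(rss=50.0, n_obs=n_obs, n_params=fnn_param_count(6, 8))
preferred = "small" if bic_small < bic_large else "large"
```

The penalty term `k * ln(n)` grows with both the number of inputs and the number of hidden nodes, which is why a single criterion can, in principle, drive both kinds of selection.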


The Future of Mining: How Technology is Driving Increased Productivity - Isrg KB

#artificialintelligence

Mining is an important sector in India, contributing to the country's economic growth. The Indian mining industry encompasses the exploration, extraction, and processing of metallic and non-metallic minerals. With increasing demand for minerals, the mining industry in India has seen significant growth in recent years. In this report, we will analyze the current state of mining technology in India and its impact on the mining industry. The mining industry in India has made significant advancements in recent years, with the introduction of modern technology.


Rapid Design of Top-Performing Metal-Organic Frameworks with Qualitative Representations of Building Blocks

arXiv.org Artificial Intelligence

Data-driven materials design often encounters challenges where systems require or possess qualitative (categorical) information. Metal-organic frameworks (MOFs) are an example of such material systems. The representation of MOFs through different building blocks makes it a challenge for designers to incorporate qualitative information into design optimization. Furthermore, the large number of potential building blocks leads to a combinatorial challenge, with millions of possible MOFs that could be explored through time-consuming physics-based approaches. In this work, we integrated Latent Variable Gaussian Process (LVGP) and Multi-Objective Batch-Bayesian Optimization (MOBBO) to identify top-performing MOFs adaptively, autonomously, and efficiently without any human intervention. Our approach provides three main advantages: (i) no specific physical descriptors are required and only building blocks that construct the MOFs are used in global optimization through qualitative representations, (ii) the method is application and property independent, and (iii) the latent variable approach provides an interpretable model of qualitative building blocks with physical justification. To demonstrate the effectiveness of our method, we considered a design space with more than 47,000 MOF candidates. By searching only ~1% of the design space, LVGP-MOBBO was able to identify all MOFs on the Pareto front and more than 97% of the 50 top-performing designs for the CO$_2$ working capacity and CO$_2$/N$_2$ selectivity properties. Finally, we compared our approach with the Random Forest algorithm and demonstrated its efficiency, interpretability, and robustness.
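For readers unfamiliar with the term, the Pareto front in a two-objective design problem is the set of candidates that no other candidate beats on both objectives at once. The sketch below illustrates only that definition, assuming both objectives are maximized; it is not the LVGP-MOBBO method, and the candidate values are made-up placeholders rather than data from the paper.

```python
def pareto_front(points):
    """Return indices of non-dominated points, assuming every objective is maximized.

    A point q dominates p if q is at least as good on every objective
    and strictly better on at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(qk >= pk for qk, pk in zip(q, p))
            and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (working capacity, selectivity) pairs for five candidates.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front_indices = pareto_front(candidates)  # -> [0, 1, 2]
```

This brute-force check is quadratic in the number of candidates, which is acceptable for a final filtering step but says nothing about how to choose which candidates to evaluate in the first place.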


Topology-Controlled Self-Assembly of Amphiphilic Block Copolymers

#artificialintelligence

Contemporary synthetic chemistry approaches can be used to yield a range of distinct polymer topologies with precise control. The topology of a polymer strongly influences its self-assembly into complex nanostructures; however, a clear mechanistic understanding of the relationship between polymer topology and self-assembly has not yet been developed. In this work, we use atomistic molecular dynamics simulations to provide a nanoscale picture of the self-assembly of three poly(ethylene oxide)-poly(methyl acrylate) block copolymers with different topologies into micelles. We find that the topology affects the ability of the micelle to form a compact hydrophobic core, which directly affects its stability. Also, we apply unsupervised machine learning techniques to show that the topology of a polymer affects its ability to take a conformation in response to the local environment within the micelles.


Spectroscopy and Chemometrics/Machine-Learning News Weekly #6, 2023 – NIR Calibration Model

#artificialintelligence

Get the Spectroscopy and Chemometrics News Weekly in real time on Twitter @CalibModel and follow us.
"Component Prediction of Antai Pills Based on One-Dimensional Convolutional Neural Network and Near-Infrared Spectroscopy"
"Moisture content monitoring in withering leaves during black tea processing based on electronic eye and near infrared spectroscopy"
"Hyperspectral technique combined with stacking and blending ensemble learning method for detection of cadmium content in oilseed rape leaves"
"Capacitance spectroscopy enables realtime monitoring of early cell death in mammalian cell culture"
"Detection of bruised loquats based on reflectance, absorbance and Kubelka-Munk spectra"
"Longitudinal alterations of pulmonary [… formula…] O2 on-kinetics during moderate-intensity exercise in competitive youth cyclists are related to alterations in the …"


Paraphrase Acquisition from Image Captions

arXiv.org Artificial Intelligence

We propose to use image captions from the Web as a previously underutilized resource for paraphrases (i.e., texts with the same "message") and to create and analyze a corresponding dataset. When an image is reused on the Web, an original caption is often assigned. We hypothesize that different captions for the same image naturally form a set of mutual paraphrases. To demonstrate the suitability of this idea, we analyze captions in the English Wikipedia, where editors frequently relabel the same image for different articles. The paper introduces the underlying mining technology and the resulting Wikipedia-IPC dataset, and compares known paraphrase corpora with our new resource with respect to their syntactic and semantic paraphrase similarity. In this context, we introduce characteristic maps along the two similarity dimensions to identify the style of paraphrases coming from different sources. An annotation study demonstrates the high reliability of the algorithmically determined characteristic maps.
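The core hypothesis, that different captions of the same image are candidate paraphrases of one another, can be sketched in a few lines. This is only an illustration of the grouping step, not the paper's mining pipeline; the function name `paraphrase_candidates`, the record layout, and the example captions are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def paraphrase_candidates(records):
    """Group captions by image and return every pair of distinct captions per image.

    records is an iterable of (image_id, caption) tuples (assumed layout).
    Exact duplicate captions are collapsed, since they carry no paraphrase signal.
    """
    by_image = defaultdict(set)
    for image_id, caption in records:
        by_image[image_id].add(caption.strip())

    pairs = []
    for image_id in sorted(by_image):
        for a, b in combinations(sorted(by_image[image_id]), 2):
            pairs.append((image_id, a, b))
    return pairs


# Made-up records: one image reused with two captions, one image used once.
records = [
    ("img_001.jpg", "A lighthouse on the coast at dusk"),
    ("img_001.jpg", "Coastal lighthouse photographed in the evening"),
    ("img_001.jpg", "A lighthouse on the coast at dusk"),
    ("img_002.jpg", "Portrait of a chemist in a laboratory"),
]
pairs = paraphrase_candidates(records)
```

In practice the resulting pairs would still need filtering, because two captions of the same image can describe different aspects of it.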


Design of Bistable Soft Deployable Structures via a Kirigami-inspired Planar Fabrication Approach

arXiv.org Artificial Intelligence

Fully soft bistable mechanisms have shown extensive applications ranging from soft robotics, wearable devices, and medical tools, to energy harvesting. However, the lack of design and fabrication methods that are easy and potentially scalable limits their further adoption into mainstream applications. Here a top-down planar approach is presented by introducing Kirigami-inspired engineering combined with a pre-stretching process. Using this method, Kirigami-Pre-stretched Substrate-Kirigami trilayered precursors are created in a planar manner; upon release, the strain mismatch between layers -- due to the pre-stretching of the substrate -- induces an out-of-plane buckling to achieve targeted three-dimensional (3D) bistable structures. By combining experimental characterization, analytical modeling, and finite element simulation, the effect of the pattern size of Kirigami layers and pre-stretching on the geometry and stability of the resulting 3D composites is explored. In addition, methods to realize soft bistable structures with arbitrary shapes and soft composites with multistable configurations are investigated, which could encourage further applications. Our method is demonstrated by using bistable soft Kirigami composites to construct two soft machines: (i) a bistable soft gripper that can gently grasp delicate objects with different shapes and sizes and (ii) a flytrap-inspired robot that can autonomously detect and capture objects.


Behind the glory: the dark sides of AI models that big tech will… – Towards AI

#artificialintelligence

Originally published on Towards AI. With ChatGPT blowing up the internet, we are at a critical juncture that demands that we again ask hard questions about the impact of AI models on society, a conversation that starts but never ends. In this article, I aim to bring attention to the importance of knowing that, even though large AI models are impressive, there are often unacknowledged costs behind them. It is like saying "data is the new oil" to describe its value, but this analogy often ignores the costs of the oil and mining industries. To understand what AI is made from, we need to leave Silicon Valley and go to the place where the stuff for the AI industry is made. The term "artificial intelligence" may evoke the ideas of algorithms and data, but it is powered by the rare earth minerals and resources that make up the computing components [1].


An Order-Invariant and Interpretable Hierarchical Dilated Convolution Neural Network for Chemical Fault Detection and Diagnosis

arXiv.org Artificial Intelligence

Fault detection and diagnosis is significant for reducing maintenance costs and improving health and safety in chemical processes. The convolution neural network (CNN) is a popular deep learning algorithm with many successful applications in chemical fault detection and diagnosis tasks. However, convolution layers in CNNs are very sensitive to the order of features, which can lead to instability in the processing of tabular data. An optimal order of features results in better performance of CNN models, but it is expensive to seek such an optimal order. In addition, because of the encapsulation mechanism of feature extraction, most CNN models are opaque and have poor interpretability, thus failing to identify root-cause features without human supervision. These difficulties inevitably limit the performance and credibility of CNN methods. In this paper, we propose an order-invariant and interpretable hierarchical dilated convolution neural network (HDLCNN), which is composed of feature clustering, dilated convolution and the Shapley additive explanations (SHAP) method. The novelty of HDLCNN lies in its capability of processing tabular data with features of arbitrary order without seeking the optimal order, owing to the ability of feature clustering to agglomerate correlated features and the large receptive field of dilated convolution. The proposed method then provides interpretability by using SHAP values to quantify feature contribution. The root-cause features can therefore be identified as the features with the highest contribution. Computational experiments are conducted on the Tennessee Eastman chemical process benchmark dataset. Compared with the other methods, the proposed HDLCNN-SHAP method achieves better performance on processing tabular data with features of arbitrary order, detecting faults, and identifying the root-cause features.
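The attribution step can be illustrated with exact Shapley values on a toy problem. The sketch below enumerates all feature coalitions, which is only feasible for a handful of features; the SHAP method referred to in the abstract relies on approximations of this quantity, and nothing here reproduces HDLCNN itself. The value function `toy_value` and its coefficients are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values by enumerating every coalition of the other features.

    value maps a frozenset of feature indices to a real-valued payoff,
    for example a model output when only those features are "present".
    """
    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(len(others) + 1):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical payoff: two main effects plus an interaction between features 0 and 1."""
    v = 0.0
    if 0 in coalition:
        v += 3.0
    if 1 in coalition:
        v += 1.0
    if 0 in coalition and 1 in coalition:
        v += 2.0
    return v


phi = shapley_values(toy_value, 3)
# The candidate root-cause feature is the one with the largest contribution.
root_cause = max(range(3), key=lambda i: phi[i])
```

The interaction term is split equally between features 0 and 1, feature 2 receives zero, and the contributions sum to the payoff of the full feature set, which is the additivity property that makes the values usable as a ranking.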