AITopics | Arany, Adam

Collaborating Authors

Arany, Adam

Temporal Distribution Shift in Real-World Pharmaceutical Data: Implications for Uncertainty Quantification in QSAR Models

Friesacher, Hannah Rosa, Svensson, Emma, Winiwarter, Susanne, Mervin, Lewis, Arany, Adam, Engkvist, Ola

arXiv.org Artificial Intelligence, Feb-6-2025

The estimation of uncertainties associated with predictions from quantitative structure-activity relationship (QSAR) models can accelerate the drug discovery process by identifying promising experiments and allowing an efficient allocation of resources. Several computational tools exist that estimate the predictive uncertainty in machine learning models. However, deviations from the i.i.d. setting have been shown to impair the performance of these uncertainty quantification methods. We use a real-world pharmaceutical dataset to address the pressing need for a comprehensive, large-scale evaluation of uncertainty estimation methods in the context of realistic distribution shifts over time. We investigate the performance of several uncertainty estimation methods, including ensemble-based and Bayesian approaches. Furthermore, we use this real-world setting to systematically assess the distribution shifts in label and descriptor space and their impact on the capability of the uncertainty estimation methods. Our study reveals significant shifts over time in both label and descriptor space and a clear connection between the magnitude of the shift and the nature of the assay. Moreover, we show that pronounced distribution shifts impair the performance of popular uncertainty estimation methods used in QSAR models. This work highlights the challenges of identifying uncertainty quantification methods that remain reliable under distribution shifts introduced by real-world data.
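As a toy illustration of the ensemble-based uncertainty estimates mentioned in the abstract, the sketch below fits a bootstrap ensemble of one-dimensional linear models and uses the spread of member predictions as an uncertainty proxy. It is not the study's pipeline: the data are synthetic, the out-of-range input only mimics a shift in descriptor space, and all function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_ensemble(xs, ys, n_members, rng):
    """Fit one linear model per bootstrap resample of the training data."""
    members = []
    n = len(xs)
    while len(members) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if min(bx) == max(bx):
            continue  # skip degenerate resamples with a single distinct x
        members.append(fit_line(bx, [ys[i] for i in idx]))
    return members


def predict_with_uncertainty(members, x):
    """Ensemble mean and standard deviation of member predictions at x."""
    preds = [a + b * x for a, b in members]
    return statistics.fmean(preds), statistics.stdev(preds)


rng = random.Random(0)
# Synthetic "early" training data: descriptor values 0..19 (an assumption).
train_x = [float(i) for i in range(20)]
train_y = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in train_x]
ensemble = bootstrap_ensemble(train_x, train_y, n_members=200, rng=rng)

# A "late" input far outside the training range mimics descriptor shift.
mean_in, std_in = predict_with_uncertainty(ensemble, 10.0)
mean_out, std_out = predict_with_uncertainty(ensemble, 100.0)
print(f"in-range: {mean_in:.2f} +/- {std_in:.2f}")
print(f"shifted:  {mean_out:.2f} +/- {std_out:.2f}")
```

In this toy setup the ensemble spread grows for the shifted input. The abstract reports that such estimates can degrade under real temporal shifts, so the sketch shows only the mechanics of the estimate, not its reliability.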

artificial intelligence, assay, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.03982

Country:

Europe (1.00)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
(2 more...)

Weakly Supervised Knowledge Transfer with Probabilistic Logical Reasoning for Object Detection

Oldenhof, Martijn, Arany, Adam, Moreau, Yves, De Brouwer, Edward

arXiv.org Machine Learning, Mar-9-2023

Training object detection models usually requires instance-level annotations, such as the positions and labels of all objects present in each image. Such supervision is unfortunately not always available and, more often, only image-level information is provided, also known as weak supervision. Recent works have addressed this limitation by leveraging knowledge from a richly annotated domain. However, the scope of weak supervision supported by these approaches has been very restrictive, preventing them from using all available information. In this work, we propose ProbKT, a framework based on probabilistic logical reasoning that allows object detection models to be trained with arbitrary types of weak supervision. We empirically show on different datasets that using all available information is beneficial, as ProbKT leads to significant improvement on the target domain and better generalization compared to existing baselines. We also showcase the ability of our approach to handle complex logic statements as a supervision signal.

artificial intelligence, machine learning, survey article, (19 more...)

arXiv.org Machine Learning

2303.05148

Country: Europe > Belgium > Flanders (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

SparseChem: Fast and accurate machine learning model for small molecules

Arany, Adam, Simm, Jaak, Oldenhof, Martijn, Moreau, Yves

arXiv.org Machine Learning, Mar-9-2022

SparseChem provides fast and accurate machine learning models for biochemical applications. In particular, the package supports very high-dimensional sparse inputs, e.g., millions of features and millions of compounds. It is possible to train classification, regression, and censored regression models, or a combination of them, from the command line. Additionally, the library can be accessed directly from Python. Source code and documentation are freely available under the MIT License on GitHub.

artificial intelligence, machine learning, sparsechem, (15 more...)

arXiv.org Machine Learning

2203.04676

Country: Europe > Belgium > Flanders (0.16)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Law > Civil Rights & Constitutional Law (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

Multilevel Gibbs Sampling for Bayesian Regression

Tavernier, Joris, Simm, Jaak, Arany, Adam, Meerbergen, Karl, Moreau, Yves

arXiv.org Machine Learning, Sep-25-2020

Bayesian regression remains a simple but effective tool based on Bayesian inference techniques. For large-scale applications with complicated posterior distributions, Markov Chain Monte Carlo methods are applied. To reduce the well-known computational burden of the Markov Chain Monte Carlo approach for Bayesian regression, we developed a multilevel Gibbs sampler for Bayesian regression of linear mixed models. The level hierarchy of data matrices is created by clustering the features and/or samples of data matrices. Additionally, the use of correlated samples is investigated for variance reduction to improve the convergence of the Markov Chain. In tests on a diverse set of data sets, a speed-up is achieved for almost all of them without significant loss in predictive performance.

artificial intelligence, bayesian inference, health & medicine, (17 more...)

arXiv.org Machine Learning

2009.12132

Country:

North America > United States > New York (0.14)
Europe > Belgium > Flanders (0.14)
North America > United States > Colorado (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

ChemGrapher: Optical Graph Recognition of Chemical Compounds by Deep Learning

Oldenhof, Martijn, Arany, Adam, Moreau, Yves, Simm, Jaak

arXiv.org Machine Learning, Feb-23-2020

In drug discovery, knowledge of the graph structure of chemical compounds is essential. Many thousands of scientific articles in chemistry and pharmaceutical sciences have investigated chemical compounds, but in some cases the details of the structure of these chemical compounds are published only as images. A tool to analyze these images automatically and convert them into a chemical graph structure would be useful for many applications, such as drug discovery. A few such tools are available and they are mostly derived from optical character recognition. However, our evaluation of the performance of those tools reveals that they often make mistakes in detecting the correct bond multiplicity and stereochemical information. In addition, errors sometimes even lead to missing atoms in the resulting graph. In our work, we address these issues by developing a compound recognition method based on machine learning. More specifically, we develop a deep neural network model for optical compound recognition. The deep learning solution presented here consists of a segmentation model, followed by three classification models that predict atom locations, bonds, and charges. Furthermore, this model not only predicts the graph structure of the molecule but also produces all information necessary to relate each component of the resulting graph to the source image. This solution is scalable and could rapidly process thousands of images. Finally, we empirically compare the proposed method to a well-established tool and observe significant error reductions.

classification network, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

2002.09914

Country: Europe > Belgium (0.29)

Genre: Research Report (0.83)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Graph Informer Networks for Molecules

Simm, Jaak, Arany, Adam, De Brouwer, Edward, Moreau, Yves

arXiv.org Machine Learning, Jul-25-2019

In machine learning, chemical molecules are often represented by sparse high-dimensional vectorial fingerprints. However, a more natural mathematical object for molecule representation is a graph, which is much more challenging to handle from a machine learning perspective. In recent years, several deep learning architectures have been proposed to directly learn from the graph structure of chemical molecules, including graph convolution (Duvenaud et al., 2015) and graph gating networks (Li et al., 2015). Here, we introduce Graph Informer, a route-based multi-head attention mechanism inspired by transformer networks (Vaswani et al., 2017), which incorporates features for node pairs. We show empirically that the proposed method gives significant improvements over existing approaches in prediction tasks for 13C nuclear magnetic resonance spectra and for drug bioactivity. These results indicate that our method is well suited for both node-level and graph-level prediction tasks.

deep learning, graph, neural network, (19 more...)

arXiv.org Machine Learning

1907.11318

Country: Europe > Belgium > Flanders (0.15)

Genre: Research Report (0.83)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GRU-ODE-Bayes: Continuous modeling of sporadically-observed time series

De Brouwer, Edward, Simm, Jaak, Arany, Adam, Moreau, Yves

arXiv.org Machine Learning, May-29-2019

Modeling real-world multidimensional time series can be particularly challenging when these are sporadically observed (i.e., sampling is irregular both in time and across dimensions), such as in the case of clinical patient data. To address these challenges, we propose (1) a continuous-time version of the Gated Recurrent Unit, building upon the recent Neural Ordinary Differential Equations (Chen et al., 2018), and (2) a Bayesian update network that processes the sporadic observations. We bring these two ideas together in our GRU-ODE-Bayes method. We then demonstrate that the proposed method encodes a continuity prior for the latent process and that it can exactly represent the Fokker-Planck dynamics of complex processes driven by a multidimensional stochastic differential equation. Additionally, empirical evaluation shows that our method outperforms the state of the art on both synthetic data and real-world data with applications in healthcare and climate forecasting. Moreover, the continuity prior is shown to be well suited to settings with a low number of samples.

deep learning, immunology, time series, (20 more...)

arXiv.org Machine Learning

1905.12374

Country: Europe > Belgium > Flanders (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials > Chemicals (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

Deep Ensemble Tensor Factorization for Longitudinal Patient Trajectories Classification

De Brouwer, Edward, Simm, Jaak, Arany, Adam, Moreau, Yves

arXiv.org Machine Learning, Nov-28-2018

We present a generative approach to classify sparsely observed longitudinal patient trajectories. The available time series are represented as tensors and factorized using generative deep recurrent neural networks. The learned factors represent the patient data in a compact way and can then be used in a downstream classification task. For more robustness and accuracy in the predictions, we used an ensemble of those deep generative models to mimic Bayesian posterior sampling. We illustrate the performance of our architecture on an intensive-care case study of in-hospital mortality prediction with 96 longitudinal measurement types measured over the first 48 hours from admission. Our combination of generative and ensemble strategies achieves an AUC of over 0.85 and outperforms the SAPS-II mortality score and GRU baselines.

classification, deep learning, immunology, (22 more...)

arXiv.org Machine Learning

1811.10501

Country:

Europe > Belgium (0.17)
North America > United States (0.14)

Genre: Research Report (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials > Chemicals (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.68)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Macau: Scalable Bayesian Multi-relational Factorization with Side Information using MCMC

Simm, Jaak, Arany, Adam, Zakeri, Pooya, Haber, Tom, Wegner, Jörg K., Chupakhin, Vladimir, Ceulemans, Hugo, Moreau, Yves

arXiv.org Machine Learning, Dec-17-2015

We propose Macau, a powerful and flexible Bayesian factorization method for heterogeneous data. Our model can factorize any set of entities and relations that can be represented by a relational model, including tensors and also multiple relations for each entity. Macau can also incorporate side information, specifically entity and relation features, which are crucial for predicting sparsely observed relations. Macau scales to millions of entity instances, hundreds of millions of observations, and sparse entity features with millions of dimensions. To achieve this scale-up, we specially designed a sampling procedure for entity and relation features that relies primarily on noise injection in linear regressions. We show the performance and advanced features of Macau in a set of experiments, including a challenging drug-protein activity prediction task.
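The abstract above mentions sampling by "noise injection in linear regressions". A minimal sketch of that general idea, under assumptions not taken from the listing: for a Gaussian linear model with noise precision `alpha` and prior precision `lam`, a posterior sample of the weights can be drawn by perturbing the regression targets and the prior with Gaussian noise, then solving one regularized least-squares system. The tiny dense data, the pure-Python conjugate gradient solver, and all function names here are hypothetical; this is not Macau's implementation.

```python
import math
import random


def apply_precision(X, v, alpha, lam):
    """Return (alpha * X^T X + lam * I) v without forming X^T X."""
    Xv = [sum(x * vj for x, vj in zip(row, v)) for row in X]
    out = [lam * vj for vj in v]
    for row, r in zip(X, Xv):
        for j, x in enumerate(row):
            out[j] += alpha * x * r
    return out


def conjugate_gradient(apply_A, b, max_iter=100, tol=1e-20):
    """Solve A x = b for a symmetric positive definite A given as a function."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs <= tol:  # squared residual norm is small enough
            break
        Ap = apply_A(p)
        step = rs / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def sample_weights(X, y, alpha, lam, rng):
    """Draw one posterior sample of w by injecting noise into the regression.

    Solves (alpha X^T X + lam I) w = X^T (alpha y + sqrt(alpha) e1) + sqrt(lam) e2
    with e1, e2 standard normal, so w has the posterior mean and covariance.
    """
    n, d = len(X), len(X[0])
    noisy_y = [alpha * yi + math.sqrt(alpha) * rng.gauss(0.0, 1.0) for yi in y]
    rhs = [
        sum(X[i][j] * noisy_y[i] for i in range(n))
        + math.sqrt(lam) * rng.gauss(0.0, 1.0)
        for j in range(d)
    ]
    return conjugate_gradient(lambda v: apply_precision(X, v, alpha, lam), rhs)


# Toy data (assumption): three observations, two features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
rng = random.Random(0)
samples = [sample_weights(X, y, 1.0, 1.0, rng) for _ in range(5000)]
estimate = [sum(s[j] for s in samples) / len(samples) for j in range(2)]
print("Monte Carlo estimate of the posterior mean:", estimate)
```

For this toy problem the exact posterior mean is (0.875, 1.375), which the Monte Carlo average should approach. The appeal of the approach is that each sample needs only matrix-vector products with the feature matrix, never an explicit covariance matrix or its factorization.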

bayesian inference, health & medicine, relation, (20 more...)

arXiv.org Machine Learning

1509.0461

Country:

Asia > Macao (1.00)
North America > United States > Oregon (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Highly Scalable Tensor Factorization for Prediction of Drug-Protein Interaction Type

Arany, Adam, Simm, Jaak, Zakeri, Pooya, Haber, Tom, Wegner, Jörg K., Chupakhin, Vladimir, Ceulemans, Hugo, Moreau, Yves

arXiv.org Machine Learning, Dec-1-2015

Understanding the type of inhibitory interaction plays an important role in drug design. Therefore, researchers are interested in knowing whether a drug has a competitive or non-competitive interaction with particular protein targets. Method: to analyze the interaction types, we propose the factorization method Macau, which allows us to combine different measurement types into a single tensor together with proteins and compounds. The compounds are characterized by high-dimensional 2D ECFP fingerprints. The novelty of the proposed method is that, using a specially designed noise injection MCMC sampler, it can incorporate high-dimensional side information, i.e., millions of unique 2D ECFP compound features, even for large-scale datasets of millions of compounds. Without the side information, in this case, the tensor factorization would be practically futile. Results: using public IC50 and Ki data from ChEMBL, we trained a model from which we can identify the latent subspace separating the two measurement types (IC50 and Ki). The results suggest the proposed method can detect the competitive inhibitory activity between compounds and proteins.

artificial intelligence, health & medicine, machine learning, (15 more...)

arXiv.org Machine Learning

1512.00315

Country:

Asia > Macao (0.27)
Europe (0.15)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
