AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Achieving Well-Informed Decision-Making in Drug Discovery: A Comprehensive Calibration Study using Neural Network-Based Structure-Activity Models

Friesacher, Hannah Rosa, Engkvist, Ola, Mervin, Lewis, Moreau, Yves, Arany, Adam

arXiv.org Machine LearningJul-19-2024

In the drug discovery process, where experiments can be costly and time-consuming, computational models that predict drug-target interactions are valuable tools to accelerate the development of new therapeutic agents. Estimating the uncertainty inherent in these neural network predictions provides valuable information that facilitates optimal decision-making when risk assessment is crucial. However, such models can be poorly calibrated, which results in unreliable uncertainty estimates that do not reflect the true predictive uncertainty. In this study, we compare different metrics, including accuracy and calibration scores, used for model hyperparameter tuning to investigate which model selection strategy achieves well-calibrated models. Furthermore, we propose to use a computationally efficient Bayesian uncertainty estimation method named Bayesian Linear Probing (BLP), which generates Hamiltonian Monte Carlo (HMC) trajectories to obtain samples for the parameters of a Bayesian Logistic Regression fitted to the hidden layer of the baseline neural network. We report that BLP improves model calibration and achieves the performance of common uncertainty quantification methods by combining the benefits of uncertainty estimation and probability calibration methods. Finally, we show that combining post hoc calibration method with well-performing uncertainty quantification approaches can boost model accuracy and calibration.

calibration, calibration method, prediction, (16 more...)

arXiv.org Machine Learning

2407.14185

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback

NODER: Image Sequence Regression Based on Neural Ordinary Differential Equations

Bai, Hao, Hong, Yi

arXiv.org Artificial IntelligenceJul-18-2024

Regression on medical image sequences can capture temporal image pattern changes and predict images at missing or future time points. However, existing geodesic regression methods limit their regression performance by a strong underlying assumption of linear dynamics, while diffusion-based methods have high computational costs and lack constraints to preserve image topology. In this paper, we propose an optimization-based new framework called NODER, which leverages neural ordinary differential equations to capture complex underlying dynamics and reduces its high computational cost of handling high-dimensional image volumes by introducing the latent space. We compare our NODER with two recent regression methods, and the experimental results on ADNI and ACDC datasets demonstrate that our method achieves the state-of-the-art performance in 3D image regression. Our model needs only a couple of images in a sequence for prediction, which is practical, especially for clinical situations where extremely limited image time series are available for analysis.

image sequence, regression, sequence, (13 more...)

arXiv.org Artificial Intelligence

2407.13241

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Distributionally and Adversarially Robust Logistic Regression via Intersecting Wasserstein Balls

Selvi, Aras, Kreacic, Eleonora, Ghassemi, Mohsen, Potluru, Vamsi, Balch, Tucker, Veloso, Manuela

arXiv.org Artificial IntelligenceJul-18-2024

Empirical risk minimization often fails to provide robustness against adversarial attacks in test data, causing poor out-of-sample performance. Adversarially robust optimization (ARO) has thus emerged as the de facto standard for obtaining models that hedge against such attacks. However, while these models are robust against adversarial attacks, they tend to suffer severely from overfitting. To address this issue for logistic regression, we study the Wasserstein distributionally robust (DR) counterpart of ARO and show that this problem admits a tractable reformulation. Furthermore, we develop a framework to reduce the conservatism of this problem by utilizing an auxiliary dataset (e.g., synthetic, external, or out-of-domain data), whenever available, with instances independently sampled from a nonidentical but related ground truth. In particular, we intersect the ambiguity set of the DR problem with another Wasserstein ambiguity set that is built using the auxiliary dataset. We analyze the properties of the underlying optimization problem, develop efficient solution algorithms, and demonstrate that the proposed method consistently outperforms benchmark approaches on real-world datasets.

auxiliary data, dataset, inter-aro, (16 more...)

arXiv.org Artificial Intelligence

2407.13625

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.70)

Industry:

Information Technology > Security & Privacy (0.86)
Education (0.67)
Government > Military (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Add feedback

From 2015 to 2023: How Machine Learning Aids Natural Product Analysis

Shi, Suwen, Huang, Ziwei, Gu, Xingxin, Lin, Xu, Zhong, Chaoying, Hang, Junjie, Lin, Jianli, Zhong, Claire Chenwen, Zhang, Lin, Li, Yu, Huang, Junjie

arXiv.org Artificial IntelligenceJul-17-2024

In recent years, conventional chemistry techniques have faced significant challenges due to their inherent limitations, struggling to cope with the increasing complexity and volume of data generated in contemporary research endeavors. Computational methodologies represent robust tools in the field of chemistry, offering the capacity to harness potent machine-learning models to yield insightful analytical outcomes. This review delves into the spectrum of computational strategies available for natural product analysis and constructs a research framework for investigating both qualitative and quantitative chemistry problems. Our objective is to present a novel perspective on the symbiosis of machine learning and chemistry, with the potential to catalyze a transformation in the field of natural product analysis.

natural product analysis, prediction, spectroscopy, (12 more...)

arXiv.org Artificial Intelligence

2408.00793

Country:

Asia > China > Hong Kong (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > India (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.64)
Health & Medicine > Therapeutic Area > Immunology (0.50)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

A Survey of AI-Powered Mini-Grid Solutions for a Sustainable Future in Rural Communities

Pirie, Craig, Kalutarage, Harsha, Hajar, Muhammad Shadi, Wiratunga, Nirmalie, Charles, Subodha, Madhushan, Geeth Sandaru, Buddhika, Priyantha, Wijesiriwardana, Supun, Dimantha, Akila, Hansamal, Kithdara, Pathiranage, Shalitha

arXiv.org Artificial IntelligenceJul-17-2024

This paper presents a comprehensive survey of AI-driven mini-grid solutions aimed at enhancing sustainable energy access. It emphasises the potential of mini-grids, which can operate independently or in conjunction with national power grids, to provide reliable and affordable electricity to remote communities. Given the inherent unpredictability of renewable energy sources such as solar and wind, the necessity for accurate energy forecasting and management is discussed, highlighting the role of advanced AI techniques in forecasting energy supply and demand, optimising grid operations, and ensuring sustainable energy distribution. This paper reviews various forecasting models, including statistical methods, machine learning algorithms, and hybrid approaches, evaluating their effectiveness for both short-term and long-term predictions. Additionally, it explores public datasets and tools such as Prophet, NeuralProphet, and N-BEATS for model implementation and validation. The survey concludes with recommendations for future research, addressing challenges in model adaptation and optimisation for real-world applications.

forecast, forecasting, neural network, (15 more...)

arXiv.org Artificial Intelligence

2407.15865

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Oceania > Australia > Western Australia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(16 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Energy > Renewable > Wind (1.00)
Energy > Renewable > Solar (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

A survey and taxonomy of methods interpreting random forest models

Haddouchi, Maissae, Berrado, Abdelaziz

arXiv.org Artificial IntelligenceJul-17-2024

The interpretability of random forest (RF) models is a research topic of growing interest in the machine learning (ML) community. In the state of the art, RF is considered a powerful learning ensemble given its predictive performance, flexibility, and ease of use. Furthermore, the inner process of the RF model is understandable because it uses an intuitive and intelligible approach for building the RF decision tree ensemble. However, the RF resulting model is regarded as a "black box" because of its numerous deep decision trees. Gaining visibility over the entire process that induces the final decisions by exploring each decision tree is complicated, if not impossible. This complexity limits the acceptance and implementation of RF models in several fields of application. Several papers have tackled the interpretation of RF models. This paper aims to provide an extensive review of methods used in the literature to interpret RF resulting models. We have analyzed these methods and classified them based on different axes. Although this review is not exhaustive, it provides a taxonomy of various techniques that should guide users in choosing the most appropriate tools for interpreting RF models, depending on the interpretability aspects sought. It should also be valuable for researchers who aim to focus their work on the interpretability of RF or ML black boxes in general.

explanation, rf model, survey and taxonomy, (13 more...)

arXiv.org Artificial Intelligence

2407.12759

Country:

Europe > Switzerland (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
Africa > Middle East > Morocco > Rabat-Salé-Kénitra Region > Rabat (0.04)
(7 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)
Research Report > Experimental Study (0.92)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation

Lin, Chengzhi, Liu, Shuchang, Wang, Chuyuan, Liu, Yongqi

arXiv.org Artificial IntelligenceJul-16-2024

Within the domain of short video recommendation, predicting users' watch time is a critical but challenging task. Prevailing deterministic solutions obtain accurate debiased statistical models, yet they neglect the intrinsic uncertainty inherent in user environments. In our observation, we found that this uncertainty could potentially limit these methods' accuracy in watch-time prediction on our online platform, despite that we have employed numerous features and complex network architectures. Consequently, we believe that a better solution is to model the conditional distribution of this uncertain watch time. In this paper, we introduce a novel estimation technique -- Conditional Quantile Estimation (CQE), which utilizes quantile regression to capture the nuanced distribution of watch time. The learned distribution accounts for the stochastic nature of users, thereby it provides a more accurate and robust estimation. In addition, we also design several strategies to enhance the quantile prediction including conditional expectation, conservative estimation, and dynamic quantile combination. We verify the effectiveness of our method through extensive offline evaluations using public datasets as well as deployment in a real-world video application with over 300 million daily active users.

conditional quantile estimation, quantile, watch time, (11 more...)

arXiv.org Artificial Intelligence

2407.12223

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Beijing > Beijing (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(12 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.95)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Information Management (0.93)
(2 more...)

Add feedback

Detection of Malaria Vector Breeding Habitats using Topographic Models

Jadhav, Aishwarya

arXiv.org Artificial IntelligenceJul-16-2024

Treatment of stagnant water bodies that act as a breeding site for malarial vectors is a fundamental step in most malaria elimination campaigns. However, identification of such water bodies over large areas is expensive, labour-intensive and time-consuming and hence, challenging in countries with limited resources. Practical models that can efficiently locate water bodies can target the limited resources by greatly reducing the area that needs to be scanned by the field workers. To this end, we propose a practical topographic model based on easily available, global, high-resolution DEM data to predict locations of potential vector-breeding water sites. We surveyed the Obuasi region of Ghana to assess the impact of various topographic features on different types of water bodies and uncover the features that significantly influence the formation of aquatic habitats. We further evaluate the effectiveness of multiple models. Our best model significantly outperforms earlier attempts that employ topographic variables for detection of small water sites, even the ones that utilize additional satellite imagery data and demonstrates robustness across different settings.

dataset, topographic feature, water body, (13 more...)

arXiv.org Artificial Intelligence

2011.13714

Country:

North America > United States (0.29)
Africa > Ghana (0.25)
Africa > Kenya > Western Province (0.05)
(3 more...)

Genre: Research Report > Experimental Study (0.48)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases > Vector-Borne Disease (0.65)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Are Linear Regression Models White Box and Interpretable?

Salih, Ahmed M, Wang, Yuhe

arXiv.org Machine LearningJul-16-2024

Explainable artificial intelligence (XAI) is a set of tools and algorithms that applied or embedded to machine learning models to understand and interpret the models. They are recommended especially for complex or advanced models including deep neural network because they are not interpretable from human point of view. On the other hand, simple models including linear regression are easy to implement, has less computational complexity and easy to visualize the output. The common notion in the literature that simple models including linear regression are considered as "white box" because they are more interpretable and easier to understand. This is based on the idea that linear regression models have several favorable outcomes including the effect of the features in the model and whether they affect positively or negatively toward model output. Moreover, uncertainty of the model can be measured or estimated using the confidence interval. However, we argue that this perception is not accurate and linear regression models are not easy to interpret neither easy to understand considering common XAI metrics and possible challenges might face. This includes linearity, local explanation, multicollinearity, covariates, normalization, uncertainty, features contribution and fairness. Consequently, we recommend the so-called simple models should be treated equally to complex models when it comes to explainability and interpretability.

linear regression model white box, model white box and interpretable

arXiv.org Machine Learning

2407.12177

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Bayesian Causal Forests for Longitudinal Data: Assessing the Impact of Part-Time Work on Growth in High School Mathematics Achievement

McJames, Nathan, O'Shea, Ann, Parnell, Andrew

arXiv.org Machine LearningJul-16-2024

Modelling growth in student achievement is a significant challenge in the field of education. Understanding how interventions or experiences such as part-time work can influence this growth is also important. Traditional methods like difference-in-differences are effective for estimating causal effects from longitudinal data. Meanwhile, Bayesian non-parametric methods have recently become popular for estimating causal effects from single time point observational studies. However, there remains a scarcity of methods capable of combining the strengths of these two approaches to flexibly estimate heterogeneous causal effects from longitudinal data. Motivated by two waves of data from the High School Longitudinal Study, the NCES' most recent longitudinal study which tracks a representative sample of over 20,000 students in the US, our study introduces a longitudinal extension of Bayesian Causal Forests. This model allows for the flexible identification of both individual growth in mathematical ability and the effects of participation in part-time work. Simulation studies demonstrate the predictive performance and reliable uncertainty quantification of the proposed model. Results reveal the negative impact of part time work for most students, but hint at potential benefits for those students with an initially low sense of school belonging. Clear signs of a widening achievement gap between students with high and low academic achievement are also identified. Potential policy implications are discussed, along with promising areas for future research.

part-time work, student, student achievement, (17 more...)

arXiv.org Machine Learning

2407.11927

Country:

Europe > Ireland (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > Canada > Ontario (0.04)

Genre:

Research Report > Strength Medium (1.00)
Research Report > New Finding (1.00)
Research Report > Observational Study (0.74)

Industry: Education > Educational Setting > K-12 Education > Secondary School (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback