AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Rethinking recidivism through a causal lens

Shirvaikar, Vik, Lakshminarayan, Choudur

arXiv.org Artificial IntelligenceMay-8-2024

Predictive modeling of criminal recidivism, or whether people will re-offend in the future, has a long and contentious history. Modern causal inference methods allow us to move beyond prediction and target the "treatment effect" of a specific intervention on an outcome in an observational dataset. In this paper, we look specifically at the effect of incarceration (prison time) on recidivism, using a well-known dataset from North Carolina. Two popular causal methods for addressing confounding bias are explained and demonstrated: directed acyclic graph (DAG) adjustment and double machine learning (DML), including a sensitivity analysis for unobserved confounders. We find that incarceration has a detrimental effect on recidivism, i.e., longer prison sentences make it more likely that individuals will re-offend after release, although this conclusion should not be generalized beyond the scope of our data. We hope that this case study can inform future applications of causal inference to criminal justice analysis.

causal effect, probability, recidivism, (14 more...)

arXiv.org Artificial Intelligence

2011.11483

Country:

North America > United States > North Carolina (0.25)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Corrections (1.00)
Health & Medicine (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
(2 more...)

Add feedback

Enhancing Social Media Post Popularity Prediction with Visual Content

Jeong, Dahyun, Son, Hyelim, Choi, Yunjin, Kim, Keunwoo

arXiv.org Artificial IntelligenceMay-8-2024

Our study presents a framework for predicting image-based social media content popularity that focuses on addressing complex image information and a hierarchical data structure. We utilize the Google Cloud Vision API to effectively extract key image and color information from users' postings, achieving 6.8% higher accuracy compared to using non-image covariates alone. For prediction, we explore a wide range of prediction models, including Linear Mixed Model, Support Vector Regression, Multi-layer Perceptron, Random Forest, and XGBoost, with linear regression as the benchmark. Our comparative study demonstrates that models that are capable of capturing the underlying nonlinear interactions between covariates outperform other methods.

covariate, information, popularity prediction, (11 more...)

arXiv.org Artificial Intelligence

2405.02367

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Services (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Add feedback

Uncertainty quantification in metric spaces

Lugosi, Gábor, Matabuena, Marcos

arXiv.org Machine LearningMay-8-2024

This paper introduces a novel uncertainty quantification framework for regression models where the response takes values in a separable metric space, and the predictors are in a Euclidean space. The proposed algorithms can efficiently handle large datasets and are agnostic to the predictive base model used. Furthermore, the algorithms possess asymptotic consistency guarantees and, in some special homoscedastic cases, we provide non-asymptotic guarantees. To illustrate the effectiveness of the proposed uncertainty quantification framework, we use a linear regression model for metric responses (known as the global Fr\'echet model) in various clinical applications related to precision and digital medicine. The different clinical outcomes analyzed are represented as complex statistical objects, including multivariate Euclidean data, Laplacian graphs, and probability distributions.

algorithm, application, regression model, (12 more...)

arXiv.org Machine Learning

2405.0511

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Cross-IQA: Unsupervised Learning for Image Quality Assessment

Zhang, Zhen

arXiv.org Artificial IntelligenceMay-7-2024

Automatic perception of image quality is a challenging problem that impacts billions of Internet and social media users daily. To advance research in this field, we propose a no-reference image quality assessment (NR-IQA) method termed Cross-IQA based on vision transformer(ViT) model. The proposed Cross-IQA method can learn image quality features from unlabeled image data. We construct the pretext task of synthesized image reconstruction to unsupervised extract the image quality information based ViT block. The pretrained encoder of Cross-IQA is used to fine-tune a linear regression model for score prediction. Experimental results show that Cross-IQA can achieve state-of-the-art performance in assessing the low-frequency degradation information (e.g., color change, blurring, etc.) of images compared with the classical full-reference IQA and NR-IQA under the same datasets.

image quality assessment, information, quality assessment, (11 more...)

arXiv.org Artificial Intelligence

2405.04311

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.66)
Education (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Add feedback

Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank

Menestrel, Thomas Le, Craig, Erin, Tibshirani, Robert, Hastie, Trevor, Rivas, Manuel

arXiv.org Artificial IntelligenceMay-7-2024

Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals, underscoring a critical gap in genetic research. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data. We evaluate the performance of Group-LASSO INTERaction-NET (glinternet) and pretrained lasso in disease prediction focusing on diverse ancestries in the UK Biobank. Models were trained on data from White British and other ancestries and validated across a cohort of over 96,000 individuals for 8 diseases. Out of 96 models trained, we report 16 with statistically significant incremental predictive performance in terms of ROC-AUC scores (p-value < 0.05), found for diabetes, arthritis, gall stones, cystitis, asthma and osteoarthritis. For the interaction and pretrained models that outperformed the baseline, the PRS score was the primary driver behind prediction. Our findings indicate that both interaction terms and pre-training can enhance prediction accuracy but for a limited set of diseases and moderate improvements in accuracy.

ancestry, dataset, glinternet, (11 more...)

arXiv.org Artificial Intelligence

2404.17626

Country:

Europe > United Kingdom (0.49)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

Add feedback

A review on data-driven constitutive laws for solids

Fuhg, Jan Niklas, Padmanabha, Govinda Anantha, Bouklas, Nikolaos, Bahmani, Bahador, Sun, WaiChing, Vlassis, Nikolaos N., Flaschel, Moritz, Carrara, Pietro, De Lorenzis, Laura

arXiv.org Artificial IntelligenceMay-6-2024

This review article highlights state-of-the-art data-driven techniques to discover, encode, surrogate, or emulate constitutive laws that describe the path-independent and path-dependent response of solids. Our objective is to provide an organized taxonomy to a large spectrum of methodologies developed in the past decades and to discuss the benefits and drawbacks of the various techniques for interpreting and forecasting mechanics behavior across different scales. Distinguishing between machine-learning-based and model-free methods, we further categorize approaches based on their interpretability and on their learning process/type of required data, while discussing the key problems of generalization and trustworthiness. We attempt to provide a road map of how these can be reconciled in a data-availability-aware context. We also touch upon relevant aspects such as data sampling techniques, design of experiments, verification, and validation.

evolutionary algorithm, machine learning, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2405.03658

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Texas (0.27)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Education (1.00)
Government > Regional Government (0.92)
Materials > Metals & Mining (0.67)

Technology:

Information Technology > Software (1.00)
Information Technology > Modeling & Simulation (1.00)
Information Technology > Mathematics of Computing (1.00)
(10 more...)

Add feedback

Investigating Personalized Driving Behaviors in Dilemma Zones: Analysis and Prediction of Stop-or-Go Decisions

Qin, Ziye, Li, Siyan, Wu, Guoyuan, Barth, Matthew J., Abdelraouf, Amr, Gupta, Rohit, Han, Kyungtae

arXiv.org Artificial IntelligenceMay-6-2024

Dilemma zones at signalized intersections present a commonly occurring but unsolved challenge for both drivers and traffic operators. Onsets of the yellow lights prompt varied responses from different drivers: some may brake abruptly, compromising the ride comfort, while others may accelerate, increasing the risk of red-light violations and potential safety hazards. Such diversity in drivers' stop-or-go decisions may result from not only surrounding traffic conditions, but also personalized driving behaviors. To this end, identifying personalized driving behaviors and integrating them into advanced driver assistance systems (ADAS) to mitigate the dilemma zone problem presents an intriguing scientific question. In this study, we employ a game engine-based (i.e., CARLA-enabled) driving simulator to collect high-resolution vehicle trajectories, incoming traffic signal phase and timing information, and stop-or-go decisions from four subject drivers in various scenarios. This approach allows us to analyze personalized driving behaviors in dilemma zones and develop a Personalized Transformer Encoder to predict individual drivers' stop-or-go decisions. The results show that the Personalized Transformer Encoder improves the accuracy of predicting driver decision-making in the dilemma zone by 3.7% to 12.6% compared to the Generic Transformer Encoder, and by 16.8% to 21.6% over the binary logistic regression model.

artificial intelligence, dilemma zone, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2405.03873

Country:

North America > United States > Texas (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (0.90)
Transportation > Infrastructure & Services (0.72)

Technology:

Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Add feedback

Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning

Han, Sungwon, Yoon, Jinsung, Arik, Sercan O, Pfister, Tomas

arXiv.org Artificial IntelligenceMay-6-2024

Large Language Models (LLMs), with their remarkable ability to tackle challenging and unseen reasoning problems, hold immense potential for tabular learning, that is vital for many real-world applications. In this paper, we propose a novel in-context learning framework, FeatLLM, which employs LLMs as feature engineers to produce an input data set that is optimally suited for tabular predictions. The generated features are used to infer class likelihood with a simple downstream machine learning model, such as linear regression and yields high performance few-shot learning. The proposed FeatLLM framework only uses this simple predictive model with the discovered features at inference time. Compared to existing LLM-based approaches, FeatLLM eliminates the need to send queries to the LLM for each sample at inference time. Moreover, it merely requires API-level access to LLMs, and overcomes prompt size limitations. As demonstrated across numerous tabular datasets from a wide range of domains, FeatLLM generates high-quality rules, significantly (10% on average) outperforming alternatives such as TabLLM and STUNT.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.09491

Country:

Europe > Austria > Vienna (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Analyzing Emotional Trends from X platform using SenticNet: A Comparative Analysis with Cryptocurrency Price

Tash, Moein Shahiki, Ahani, Zahra, Kolesnikova, Olga, Sidorov, Grigori

arXiv.org Artificial IntelligenceMay-5-2024

This study delves into the relationship between emotional trends from X platform data and the market dynamics of well-known cryptocurrencies Cardano, Binance, Fantom, Matic, and Ripple over the period from October 2022 to March 2023. Leveraging SenticNet, we identified emotions like Fear and Anxiety, Rage and Anger, Grief and Sadness, Delight and Pleasantness, Enthusiasm and Eagerness, and Delight and Joy. Following data extraction, we segmented each month into bi-weekly intervals, replicating this process for price data obtained from Finance-Yahoo. Consequently, a comparative analysis was conducted, establishing connections between emotional trends observed across bi-weekly intervals and cryptocurrency prices, uncovering significant correlations between emotional sentiments and coin valuations.

correlation, cryptocurrency, enthusiasm and eagerness, (14 more...)

arXiv.org Artificial Intelligence

2405.03084

Country:

Asia > China (0.04)
South America (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.70)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

Yin, Yuxuan, Wang, Xiaoxiao, Chen, Rebecca, He, Chen, Li, Peng

arXiv.org Artificial IntelligenceMay-3-2024

Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertainties caused by different sources of variations. While some existing techniques offer region predictions, but they rely on certain distributional assumptions and/or provide no coverage guarantees. In response to these limitations, we propose a novel distribution-free $V_{min}$ interval estimation methodology possessing a theoretical guarantee of coverage. Our approach leverages conformalized quantile regression and on-chip monitors to generate reliable prediction intervals. We demonstrate the effectiveness of the proposed method on an industrial 5nm automotive chip dataset. Moreover, we show that the use of on-chip monitors can reduce the interval length significantly for $V_{min}$ prediction.

prediction, predictor, read point, (12 more...)

arXiv.org Artificial Intelligence

2406.18536

Country:

Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

Add feedback