AITopics

2411.11511

Country:

Europe > United Kingdom (0.14)
South America > Chile > Valparaíso Region > Valparaíso Province > Valparaíso (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Duran-Martin, Gerardo, Sánchez-Betancourt, Leandro, Shestopaloff, Alexander Y., Murphy, Kevin

BONE: a unifying framework for Bayesian online learning in non-stationary environments

arXiv.org Machine Learning, Nov-18-2024

We propose a unifying framework for methods that perform Bayesian online learning in non-stationary environments. We call the framework BONE, which stands for (B)ayesian (O)nline learning in (N)on-stationary (E)nvironments. BONE provides a common structure to tackle a variety of problems, including online continual learning, prequential forecasting, and contextual bandits. The framework requires specifying three modelling choices: (i) a model for measurements (e.g., a neural network), (ii) an auxiliary process to model non-stationarity (e.g., the time since the last changepoint), and (iii) a conditional prior over model parameters (e.g., a multivariate Gaussian). The framework also requires two algorithmic choices, which we use to carry out approximate inference under this framework: (i) an algorithm to estimate beliefs (posterior distribution) about the model parameters given the auxiliary variable, and (ii) an algorithm to estimate beliefs about the auxiliary variable. We show how this modularity allows us to write many different existing methods as instances of BONE; we also use this framework to propose a new method. We then experimentally compare existing methods with our proposed new method on several datasets; we provide insights into the situations that make one method more suitable than another for a given task.
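To make the abstract's modelling and algorithmic choices concrete, the sketch below runs a run-length filter in the style of Bayesian online changepoint detection, where the auxiliary variable is the time since the last changepoint. It is not the authors' code or their proposed method. The Gaussian measurement model with known variance, the constant hazard rate, and every name (`run_length_filter`, `hazard`, `mu0`, `var0`, `obs_var`) are assumptions made for brevity.

```python
import numpy as np

def run_length_filter(xs, hazard=0.05, mu0=0.0, var0=10.0, obs_var=1.0):
    """Return the MAP run length after each observation (illustrative sketch)."""
    # Log posterior over run lengths; before any data the run length is 0.
    log_post = np.array([0.0])
    # Conditional prior over the unknown mean, one entry per run length.
    mu = np.array([mu0])
    var = np.array([var0])
    map_run_lengths = []
    for x in xs:
        # Predictive density of x under each run-length hypothesis.
        pred_var = var + obs_var
        log_pred = -0.5 * (np.log(2.0 * np.pi * pred_var) + (x - mu) ** 2 / pred_var)
        # Either the current run grows by one, or a changepoint resets it to 0.
        log_growth = log_post + log_pred + np.log1p(-hazard)
        log_cp = np.logaddexp.reduce(log_post + log_pred + np.log(hazard))
        log_post = np.concatenate([[log_cp], log_growth])
        log_post -= np.logaddexp.reduce(log_post)  # normalise
        # Conjugate Gaussian update of the mean for every surviving run.
        post_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        post_mu = post_var * (mu / var + x / obs_var)
        # A new run (length 0) starts again from the prior.
        mu = np.concatenate([[mu0], post_mu])
        var = np.concatenate([[var0], post_var])
        map_run_lengths.append(int(np.argmax(log_post)))
    return map_run_lengths
```

In this sketch the conjugate update plays the role of the algorithm that estimates beliefs about the model parameters given the auxiliary variable, and the discrete recursion over run lengths plays the role of the algorithm that estimates beliefs about the auxiliary variable itself.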

changepoint, hypothesis, prediction, (17 more...)

2411.10153

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Tyagi, Kanishka, Rane, Chinmay, Vaidya, Ketaki, Challgundla, Jeshwanth, Auddy, Soumitro Swapan, Manry, Michael

Making Sigmoid-MSE Great Again: Output Reset Challenges Softmax Cross-Entropy in Neural Network Classification

arXiv.org Machine Learning, Nov-17-2024

This study presents a comparative analysis of two objective functions, Mean Squared Error (MSE) and Softmax Cross-Entropy (SCE), for neural network classification tasks. While SCE combined with softmax activation is the conventional choice for transforming network outputs into class probabilities, we explore an alternative approach using MSE with sigmoid activation. We introduce the Output Reset algorithm, which reduces inconsistent errors and enhances classifier robustness. Through extensive experiments on benchmark datasets (MNIST, CIFAR-10, and Fashion-MNIST), we demonstrate that MSE with sigmoid activation achieves comparable accuracy and convergence rates to SCE, while exhibiting superior performance in scenarios with noisy data. Our findings indicate that MSE, despite its traditional association with regression tasks, serves as a viable alternative for classification problems, challenging conventional wisdom about neural network training strategies.
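For readers who want to see the two objectives side by side, here is a minimal NumPy sketch of each loss computed from the same raw network outputs. It does not implement the Output Reset algorithm, which the abstract names but does not specify. The function names and the one-hot target encoding are assumptions.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of integer class labels."""
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def sigmoid_mse(logits, labels):
    """Mean squared error between per-class sigmoid outputs and one-hot targets."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    targets = np.eye(logits.shape[1])[labels]  # one-hot encoding (assumed)
    return ((probs - targets) ** 2).mean()
```

With softmax the class scores are coupled through the normalisation, whereas each sigmoid output is scored independently against its 0/1 target.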

artificial intelligence, classifier, machine learning, (19 more...)

2411.11213

Country:

North America > United States > Texas > Tarrant County > Arlington (0.05)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial Intelligence, Nov-15-2024

Emotion Detection in Reddit: Comparative Study of Machine Learning and Deep Learning Techniques

Alaeddini, Maliheh

Emotion detection is pivotal in human communication, as it significantly influences behavior, relationships, and decision-making processes. This study concentrates on text-based emotion detection by leveraging the GoEmotions dataset, which annotates Reddit comments with 27 distinct emotions. These emotions are subsequently mapped to Ekman's 6 basic categories: joy, anger, fear, sadness, disgust, and surprise. We employed a range of models for this task, including 6 machine learning models, 3 ensemble models, and a Long Short-Term Memory (LSTM) model, to determine the optimal model for emotion detection. Results indicate that the Stacking classifier outperforms the other models in accuracy and performance. Finally, the Stacking classifier is deployed via a Streamlit web application, underscoring its potential for real-world applications in text-based emotion analysis.

Keywords: Text Based Emotion Detection, Machine Learning, Ensemble Learning, Deep Learning, GoEmotions, EmoBERTa, Streamlit

Introduction

Emotions are complex, subjective experiences, often linked to psychological states such as mood, temperament, and personality. These experiences influence human behavior, impacting decision-making, reactions to stimuli, and interpersonal interactions. In the contemporary world, where mental health disorders such as stress, anxiety, and depression are increasingly prevalent, understanding emotions is more important than ever (Maruf et al., 2024).

artificial intelligence, emotion, machine learning, (15 more...)

2411.10328

Country:

Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Asia > Middle East > Jordan > Irbid Governorate > Irbid (0.04)
Asia > Macao (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Machine Learning, Nov-15-2024

Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data

Helli, Kai, Schnurr, David, Hollmann, Noah, Müller, Samuel, Hutter, Frank

While most ML models expect independent and identically distributed data, this assumption is often violated in real-world scenarios due to distribution shifts, resulting in the degradation of machine learning model performance. Until now, no tabular method has consistently outperformed classical supervised learning, which ignores these shifts. To address temporal distribution shifts, we present Drift-Resilient TabPFN, a fresh approach based on In-Context Learning with a Prior-Data Fitted Network that learns the learning algorithm itself: it accepts the entire training dataset as input and makes predictions on the test set in a single forward pass. Specifically, it learns to approximate Bayesian inference on synthetic datasets drawn from a prior that specifies the model's inductive bias. This prior is based on structural causal models (SCM), which gradually shift over time. To model shifts of these causal models, we use a secondary SCM that specifies changes in the primary model parameters. The resulting Drift-Resilient TabPFN can be applied to unseen data, runs in seconds on small to moderately sized datasets and needs no hyperparameter tuning. Comprehensive evaluations across 18 synthetic and real-world datasets demonstrate large performance improvements over a wide range of baselines, such as XGB, CatBoost, TabPFN, and applicable methods featured in the Wild-Time benchmark. Compared to the strongest baselines, it improves accuracy from 0.688 to 0.744 and ROC AUC from 0.786 to 0.832 while maintaining stronger calibration. This approach could serve as significant groundwork for further research on out-of-distribution prediction.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2411.10634

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Oceania > Australia > New South Wales (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Banking & Finance (0.92)
Transportation (0.92)
Health & Medicine > Therapeutic Area > Endocrinology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial Intelligence, Nov-15-2024

Model Inversion Attacks: A Survey of Approaches and Countermeasures

Zhou, Zhanke, Zhu, Jianing, Yu, Fengfei, Li, Xuan, Peng, Xiong, Liu, Tongliang, Han, Bo

The success of deep neural networks has driven numerous research studies and applications from Euclidean to non-Euclidean data. However, there are increasing concerns about privacy leakage, as these networks rely on processing private data. Recently, a new type of privacy attack, the model inversion attack (MIA), has emerged; it aims to extract sensitive features of private training data by abusing access to a well-trained model. The effectiveness of MIAs has been demonstrated in various domains, including images, texts, and graphs. These attacks highlight the vulnerability of neural networks and raise awareness about the risk of privacy leakage within the research community. Despite their significance, there is a lack of systematic studies that provide a comprehensive overview and deeper insights into MIAs across different domains. This survey aims to summarize up-to-date MIA methods in both attacks and defenses, highlighting their contributions and limitations, underlying modeling principles, optimization challenges, and future directions. We hope this survey bridges the gap in the literature and facilitates future research in this critical area. We also maintain a repository to keep track of relevant research at https://github.com/AndrewZhou924/Awesome-model-inversion-attack.

information, large language model, machine learning, (20 more...)

2411.10023

Country:

Asia > China > Hong Kong (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
South America > Brazil (0.04)

Genre: Overview > Growing Problem (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

arXiv.org Machine Learning, Nov-15-2024

Continuous Bayesian Model Selection for Multivariate Causal Discovery

Dhir, Anish, Sedgwick, Ruby, Kori, Avinash, Glocker, Ben, van der Wilk, Mark

Current causal discovery approaches require restrictive model assumptions or assume access to interventional data to ensure structure identifiability. These assumptions often do not hold in real-world applications, leading to a loss of guarantees and poor accuracy in practice. Recent work has shown that, in the bivariate case, Bayesian model selection can greatly improve accuracy by exchanging restrictive modelling for more flexible assumptions, at the cost of a small probability of error. We extend the Bayesian model selection approach to the important multivariate setting by making the large discrete selection problem scalable through a continuous relaxation. We demonstrate how, for our choice of Bayesian non-parametric model, the Causal Gaussian Process Conditional Density Estimator (CGP-CDE), an adjacency matrix can be constructed from the model hyperparameters. This adjacency matrix is then optimised using the marginal likelihood and an acyclicity regulariser, outputting the maximum a posteriori causal graph. We demonstrate the competitiveness of our approach on both synthetic and real-world datasets, showing it is possible to perform multivariate causal discovery without infeasible assumptions using Bayesian model selection.
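The abstract does not say which acyclicity regulariser the authors use. As an illustration of what such a penalty can look like, the sketch below implements one common choice from the continuous-optimisation literature, the polynomial trace penalty h(A) = tr[(I + (A∘A)/d)^d] - d, which is zero exactly when the weighted graph has no directed cycles. The function name and the choice of penalty are assumptions, not the paper's method.

```python
import numpy as np

def acyclicity_penalty(adjacency):
    """Polynomial trace penalty: 0 for a DAG, positive if any cycle exists."""
    a = np.asarray(adjacency, dtype=float)
    d = a.shape[0]
    # Squaring element-wise makes every edge weight non-negative.
    b = a * a
    # tr[(I + B/d)^d] sums weighted closed walks of length up to d.
    m = np.linalg.matrix_power(np.eye(d) + b / d, d)
    return float(np.trace(m) - d)
```

Because the penalty is a smooth function of the adjacency entries, it can be added to a differentiable objective such as a marginal likelihood and driven towards zero during optimisation.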

artificial intelligence, bayesian inference, machine learning, (13 more...)

2411.10154

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Orozco, Rafael, Erdinc, Huseyin Tuna, Zeng, Yunlin, Louboutin, Mathias, Herrmann, Felix J.

Machine learning-enabled velocity model building with uncertainty quantification

arXiv.org Artificial Intelligence, Nov-14-2024

Accurately characterizing migration velocity models is crucial for a wide range of geophysical applications, from hydrocarbon exploration to monitoring of CO2 sequestration projects. Traditional velocity model building methods such as Full-Waveform Inversion (FWI) are powerful but often struggle with the inherent complexities of the inverse problem, including noise, limited bandwidth, receiver aperture and computational constraints. To address these challenges, we propose a scalable methodology that integrates generative modeling, in the form of Diffusion networks, with physics-informed summary statistics, making it suitable for complicated imaging problems including field datasets. By defining these summary statistics in terms of subsurface-offset image volumes for poor initial velocity models, our approach allows for computationally efficient generation of Bayesian posterior samples for migration velocity models that offer a useful assessment of uncertainty. To validate our approach, we introduce a battery of tests that measure the quality of the inferred velocity models, as well as the quality of the inferred uncertainties. With modern synthetic datasets, we reconfirm gains from using subsurface-image gathers as the conditioning observable. For complex velocity model building involving salt, we propose a new iterative workflow that refines amortized posterior approximations with salt flooding and demonstrate how the uncertainty in the velocity model can be propagated to the final product reverse time migrated images. Finally, we present a proof of concept on field datasets to show that our method can scale to industry-sized problems.

artificial intelligence, machine learning, velocity model, (15 more...)

2411.06651

Country:

North America > United States (0.28)
Atlantic Ocean (0.14)
Europe (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Smith, D. Hudson, Nisbet, Noah, Ehrett, Carl, Tica, Cristina I., Atwell, Madeline M., Weisensee, Katherine E.

Modeling human decomposition: a Bayesian approach

arXiv.org Artificial Intelligence, Nov-14-2024

Environmental and individualistic variables affect the rate of human decomposition in complex ways. These effects complicate the estimation of the postmortem interval (PMI) based on observed decomposition characteristics. In this work, we develop a generative probabilistic model for decomposing human remains based on PMI and a wide range of environmental and individualistic variables. This model explicitly represents the effect of each variable, including PMI, on the appearance of each decomposition characteristic, allowing for direct interpretation of model effects and enabling the use of the model for PMI inference and optimal experimental design. In addition, the probabilistic nature of the model allows for the integration of expert knowledge in the form of prior distributions. We fit this model to a diverse set of 2,529 cases from the GeoFOR dataset. We demonstrate that the model accurately predicts 24 decomposition characteristics with an ROC AUC score of 0.85. Using Bayesian inference techniques, we invert the decomposition model to predict PMI as a function of the observed decomposition characteristics and environmental and individualistic variables, producing an R-squared measure of 71%. Finally, we demonstrate how to use the fitted model to design future experiments that maximize the expected amount of new information about the mechanisms of decomposition using the Expected Information Gain formalism.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2411.09802

Country:

North America > United States > New Mexico (0.04)
North America > United States > Texas (0.04)
Europe > Netherlands (0.04)
Europe > Italy > Molise > Campobasso Province > Campobasso (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Khodak, Mikhail, Mackey, Lester, Chouldechova, Alexandra, Dudík, Miroslav

SureMap: Simultaneous Mean Estimation for Single-Task and Multi-Task Disaggregated Evaluation

arXiv.org Machine Learning, Nov-14-2024

Disaggregated evaluation -- estimation of performance of a machine learning model on different subpopulations -- is a core task when assessing performance and group-fairness of AI systems. A key challenge is that evaluation data is scarce, and subpopulations arising from intersections of attributes (e.g., race, sex, age) are often tiny. Today, it is common for multiple clients to procure the same AI model from a model developer, and the task of disaggregated evaluation is faced by each customer individually. This gives rise to what we call the multi-task disaggregated evaluation problem, wherein multiple clients seek to conduct a disaggregated evaluation of a given model in their own data setting (task). In this work we develop a disaggregated evaluation method called SureMap that has high estimation accuracy for both multi-task and single-task disaggregated evaluations of blackbox models. SureMap's efficiency gains come from (1) transforming the problem into structured simultaneous Gaussian mean estimation and (2) incorporating external data, e.g., from the AI system creator or from their other clients. Our method combines maximum a posteriori (MAP) estimation using a well-chosen prior together with cross-validation-free tuning via Stein's unbiased risk estimate (SURE). We evaluate SureMap on disaggregated evaluation tasks in multiple domains, observing significant accuracy improvements over several strong competitors.
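The combination of MAP estimation with SURE-based tuning can be illustrated on the simplest Gaussian mean problem. The sketch below is not SureMap itself: it assumes independent observations y_i ~ N(mu_i, noise_var), a shared Gaussian prior N(prior_mean, prior_var) on each mean, and a grid search over the prior variance. All function and parameter names are hypothetical.

```python
import numpy as np

def map_shrink(y, noise_var, prior_mean, prior_var):
    """MAP (posterior mean) estimate of each Gaussian mean under a Gaussian prior."""
    b = noise_var / (noise_var + prior_var)  # shrinkage weight in [0, 1]
    return y - b * (y - prior_mean)

def sure(y, noise_var, prior_mean, prior_var):
    """Stein's unbiased estimate of the total squared error of map_shrink."""
    b = noise_var / (noise_var + prior_var)
    # noise term + squared correction + divergence term of the estimator.
    return float(np.sum(noise_var + b ** 2 * (y - prior_mean) ** 2 - 2.0 * noise_var * b))

def tune_prior_var(y, noise_var, prior_mean, grid):
    """Pick the prior variance on the grid that minimises SURE."""
    risks = [sure(y, noise_var, prior_mean, v) for v in grid]
    return grid[int(np.argmin(risks))]
```

Minimising SURE needs only the observed group-level estimates and their noise variances, which is why no held-out data or cross-validation is required to set the prior's strength in this setting.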

artificial intelligence, machine learning, natural language, (20 more...)

2411.0973

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California (0.04)
North America > Puerto Rico (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)