Bayesian Learning
Pose2Gest: A Few-Shot Model-Free Approach Applied In South Indian Classical Dance Gesture Recognition
Raju, Kavitha, Warrier, Nandini J., Madhavan, Manu, C., Selvi, Warrier, Arun B., Kumar, Thulasi
The classical dances of India use a set of hand gestures known as Mudras, which serve as the foundational elements of their posture vocabulary. Identifying these mudras is a primary task in digitizing dance performances. With Kathakali, a dance-drama, as the focus, this work addresses mudra recognition by framing it as a 24-class classification problem and proposes a novel vector-similarity-based approach leveraging pose estimation techniques. This method obviates the need for extensive training or fine-tuning, thus mitigating the issue of limited data availability common in similar AI applications. With an accuracy of 92%, our approach demonstrates comparable or superior performance to existing model-training-based methodologies in this domain. Notably, it remains effective even with small datasets comprising just 1 or 5 samples, albeit with slightly diminished performance. Furthermore, our system supports processing images, videos, and real-time streams, accommodating both hand-cropped and full-body images. As part of this research, we have curated and released a publicly accessible Hasta Mudra dataset, which is applicable to multiple South Indian art forms including Kathakali. The implementation of the proposed method is also made available as a web application.
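The vector-similarity idea can be illustrated with a minimal sketch. It assumes each hand pose has already been turned into a fixed-length feature vector by some pose estimator; the feature values, the choice of cosine similarity, and the function names below are illustrative assumptions, not the authors' implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)

def classify_by_similarity(query, references):
    """Return (label, score) of the reference vector most similar to `query`.

    references: dict mapping a class label to a short list of feature
    vectors (the few "shots" available for that class).
    """
    best_label, best_score = None, -2.0  # cosine similarity is >= -1
    for label, vectors in references.items():
        for vec in vectors:
            score = cosine_similarity(query, vec)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

# Toy 4-dimensional features (hypothetical values, one shot per class).
references = {
    "pataka": [[1.0, 0.9, 0.1, 0.0]],
    "mushti": [[0.1, 0.0, 1.0, 0.9]],
}
label, score = classify_by_similarity([0.9, 1.0, 0.0, 0.1], references)
```

Because classification is a nearest-reference lookup, adding a class or a sample only means adding a vector to `references`; no model is retrained.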
Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data
Active learning optimizes the exploration of large parameter spaces by strategically selecting which experiments or simulations to conduct, thus reducing resource consumption and potentially accelerating scientific discovery. A key component of this approach is a probabilistic surrogate model, typically a Gaussian Process (GP), which approximates an unknown functional relationship between control parameters and a target property. However, conventional GPs often struggle when applied to systems with discontinuities and non-stationarities, prompting the exploration of alternative models. This limitation becomes particularly relevant in physical science problems, which are often characterized by abrupt transitions between different system states and rapid changes in physical property behavior. Fully Bayesian Neural Networks (FBNNs) serve as a promising substitute, treating all neural network weights probabilistically and leveraging advanced Markov Chain Monte Carlo techniques for direct sampling from the posterior distribution. This approach enables FBNNs to provide reliable predictive distributions, crucial for making informed decisions under uncertainty in the active learning setting. Although FBNNs are traditionally considered too computationally expensive for 'big data' applications, many problems in the physical sciences involve small amounts of data in relatively low-dimensional parameter spaces. Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime, highlighting their potential to enhance predictive accuracy and reliability on test functions relevant to problems in physical sciences.
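One step of such an active learning loop can be sketched as follows. The sketch assumes that posterior samples of the network have already been drawn (for example by an MCMC sampler) and evaluated at every candidate point, and that the next measurement is taken where predictive variance is largest. The abstract does not name the acquisition rule, so maximum variance is an assumption, and all names are hypothetical.

```python
from statistics import mean, pvariance

def predictive_summary(sample_predictions):
    """Per-candidate predictive mean and variance.

    sample_predictions[s][i] is the prediction of posterior sample s
    at candidate point i.
    """
    n_candidates = len(sample_predictions[0])
    means, variances = [], []
    for i in range(n_candidates):
        column = [row[i] for row in sample_predictions]
        means.append(mean(column))
        variances.append(pvariance(column))
    return means, variances

def select_next(candidates, sample_predictions):
    """Pick the candidate where the posterior samples disagree the most."""
    _, variances = predictive_summary(sample_predictions)
    best = max(range(len(candidates)), key=lambda i: variances[i])
    return candidates[best]

# Three candidate inputs, three posterior samples (toy numbers).
candidates = [0.0, 0.5, 1.0]
sample_predictions = [
    [1.0, 0.2, -1.0],
    [1.1, 0.9, -1.0],
    [0.9, -0.4, -1.1],
]
next_x = select_next(candidates, sample_predictions)
```

In this toy case the samples agree near 0.0 and 1.0 but disagree at 0.5, which is where a discontinuity between the two regimes would sit, so that point is queried next.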
RuleFuser: Injecting Rules in Evidential Networks for Robust Out-of-Distribution Trajectory Prediction
Patrikar, Jay, Veer, Sushant, Sharma, Apoorva, Pavone, Marco, Scherer, Sebastian
Modern neural trajectory predictors in autonomous driving are developed using imitation learning (IL) from driving logs. Although IL benefits from its ability to glean nuanced and multi-modal human driving behaviors from large datasets, the resulting predictors often struggle with out-of-distribution (OOD) scenarios and with traffic rule compliance. On the other hand, classical rule-based predictors, by design, can predict traffic-rule-satisfying behaviors while being robust to OOD scenarios, but these predictors fail to capture nuances in agent-to-agent interactions and human drivers' intent. In this paper, we present RuleFuser, a posterior-net inspired evidential framework that combines neural predictors with classical rule-based predictors to draw on the complementary benefits of both, thereby striking a balance between performance and traffic rule compliance. The efficacy of our approach is demonstrated on the real-world nuPlan dataset, where RuleFuser leverages the higher performance of the neural predictor in in-distribution (ID) scenarios and the higher safety offered by the rule-based predictor in OOD scenarios.
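The fallback behavior described here can be illustrated with a generic Dirichlet-style evidential update. It is a sketch of the general posterior-network idea (prior pseudo-counts plus data-dependent evidence), not RuleFuser's actual architecture. The function name, the `prior_strength` parameter, and the toy numbers are assumptions.

```python
def fuse(rule_prior, neural_evidence, prior_strength=1.0):
    """Fuse a rule-based prior with neural evidence over K candidate modes.

    rule_prior: probabilities from a rule-based predictor (sums to 1).
    neural_evidence: non-negative pseudo-counts from a neural predictor,
        assumed large for familiar scenes and near zero for unfamiliar ones.
    Returns the mean of the resulting Dirichlet distribution.
    """
    alpha = [prior_strength * p + e for p, e in zip(rule_prior, neural_evidence)]
    total = sum(alpha)
    return [a / total for a in alpha]

rule_prior = [0.8, 0.1, 0.1]  # rules favor mode 0 (toy numbers)

# In-distribution: strong evidence for mode 1 overrides the prior.
in_dist = fuse(rule_prior, [0.0, 50.0, 0.0])

# Out-of-distribution: almost no evidence, so the rule prior dominates.
out_dist = fuse(rule_prior, [0.0, 0.01, 0.0])
```

The design point is that no explicit OOD switch is needed: the amount of evidence alone decides how far the prediction moves away from the rule-based prior.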
A Notion of Uniqueness for the Adversarial Bayes Classifier
We propose a new notion of uniqueness for the adversarial Bayes classifier in the setting of binary classification. Analyzing this concept produces a simple procedure for computing all adversarial Bayes classifiers for a well-motivated family of one-dimensional data distributions. This characterization is then leveraged to show that, as the perturbation radius increases, certain regularity properties of adversarial Bayes classifiers improve. Various examples demonstrate that the boundary of the adversarial Bayes classifier frequently lies near the boundary of the Bayes classifier.
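A toy one-dimensional case makes the last observation concrete. The sketch below brute-forces the best threshold classifier for two equal-variance Gaussian classes under a perturbation radius `eps`. Restricting the search to single-threshold classifiers and choosing Gaussian classes are simplifying assumptions made here; this is not the paper's procedure.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def adversarial_risk(threshold, eps, mean0=-1.0, mean1=1.0, std=1.0, prior1=0.5):
    """Adversarial 0-1 risk of the rule 'predict class 1 iff x > threshold'.

    The adversary may move each point by at most eps before classification.
    """
    # A class-1 point is misclassified if it can be pushed to or below the threshold.
    err1 = normal_cdf(threshold + eps, mean1, std)
    # A class-0 point is misclassified if it can be pushed above the threshold.
    err0 = 1.0 - normal_cdf(threshold - eps, mean0, std)
    return prior1 * err1 + (1.0 - prior1) * err0

def best_threshold(eps, lo=-3.0, hi=3.0, steps=601):
    """Grid search for the threshold with the smallest adversarial risk."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda t: adversarial_risk(t, eps))

t_standard = best_threshold(0.0)     # ordinary Bayes threshold
t_adversarial = best_threshold(0.3)  # threshold under perturbations
```

In this symmetric example both thresholds sit at the midpoint between the class means, while the attainable risk is higher in the adversarial case.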
Model orthogonalization and Bayesian forecast mixing via Principal Component Analysis
Giuliani, Pablo, Godbey, Kyle, Kejzlar, Vojtech, Nazarewicz, Witold
One can improve predictability in the unknown domain by combining forecasts of imperfect complex computational models using a Bayesian statistical machine learning framework. In many cases, however, the models used in the mixing process are similar. In addition to contaminating the model space, the existence of such similar, or even redundant, models during the multimodeling process can result in misinterpretation of results and deterioration of predictive performance. In this work we describe a method based on Principal Component Analysis that eliminates model redundancy. We show that by adding model orthogonalization to the proposed Bayesian Model Combination framework, one can arrive at better prediction accuracy and reach excellent uncertainty quantification performance.
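One common way to write down a PCA-based orthogonalization is sketched below. The notation ($f_k$, $\bar f$, $\phi_j$, $b_j$, $m$) is introduced here for illustration and is an assumption about the general approach, not a transcription of the authors' formulation.

```latex
% K models evaluated on n training inputs give prediction vectors f_1, ..., f_K in R^n.
\bar f = \frac{1}{K}\sum_{k=1}^{K} f_k ,
\qquad
F = \bigl[\, f_1 - \bar f,\; \dots,\; f_K - \bar f \,\bigr] \in \mathbb{R}^{n \times K} .

% Singular value decomposition; the columns of U are mutually orthogonal.
F = U S V^{\top} ,
\qquad
\phi_j = U_{:,j}, \quad j = 1, \dots, m \le K .

% Combined forecast: the mean model plus a few orthogonal components,
% with weights b_j inferred in a Bayesian fashion from the data.
y(x) \approx \bar f(x) + \sum_{j=1}^{m} b_j \, \phi_j(x) .
```

Nearly redundant models contribute almost nothing beyond the leading components, so truncating at a small $m$ removes the redundancy before the weights are inferred.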
TRABSA: Interpretable Sentiment Analysis of Tweets using Attention-based BiLSTM and Twitter-RoBERTa
Jahin, Md Abrar, Shovon, Md Sakib Hossain, Mridha, M. F., Islam, Md Rashedul, Watanobe, Yutaka
Sentiment analysis is crucial for understanding public opinion and consumer behavior. Existing models face challenges with linguistic diversity, generalizability, and explainability. We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address these challenges. Leveraging a RoBERTa model trained on 124M tweets, we bridge gaps in sentiment analysis benchmarks, ensuring state-of-the-art accuracy. Augmenting datasets with tweets from 32 countries and US states, we compare six word-embedding techniques and three lexicon-based labeling techniques, selecting the best-performing combination. TRABSA outperforms traditional ML and deep learning models with 94% accuracy and significant precision, recall, and F1-score gains. Evaluation across diverse datasets demonstrates consistent superiority and generalizability. SHAP and LIME analyses enhance interpretability, improving confidence in predictions. Our study facilitates pandemic resource management, aiding resource planning, policy formation, and vaccination tactics.
Information Cascade Prediction under Public Emergencies: A Survey
Zhang, Qi, Wang, Guang, Lin, Li, Xia, Kaiwen, Wang, Shuai
Public emergencies are unexpected events that occur suddenly and result in, or have the potential to result in, significant casualties, property damage, ecological harm, and serious social consequences [147]. Throughout history, natural disasters (such as earthquakes, tsunamis, volcanic eruptions, storms, floods, avalanches, droughts, and wildfires) and accident disasters (including environmental disasters, traffic accidents, explosions, and gas leaks) have caused numerous fatalities, infrastructure damage, and extensive economic loss. According to the Emergencies Database (EM-DAT), between 2000 and 2023, 5,922 public emergencies occurred, leading to 480,000 casualties and 3.5 trillion in economic losses, as shown in Figure 1 [1]. Therefore, it is increasingly vital to use data, information, and various models to predict potential public emergencies that jeopardize public safety and well-being. Predicting the cascade of information in the event deduction process under public emergencies assists governments, organizations, and individuals in taking proactive measures to mitigate the impact of emergencies and minimize damage. Public emergencies are classified into different categories, the most common being (1) natural disasters and (2) accident disasters.
Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset
Hager, Philipp, Deffayet, Romain, Renders, Jean-Michel, Zoeter, Onno, de Rijke, Maarten
Unbiased learning-to-rank (ULTR) is a well-established framework for learning from user clicks, which are often biased by the ranker collecting the data. While theoretically justified and extensively tested in simulation, ULTR techniques lack empirical validation, especially on modern search engines. The Baidu-ULTR dataset released for the WSDM Cup 2023, collected from Baidu's search engine, offers a rare opportunity to assess the real-world performance of prominent ULTR techniques. Despite multiple submissions during the WSDM Cup 2023 and the subsequent NTCIR ULTRE-2 task, it remains unclear whether the observed improvements stem from applying ULTR or other learning techniques. In this work, we revisit and extend the available experiments on the Baidu-ULTR dataset. We find that standard unbiased learning-to-rank techniques robustly improve click predictions but struggle to consistently improve ranking performance, especially considering the stark differences obtained by choice of ranking loss and query-document features. Our experiments reveal that gains in click prediction do not necessarily translate to enhanced ranking performance on expert relevance annotations, implying that conclusions strongly depend on how success is measured in this benchmark.
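The kind of correction ULTR methods apply can be illustrated with inverse propensity scoring under a position-based examination model. The abstract does not single out this estimator, so the model, the assumption that examination propensities are known, and the toy click log are all illustrative.

```python
def ips_relevance(log, propensity):
    """Estimate per-document relevance from position-biased clicks.

    log: iterable of (doc_id, rank, clicked) impressions.
    propensity: dict mapping rank -> probability that the rank is examined
        (assumed known here; in practice it has to be estimated).
    Each click is up-weighted by 1 / propensity of the rank where it occurred.
    """
    weighted_clicks, impressions = {}, {}
    for doc_id, rank, clicked in log:
        impressions[doc_id] = impressions.get(doc_id, 0) + 1
        if clicked:
            weighted_clicks[doc_id] = (
                weighted_clicks.get(doc_id, 0.0) + 1.0 / propensity[rank]
            )
    return {d: weighted_clicks.get(d, 0.0) / n for d, n in impressions.items()}

# Toy log: A is always shown at rank 1, B always at rank 2.
propensity = {1: 1.0, 2: 0.5}
log = [
    ("A", 1, True), ("A", 1, True), ("A", 1, False), ("A", 1, False),
    ("B", 2, True), ("B", 2, True), ("B", 2, False), ("B", 2, False),
]
estimates = ips_relevance(log, propensity)
```

Both documents have the same raw click-through rate, but B earned its clicks from a position that is examined half as often, so the corrected estimate ranks B above A.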
Adversarial Consistency and the Uniqueness of the Adversarial Bayes Classifier
Minimizing an adversarial surrogate risk is a common technique for learning robust classifiers. Prior work showed that convex surrogate losses are not statistically consistent in the adversarial context; in other words, a minimizing sequence of the adversarial surrogate risk will not necessarily minimize the adversarial classification error. We connect the consistency of adversarial surrogate losses to properties of minimizers of the adversarial classification risk, known as adversarial Bayes classifiers. Specifically, under reasonable distributional assumptions, a convex surrogate loss is statistically consistent for adversarial learning if and only if the adversarial Bayes classifier satisfies a certain notion of uniqueness.
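For reference, the two risks and the consistency property can be written as follows. The notation is a standard choice assumed here rather than taken from the abstract: $P_0$ and $P_1$ are the class-conditional measures weighted by their priors, $\epsilon$ is the perturbation radius, $\phi$ is the surrogate loss, and $f$ is a real-valued score whose sign gives the predicted class.

```latex
% Adversarial classification risk of a score function f.
R^{\epsilon}(f) =
  \int \sup_{\|x' - x\| \le \epsilon} \mathbf{1}\bigl[f(x') \le 0\bigr] \, dP_1(x)
  + \int \sup_{\|x' - x\| \le \epsilon} \mathbf{1}\bigl[f(x') > 0\bigr] \, dP_0(x)

% Adversarial surrogate risk for a loss phi.
R^{\epsilon}_{\phi}(f) =
  \int \sup_{\|x' - x\| \le \epsilon} \phi\bigl(f(x')\bigr) \, dP_1(x)
  + \int \sup_{\|x' - x\| \le \epsilon} \phi\bigl(-f(x')\bigr) \, dP_0(x)

% Consistency: every minimizing sequence of the surrogate risk
% also minimizes the classification risk.
R^{\epsilon}_{\phi}(f_n) \to \inf_{f} R^{\epsilon}_{\phi}(f)
  \;\Longrightarrow\;
R^{\epsilon}(f_n) \to \inf_{f} R^{\epsilon}(f)
```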
A Brief Introduction to Causal Inference in Machine Learning
This is a lecture note produced for DS-GA 3001.003 "Special Topics in DS - Causal Inference in Machine Learning" at the Center for Data Science, New York University in Spring 2024. The course targets master's and PhD-level students who have a basic background in machine learning but have not previously been exposed to causal inference or causal reasoning in general. In particular, it aims to help such students expand their view and knowledge of machine learning to incorporate causal reasoning, as this aspect is at the core of so-called out-of-distribution generalization (or lack thereof).