Goto

Collaborating Authors

 Bayesian Learning


Trends in Motion Prediction Toward Deployable and Generalizable Autonomy: A Revisit and Perspectives

arXiv.org Artificial Intelligence

Motion prediction, recently popularized under the term world models, refers to anticipating the future states of agents or the future evolution of a scene, which is rooted in human cognition to bridge perception and decision-making, enabling us to anticipate, adapt, and act within an ever-changing world. It lies at the core of intelligent autonomous systems, such as robotics and self-driving cars, to safely operate in dynamic and human-robot-mixed environments, and also informs broader time-series challenges. With advances in methods, representations, and datasets, the field has seen rapid progress, reflected in rapidly updated benchmark performance. However, when state-of-the-art methods are deployed in the real world, they are often found to struggle to generalize to open-world settings and fall short of deployment standards. This reveals a gap between reality and benchmarks, which are often idealized or ill-posed, and fail to capture real-world complexity. To address the pressing need for problem settings that better reflect real-world challenges and guide future research, this paper focuses on revisiting the generalization and applicability of motion prediction models, with an emphasis on robotics, autonomous driving, and human motion applications. We first provide a comprehensive taxonomy of motion prediction methods, covering representations, modelling methods, application domains, and evaluation protocols. We then revisit two fundamental problems: 1) how to push motion prediction models to be deployable to realistic deployment standards, where motion prediction does not act in a vacuum, but functions as one module of closed-loop autonomy stacks - it takes input from the localization and perception, and informs downstream planning and control.


WATSON-Net: Vetting, Validation, and Analysis of Transits from Space Observations with Neural Networks

arXiv.org Artificial Intelligence

Context. As the number of detected transiting exoplanet candidates continues to grow, the need for robust and scalable automated tools to prioritize or validate them has become increasingly critical. Among the most promising solutions, deep learning models offer the ability to interpret complex diagnostic metrics traditionally used in the vetting process. Aims. In this work, we present WATSON-Net, a new open-source neural network classifier and data preparation package designed to compete with current state-of-the-art tools for vetting and validation of transiting exoplanet signals from space-based missions. Methods. Trained on Kepler Q1-Q17 DR25 data using 10-fold cross-validation, WATSON-Net produces ten independent models, each evaluated on dedicated validation and test sets. The ten models are calibrated and prepared to be extensible for TESS data by standardizing the input pipeline, allowing for performance assessment across different space missions. Results. For Kepler targets, WATSON-Net achieves a recall-at-precision of 0.99 (R@P0.99) of 0.903, ranking second, with only the ExoMiner network performing better (R@P0.99 = 0.936). For TESS signals, WATSON-Net emerges as the best-performing non-fine-tuned machine learning classifier, achieving a precision of 0.93 and a recall of 0.76 on a test set comprising confirmed planets and false positives. Both the model and its data preparation tools are publicly available in the dearwatson Python package, fully open-source and integrated into the vetting engine of the SHERLOCK pipeline.


Back to the Future: The Role of Past and Future Context Predictability in Incremental Language Production

arXiv.org Artificial Intelligence

Contextual predictability shapes both the form and choice of words in online language production. The effects of the predictability of a word given its previous context are generally well-understood in both production and comprehension, but studies of naturalistic production have also revealed a poorly-understood backward predictability effect of a word given its future context, which may be related to future planning. Here, in two studies of naturalistic speech corpora, we investigate backward predictability effects using improved measures and more powerful language models, introducing a new principled and conceptually motivated information-theoretic predictability measure that integrates predictability from both the future and the past context. Our first study revisits classic predictability effects on word duration. Our second study investigates substitution errors within a generative framework that independently models the effects of lexical, contextual, and communicative factors on word choice, while predicting the actual words that surface as speech errors. We find that our proposed conceptually-motivated alternative to backward predictability yields qualitatively similar effects across both studies. Through a fine-grained analysis of substitution errors, we further show that different kinds of errors are suggestive of how speakers prioritize form, meaning, and context-based information during lexical planning. Together, these findings illuminate the functional roles of past and future context in how speakers encode and choose words, offering a bridge between contextual predictability effects and the mechanisms of sentence planning.


Veli: Unsupervised Method and Unified Benchmark for Low-Cost Air Quality Sensor Correction

arXiv.org Artificial Intelligence

Urban air pollution is a major health crisis causing millions of premature deaths annually, underscoring the urgent need for accurate and scalable monitoring of air quality (AQ). While low-cost sensors (LCS) offer a scalable alternative to expensive reference-grade stations, their readings are affected by drift, calibration errors, and environmental interference. To address these challenges, we introduce Veli (Reference-free Variational Estimation via Latent Inference), an unsupervised Bayesian model that leverages variational inference to correct LCS readings without requiring co-location with reference stations, eliminating a major deployment barrier. Specifically, Veli constructs a disentangled representation of the LCS readings, effectively separating the true pollutant reading from the sensor noise. To build our model and address the lack of standardized benchmarks in AQ monitoring, we also introduce the Air Quality Sensor Data Repository (AQ-SDR). AQ-SDR is the largest AQ sensor benchmark to date, with readings from 23,737 LCS and reference stations across multiple regions. Veli demonstrates strong generalization across both in-distribution and out-of-distribution settings, effectively handling sensor drift and erratic sensor behavior. Code for model and dataset will be made public when this paper is published.


Bridging Synthetic and Real-World Domains: A Human-in-the-Loop Weakly-Supervised Framework for Industrial Toxic Emission Segmentation

arXiv.org Artificial Intelligence

Industrial smoke segmentation is critical for air-quality monitoring and environmental protection but is often hampered by the high cost and scarcity of pixel-level annotations in real-world settings. We introduce CEDANet, a human-in-the-loop, class-aware domain adaptation framework that uniquely integrates weak, citizen-provided video-level labels with adversarial feature alignment. Specifically, we refine pseudo-labels generated by a source-trained segmentation model using citizen votes, and employ class-specific domain discriminators to transfer rich source-domain representations to the industrial domain. Comprehensive experiments on SMOKE5K and custom IJmond datasets demonstrate that CEDANet achieves an F1-score of 0.414 and a smoke-class IoU of 0.261 with citizen feedback, vastly outperforming the baseline model, which scored 0.083 and 0.043 respectively. This represents a five-fold increase in F1-score and a six-fold increase in smoke-class IoU. Notably, CEDANet with citizen-constrained pseudo-labels achieves performance comparable to the same architecture trained on limited 100 fully annotated images with F1-score of 0.418 and IoU of 0.264, demonstrating its ability to reach small-sampled fully supervised-level accuracy without target-domain annotations. Our research validates the scalability and cost-efficiency of combining citizen science with weakly supervised domain adaptation, offering a practical solution for complex, data-scarce environmental monitoring applications.


Misaligned by Design: Incentive Failures in Machine Learning

arXiv.org Artificial Intelligence

The cost of error in many high-stakes settings is asymmetric: misdiagnosing pneumonia when absent is an inconvenience, but failing to detect it when present can be life-threatening. Because of this, artificial intelligence (AI) models used to assist such decisions are frequently trained with asymmetric loss functions that incorporate human decision-makers' trade-offs between false positives and false negatives. In two focal applications, we show that this standard alignment practice can backfire. In both cases, it would be better to train the machine learning model with a loss function that ignores the human's objective and then adjust predictions ex post according to that objective. We rationalize this result using an economic model of incentive design with endogenous information acquisition. The key insight from our theoretical framework is that machine classifiers perform not one but two incentivized tasks: choosing how to classify and learning how to classify. We show that while the adjustments engineers use correctly incentivize choosing, they can simultaneously reduce the incentives to learn. Our formal treatment of the problem reveals that methods embraced for their intuitive appeal can in fact misalign human and machine objectives in predictable ways.


MULTI-LF: A Continuous Learning Framework for Real-Time Malicious Traffic Detection in Multi-Environment Networks

arXiv.org Artificial Intelligence

Multi-environment (M-En) networks integrate diverse traffic sources, including Internet of Things (IoT) and traditional computing systems, creating complex and evolving conditions for malicious traffic detection. Existing machine learning (ML)-based approaches, typically trained on static single-domain datasets, often fail to generalize across heterogeneous network environments. To address this gap, we develop a realistic Docker-NS3-based testbed that emulates both IoT and traditional traffic conditions, enabling the generation and capture of live, labeled network flows. The resulting M-En Dataset combines this traffic with curated public PCAP traces to provide comprehensive coverage of benign and malicious behaviors. Building on this foundation, we propose Multi-LF, a real-time continuous learning framework that combines a lightweight model (M1) for rapid detection with a deeper model (M2) for high-confidence refinement and adaptation. A confidence-based coordination mechanism enhances efficiency without compromising accuracy, while weight interpolation mitigates catastrophic forgetting during continuous updates. Features extracted at 1-second intervals capture fine-grained temporal patterns, enabling early recognition of evolving attack behaviors. Implemented and evaluated within the Docker-NS3 testbed on live traffic, Multi-LF achieves an accuracy of 0.999 while requiring human intervention for only 0.0026 percent of packets, demonstrating its effectiveness and practicality for real-time malicious traffic detection in heterogeneous network environments.


Adversarial Bias: Data Poisoning Attacks on Fairness

arXiv.org Artificial Intelligence

Abstract--With the growing adoption of AI and machine learning systems in real-world applications, ensuring their fairness has become increasingly critical. There is relatively little research on fairness vulnerability, i.e., how an AI system's fairness can be intentionally compromised. In this work, we first provide a theoretical analysis demonstrating that a simple adversarial poisoning strategy is sufficient to induce maximally unfair behavior in naive Bayes classifiers. Our key idea is to strategically inject a small fraction of carefully crafted adversarial data points into the training set, biasing the model's decision boundary to disproportionately affect a protected group while preserving generalizable performance. T o illustrate the practical effectiveness of our method, we conduct experiments across several benchmark datasets and models. We find that our attack significantly outperforms existing methods in degrading fairness metrics across multiple models and datasets, often achieving substantially higher levels of unfairness with a comparable or only slightly worse impact on accuracy. Notably, our method proves effective on a wide range of models, in contrast to prior work, demonstrating a robust and potent approach to compromising the fairness of machine learning systems. Ensuring fairness in AI and machine learning systems is a critical concern alongside their growing real-world deployment.


How Artificial Intelligence Leads to Knowledge Why: An Inquiry Inspired by Aristotle's Posterior Analytics

arXiv.org Artificial Intelligence

Bayesian networks and causal models provide frameworks for handling queries about external interventions and counterfactuals, enabling tasks that go beyond what probability distributions alone can address. While these formalisms are often informally described as capturing causal knowledge, there is a lack of a formal theory characterizing the type of knowledge required to predict the effects of external interventions. This work introduces the theoretical framework of causal systems to clarify Aristotle's distinction between knowledge that and knowledge why within artificial intelligence. By interpreting existing artificial intelligence technologies as causal systems, it investigates the corresponding types of knowledge. Furthermore, it argues that predicting the effects of external interventions is feasible only with knowledge why, providing a more precise understanding of the knowledge necessary for such tasks.


Robust Experimental Design via Generalised Bayesian Inference

arXiv.org Machine Learning

Bayesian optimal experimental design is a principled framework for conducting experiments that leverages Bayesian inference to quantify how much information one can expect to gain from selecting a certain design. However, accurate Bayesian inference relies on the assumption that one's statistical model of the data-generating process is correctly specified. If this assumption is violated, Bayesian methods can lead to poor inference and estimates of information gain. Generalised Bayesian (or Gibbs) inference is a more robust probabilistic inference framework that replaces the likelihood in the Bayesian update by a suitable loss function. In this work, we present Generalised Bayesian Optimal Experimental Design (GBOED), an extension of Gibbs inference to the experimental design setting which achieves robustness in both design and inference. Using an extended information-theoretic framework, we derive a new acquisition function, the Gibbs expected information gain (Gibbs EIG). Our empirical results demonstrate that GBOED enhances robustness to outliers and incorrect assumptions about the outcome noise distribution.