Goto

Collaborating Authors

 Overview


Epistemic Deep Learning: Enabling Machine Learning Models to Know When They Do Not Know

arXiv.org Artificial Intelligence

Machine learning has achieved remarkable successes, yet its deployment in safety-critical domains remains hindered by an inherent inability to manage uncertainty, resulting in overconfident and unreliable predictions when models encounter out-of-distribution data, adversarial perturbations, or naturally fluctuating environments. This thesis, titled Epistemic Deep Learning: Enabling Machine Learning Models to 'Know When They Do Not Know', addresses these critical challenges by advancing the paradigm of Epistemic Artificial Intelligence, which explicitly models and quantifies epistemic uncertainty: the uncertainty arising from limited, biased, or incomplete training data, as opposed to the irreducible randomness of aleatoric uncertainty, thereby empowering models to acknowledge their limitations and refrain from overconfident decisions when uncertainty is high. Central to this work is the development of the Random-Set Neural Network (RS-NN), a novel methodology that leverages random set theory to predict belief functions over sets of classes, capturing the extent of epistemic uncertainty through the width of associated credal sets, applications of RS-NN, including its adaptation to Large Language Models (LLMs) and its deployment in weather classification for autonomous racing. In addition, the thesis proposes a unified evaluation framework for uncertainty-aware classifiers. Extensive experiments validate that integrating epistemic awareness into deep learning not only mitigates the risks associated with overconfident predictions but also lays the foundation for a paradigm shift in artificial intelligence, where the ability to 'know when it does not know' becomes a hallmark of robust and dependable systems. The title encapsulates the core philosophy of this work, emphasizing that true intelligence involves recognizing and managing the limits of one's own knowledge.


RotaTouille: Rotation Equivariant Deep Learning for Contours

arXiv.org Artificial Intelligence

Contours or closed planar curves are common in many domains. For example, they appear as object boundaries in computer vision, isolines in meteorology, and the orbits of rotating machinery. In many cases when learning from contour data, planar rotations of the input will result in correspondingly rotated outputs. It is therefore desirable that deep learning models be rotationally equivariant. In addition, contours are typically represented as an ordered sequence of edge points, where the choice of starting point is arbitrary. It is therefore also desirable for deep learning methods to be equivariant under cyclic shifts. We present RotaTouille, a deep learning framework for learning from contour data that achieves both rotation and cyclic shift equivariance through complex-valued circular convolution. We further introduce and characterize equivariant non-linearities, coarsening layers, and global pooling layers to obtain invariant representations for downstream tasks. Finally, we demonstrate the effectiveness of RotaTouille through experiments in shape classification, reconstruction, and contour regression.


Rethinking Inference Placement for Deep Learning across Edge and Cloud Platforms: A Multi-Objective Optimization Perspective and Future Directions

arXiv.org Artificial Intelligence

Edge intelligent applications like VR/AR and language model based chatbots have become widespread with the rapid expansion of IoT and mobile devices. However, constrained edge devices often cannot serve the increasingly large and complex deep learning (DL) models. To mitigate these challenges, researchers have proposed optimizing and offloading partitions of DL models among user devices, edge servers, and the cloud. In this setting, users can take advantage of different services to support their intelligent applications. For example, edge resources offer low response latency. In contrast, cloud platforms provide low monetary cost computation resources for computation-intensive workloads. However, communication between DL model partitions can introduce transmission bottlenecks and pose risks of data leakage. Recent research aims to balance accuracy, computation delay, transmission delay, and privacy concerns. They address these issues with model compression, model distillation, transmission compression, and model architecture adaptations, including internal classifiers. This survey contextualizes the state-of-the-art model offloading methods and model adaptation techniques by studying their implication to a multi-objective optimization comprising inference latency, data privacy, and resource monetary cost.


A Review of End-to-End Precipitation Prediction Using Remote Sensing Data: from Divination to Machine Learning

arXiv.org Artificial Intelligence

Precipitation prediction has undergone a profound transformation -- from early symbolic and empirical methods rooted in divination and observation, to modern technologies based on atmospheric physics and artificial intelligence. This review traces the historical and technological evolution of precipitation forecasting, presenting a survey about end-to-end precipitation prediction technologies that spans ancient practices, the foundations of meteorological science, the rise of numerical weather prediction (NWP), and the emergence of machine learning (ML) and deep learning (DL) models. We first explore traditional and indigenous forecasting methods, then describe the development of physical modeling and statistical frameworks that underpin contemporary operational forecasting. Particular emphasis is placed on recent advances in neural network-based approaches, including automated deep learning, interpretability-driven design, and hybrid physical-data models. By compositing research across multiple eras and paradigms, this review not only depicts the history of end-to-end precipitation prediction but also outlines future directions in next generation forecasting systems.


Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models

arXiv.org Artificial Intelligence

Concept erasure in text-to-image diffusion models is crucial for mitigating harmful content, yet existing methods often compromise generative quality. We introduce Semantic Surgery, a novel training-free, zero-shot framework for concept erasure that operates directly on text embeddings before the diffusion process. It dynamically estimates the presence of target concepts in a prompt and performs a calibrated vector subtraction to neutralize their influence at the source, enhancing both erasure completeness and locality. The framework includes a Co-Occurrence Encoding module for robust multi-concept erasure and a visual feedback loop to address latent concept persistence. As a training-free method, Semantic Surgery adapts dynamically to each prompt, ensuring precise interventions. Extensive experiments on object, explicit content, artistic style, and multi-celebrity erasure tasks show our method significantly outperforms state-of-the-art approaches. We achieve superior completeness and robustness while preserving locality and image quality (e.g., 93.58 H-score in object erasure, reducing explicit content to just 1 instance, and 8.09 H_a in style erasure with no quality degradation). This robustness also allows our framework to function as a built-in threat detection system, offering a practical solution for safer text-to-image generation.


Critical Insights into Leading Conversational AI Models

arXiv.org Artificial Intelligence

Big Language Models (LLMs) are changing the way businesses use software, the way people live their lives and the way industries work. Companies like Google, High-Flyer, Anthropic, OpenAI and Meta are making better LLMs. So, it's crucial to look at how each model is different in terms of performance, moral behaviour and usability, as these differences are based on the different ideas that built them. This study compares five top LLMs: Google's Gemini, High-Flyer's DeepSeek, Anthropic's Claude, OpenAI's GPT models and Meta's LLaMA. It performs this by analysing three important factors: Performance and Accuracy, Ethics and Bias Mitigation and Usability and Integration. It was found that Claude has good moral reasoning, Gemini is better at multimodal capabilities and has strong ethical frameworks. DeepSeek is great at reasoning based on facts, LLaMA is good for open applications and ChatGPT delivers balanced performance with a focus on usage. It was concluded that these models are different in terms of how well they work, how easy they are to use and how they treat people ethically, making it a point that each model should be utilised by the user in a way that makes the most of its strengths.


A roadmap for curvature-based geometric data analysis and learning

arXiv.org Artificial Intelligence

Geometric data analysis and learning has emerged as a distinct and rapidly developing research area, increasingly recognized for its effectiveness across diverse applications. At the heart of this field lies curvature, a powerful and interpretable concept that captures intrinsic geometric structure and underpins numerous tasks, from community detection to geometric deep learning. A wide range of discrete curvature models have been proposed for various data representations, including graphs, simplicial complexes, cubical complexes, and point clouds sampled from manifolds. These models not only provide efficient characterizations of data geometry but also constitute essential components in geometric learning frameworks. In this paper, we present the first comprehensive review of existing discrete curvature models, covering their mathematical foundations, computational formulations, and practical applications in data analysis and learning. In particular, we discuss discrete curvature from both Riemannian and metric geometry perspectives and propose a systematic pipeline for curvature-driven data analysis. We further examine the corresponding computational algorithms across different data representations, offering detailed comparisons and insights. Finally, we review state-of-the-art applications of curvature in both supervised and unsupervised learning. This survey provides a conceptual and practical roadmap for researchers to gain a better understanding of discrete curvature as a fundamental tool for geometric understanding and learning.


A short methodological review on social robot navigation benchmarking

arXiv.org Artificial Intelligence

Social Robot Navigation is the skill that allows robots to move efficiently in human-populated environments while ensuring safety, comfort, and trust. Unlike other areas of research, the scientific community has not yet achieved an agreement on how Social Robot Navigation should be benchmarked. This is notably important, as the lack of a de facto standard to benchmark Social Robot Navigation can hinder the progress of the field and may lead to contradicting conclusions. Motivated by this gap, we contribute with a short review focused exclusively on benchmarking trends in the period from January 2020 to July 2025. Of the 130 papers identified by our search using IEEE Xplore, we analysed the 85 papers that met the criteria of the review. This review addresses the metrics used in the literature for benchmarking purposes, the algorithms employed in such benchmarks, the use of human surveys for benchmarking, and how conclusions are drawn from the benchmarking results, when applicable.


A Novel Multi-Timescale Stability-Preserving Hierarchical Reinforcement Learning Controller Framework for Adaptive Control in High-Dimensional Dynamical Systems

arXiv.org Artificial Intelligence

Controlling high-dimensional stochastic systems, critical in robotics, autonomous vehicles, and hyperchaotic systems, faces the curse of dimensionality, lacks temporal abstraction, and often fails to ensure stochastic stability. To overcome these limitations, this study introduces the Multi-Timescale Lyapunov-Constrained Hierarchical Reinforcement Learning (MTLHRL) framework. MTLHRL integrates a hierarchical policy within a semi-Markov Decision Process (SMDP), featuring a high-level policy for strategic planning and a low-level policy for reactive control, which effectively manages complex, multi-timescale decision-making and reduces dimensionality overhead. Stability is rigorously enforced using a neural Lyapunov function optimized via Lagrangian relaxation and multi-timescale actor-critic updates, ensuring mean-square boundedness or asymptotic stability in the face of stochastic dynamics. The framework promotes efficient and reliable learning through trust-region constraints and decoupled optimization. Extensive simulations on an 8D hyperchaotic system and a 5-DOF robotic manipulator demonstrate MTLHRL's empirical superiority. It significantly outperforms baseline methods in both stability and performance, recording the lowest error indices (e.g., Integral Absolute Error (IAE): 3.912 in hyperchaotic control and IAE: 1.623 in robotics), achieving faster convergence, and exhibiting superior disturbance rejection. MTLHRL offers a theoretically grounded and practically viable solution for robust control of complex stochastic systems.


SteerX: Disentangled Steering for LLM Personalization

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications, including intelligent assistants that support users' daily life and work. A critical factor in building such assistants is personalizing LLMs, as user preferences and needs vary widely. Activation steering, which directly leverages directions representing user preference in the LLM activation space to adjust its behavior, offers a cost-effective way to align the model's outputs with individual users. However, existing methods rely on all historical data to compute the steering vector, ignoring that not all content reflects true user preferences, which undermines the personalization signal. To address this, we propose SteerX, a disentangled steering method that isolates preference-driven components from preference-agnostic components. Grounded in causal inference theory, SteerX estimates token-level causal effects to identify preference-driven tokens, transforms these discrete signals into a coherent description, and then leverages them to steer personalized LLM generation. By focusing on the truly preference-driven information, SteerX produces more accurate activation steering vectors and enhances personalization. Experiments on two representative steering backbone methods across real-world datasets demonstrate that SteerX consistently enhances steering vector quality, offering a practical solution for more effective LLM personalization.