AITopics

Machine learning has achieved remarkable successes, yet its deployment in safety-critical domains remains hindered by an inherent inability to manage uncertainty, resulting in overconfident and unreliable predictions when models encounter out-of-distribution data, adversarial perturbations, or naturally fluctuating environments. This thesis, titled Epistemic Deep Learning: Enabling Machine Learning Models to 'Know When They Do Not Know', addresses these critical challenges by advancing the paradigm of Epistemic Artificial Intelligence, which explicitly models and quantifies epistemic uncertainty: the uncertainty arising from limited, biased, or incomplete training data, as opposed to the irreducible randomness of aleatoric uncertainty, thereby empowering models to acknowledge their limitations and refrain from overconfident decisions when uncertainty is high. Central to this work is the development of the Random-Set Neural Network (RS-NN), a novel methodology that leverages random set theory to predict belief functions over sets of classes, capturing the extent of epistemic uncertainty through the width of associated credal sets, applications of RS-NN, including its adaptation to Large Language Models (LLMs) and its deployment in weather classification for autonomous racing. In addition, the thesis proposes a unified evaluation framework for uncertainty-aware classifiers. Extensive experiments validate that integrating epistemic awareness into deep learning not only mitigates the risks associated with overconfident predictions but also lays the foundation for a paradigm shift in artificial intelligence, where the ability to 'know when it does not know' becomes a hallmark of robust and dependable systems. The title encapsulates the core philosophy of this work, emphasizing that true intelligence involves recognizing and managing the limits of one's own knowledge.

large language model, machine learning, second-order uncertainty representation, (20 more...)

doi: 10.24384/G4DC-0E02

2510.22261

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (0.92)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Education (0.92)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gardaa, Odin Hoff, Blaser, Nello

RotaTouille: Rotation Equivariant Deep Learning for Contours

Contours or closed planar curves are common in many domains. For example, they appear as object boundaries in computer vision, isolines in meteorology, and the orbits of rotating machinery. In many cases when learning from contour data, planar rotations of the input will result in correspondingly rotated outputs. It is therefore desirable that deep learning models be rotationally equivariant. In addition, contours are typically represented as an ordered sequence of edge points, where the choice of starting point is arbitrary. It is therefore also desirable for deep learning methods to be equivariant under cyclic shifts. We present RotaTouille, a deep learning framework for learning from contour data that achieves both rotation and cyclic shift equivariance through complex-valued circular convolution. We further introduce and characterize equivariant non-linearities, coarsening layers, and global pooling layers to obtain invariant representations for downstream tasks. Finally, we demonstrate the effectiveness of RotaTouille through experiments in shape classification, reconstruction, and contour regression.

artificial intelligence, contour, machine learning, (16 more...)

2508.16359

Country: Asia (0.28)

Genre:

Research Report (0.50)
Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zhang, Zongshun, Matta, Ibrahim

Rethinking Inference Placement for Deep Learning across Edge and Cloud Platforms: A Multi-Objective Optimization Perspective and Future Directions

Edge intelligent applications like VR/AR and language model based chatbots have become widespread with the rapid expansion of IoT and mobile devices. However, constrained edge devices often cannot serve the increasingly large and complex deep learning (DL) models. To mitigate these challenges, researchers have proposed optimizing and offloading partitions of DL models among user devices, edge servers, and the cloud. In this setting, users can take advantage of different services to support their intelligent applications. For example, edge resources offer low response latency. In contrast, cloud platforms provide low monetary cost computation resources for computation-intensive workloads. However, communication between DL model partitions can introduce transmission bottlenecks and pose risks of data leakage. Recent research aims to balance accuracy, computation delay, transmission delay, and privacy concerns. They address these issues with model compression, model distillation, transmission compression, and model architecture adaptations, including internal classifiers. This survey contextualizes the state-of-the-art model offloading methods and model adaptation techniques by studying their implication to a multi-objective optimization comprising inference latency, data privacy, and resource monetary cost.

artificial intelligence, machine learning, natural language, (19 more...)

2510.22909

Country:

Asia (0.92)
Europe (0.67)
North America > United States > California (0.45)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zeng, Yugong, Wu, Jonathan

A Review of End-to-End Precipitation Prediction Using Remote Sensing Data: from Divination to Machine Learning

Precipitation prediction has undergone a profound transformation -- from early symbolic and empirical methods rooted in divination and observation, to modern technologies based on atmospheric physics and artificial intelligence. This review traces the historical and technological evolution of precipitation forecasting, presenting a survey about end-to-end precipitation prediction technologies that spans ancient practices, the foundations of meteorological science, the rise of numerical weather prediction (NWP), and the emergence of machine learning (ML) and deep learning (DL) models. We first explore traditional and indigenous forecasting methods, then describe the development of physical modeling and statistical frameworks that underpin contemporary operational forecasting. Particular emphasis is placed on recent advances in neural network-based approaches, including automated deep learning, interpretability-driven design, and hybrid physical-data models. By compositing research across multiple eras and paradigms, this review not only depicts the history of end-to-end precipitation prediction but also outlines future directions in next generation forecasting systems.

artificial intelligence, machine learning, prediction, (17 more...)

2510.22855

Country:

North America > United States (1.00)
Europe (1.00)
Asia > China (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models

Xiong, Lexiang, Liu, Chengyu, Ye, Jingwen, Liu, Yan, Xu, Yuecong

Concept erasure in text-to-image diffusion models is crucial for mitigating harmful content, yet existing methods often compromise generative quality. We introduce Semantic Surgery, a novel training-free, zero-shot framework for concept erasure that operates directly on text embeddings before the diffusion process. It dynamically estimates the presence of target concepts in a prompt and performs a calibrated vector subtraction to neutralize their influence at the source, enhancing both erasure completeness and locality. The framework includes a Co-Occurrence Encoding module for robust multi-concept erasure and a visual feedback loop to address latent concept persistence. As a training-free method, Semantic Surgery adapts dynamically to each prompt, ensuring precise interventions. Extensive experiments on object, explicit content, artistic style, and multi-celebrity erasure tasks show our method significantly outperforms state-of-the-art approaches. We achieve superior completeness and robustness while preserving locality and image quality (e.g., 93.58 H-score in object erasure, reducing explicit content to just 1 instance, and 8.09 H_a in style erasure with no quality degradation). This robustness also allows our framework to function as a built-in threat detection system, offering a practical solution for safer text-to-image generation.

large language model, machine learning, natural language, (19 more...)

2510.22851

Country: Europe > Switzerland (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.65)

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Critical Insights into Leading Conversational AI Models

Kohli, Urja, Singh, Aditi, Sharma, Arun

Big Language Models (LLMs) are changing the way businesses use software, the way people live their lives and the way industries work. Companies like Google, High-Flyer, Anthropic, OpenAI and Meta are making better LLMs. So, it's crucial to look at how each model is different in terms of performance, moral behaviour and usability, as these differences are based on the different ideas that built them. This study compares five top LLMs: Google's Gemini, High-Flyer's DeepSeek, Anthropic's Claude, OpenAI's GPT models and Meta's LLaMA. It performs this by analysing three important factors: Performance and Accuracy, Ethics and Bias Mitigation and Usability and Integration. It was found that Claude has good moral reasoning, Gemini is better at multimodal capabilities and has strong ethical frameworks. DeepSeek is great at reasoning based on facts, LLaMA is good for open applications and ChatGPT delivers balanced performance with a focus on usage. It was concluded that these models are different in terms of how well they work, how easy they are to use and how they treat people ethically, making it a point that each model should be utilised by the user in a way that makes the most of its strengths.

large language model, machine learning, natural language, (20 more...)

2510.22729

Country:

North America (0.28)
Asia > India (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Energy (0.95)
Health & Medicine > Therapeutic Area (0.68)
Education > Educational Technology (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

Yadav, Yasharth, Xia, Kelin

A roadmap for curvature-based geometric data analysis and learning

Geometric data analysis and learning has emerged as a distinct and rapidly developing research area, increasingly recognized for its effectiveness across diverse applications. At the heart of this field lies curvature, a powerful and interpretable concept that captures intrinsic geometric structure and underpins numerous tasks, from community detection to geometric deep learning. A wide range of discrete curvature models have been proposed for various data representations, including graphs, simplicial complexes, cubical complexes, and point clouds sampled from manifolds. These models not only provide efficient characterizations of data geometry but also constitute essential components in geometric learning frameworks. In this paper, we present the first comprehensive review of existing discrete curvature models, covering their mathematical foundations, computational formulations, and practical applications in data analysis and learning. In particular, we discuss discrete curvature from both Riemannian and metric geometry perspectives and propose a systematic pipeline for curvature-driven data analysis. We further examine the corresponding computational algorithms across different data representations, offering detailed comparisons and insights. Finally, we review state-of-the-art applications of curvature in both supervised and unsupervised learning. This survey provides a conceptual and practical roadmap for researchers to gain a better understanding of discrete curvature as a fundamental tool for geometric understanding and learning.

curvature, data mining, machine learning, (19 more...)

2510.22599

Country:

North America > United States (0.28)
Asia > Middle East > Republic of Türkiye (0.28)

Genre: Overview > Innovation (0.34)

Industry:

Health & Medicine (1.00)
Education (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Chhetri, Pranup, Torrejon, Alejandro, Eslava, Sergio, Manso, Luis J.

A short methodological review on social robot navigation benchmarking

Social Robot Navigation is the skill that allows robots to move efficiently in human-populated environments while ensuring safety, comfort, and trust. Unlike other areas of research, the scientific community has not yet achieved an agreement on how Social Robot Navigation should be benchmarked. This is notably important, as the lack of a de facto standard to benchmark Social Robot Navigation can hinder the progress of the field and may lead to contradicting conclusions. Motivated by this gap, we contribute with a short review focused exclusively on benchmarking trends in the period from January 2020 to July 2025. Of the 130 papers identified by our search using IEEE Xplore, we analysed the 85 papers that met the criteria of the review. This review addresses the metrics used in the literature for benchmarking purposes, the algorithms employed in such benchmarks, the use of human surveys for benchmarking, and how conclusions are drawn from the benchmarking results, when applicable.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2510.22448

Country:

North America > United States (0.69)
South America (0.68)
Europe > Portugal (0.46)
(2 more...)

Genre: Overview (0.87)

Industry:

Transportation (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robots in the Home (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Khaniki, Mohammad Ali Labbaf, Taroodi, Fateme, Safizadeh, Benyamin

A Novel Multi-Timescale Stability-Preserving Hierarchical Reinforcement Learning Controller Framework for Adaptive Control in High-Dimensional Dynamical Systems

Controlling high-dimensional stochastic systems, critical in robotics, autonomous vehicles, and hyperchaotic systems, faces the curse of dimensionality, lacks temporal abstraction, and often fails to ensure stochastic stability. To overcome these limitations, this study introduces the Multi-Timescale Lyapunov-Constrained Hierarchical Reinforcement Learning (MTLHRL) framework. MTLHRL integrates a hierarchical policy within a semi-Markov Decision Process (SMDP), featuring a high-level policy for strategic planning and a low-level policy for reactive control, which effectively manages complex, multi-timescale decision-making and reduces dimensionality overhead. Stability is rigorously enforced using a neural Lyapunov function optimized via Lagrangian relaxation and multi-timescale actor-critic updates, ensuring mean-square boundedness or asymptotic stability in the face of stochastic dynamics. The framework promotes efficient and reliable learning through trust-region constraints and decoupled optimization. Extensive simulations on an 8D hyperchaotic system and a 5-DOF robotic manipulator demonstrate MTLHRL's empirical superiority. It significantly outperforms baseline methods in both stability and performance, recording the lowest error indices (e.g., Integral Absolute Error (IAE): 3.912 in hyperchaotic control and IAE: 1.623 in robotics), achieving faster convergence, and exhibiting superior disturbance rejection. MTLHRL offers a theoretically grounded and practically viable solution for robust control of complex stochastic systems.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2510.2242

Country: Europe > United Kingdom (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy (1.00)
Banking & Finance > Trading (0.46)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
(3 more...)

SteerX: Disentangled Steering for LLM Personalization

Zhao, Xiaoyan, Yan, Ming, Qiu, Yilun, Ni, Haoting, Zhang, Yang, Feng, Fuli, Cheng, Hong, Chua, Tat-Seng

Large language models (LLMs) have shown remarkable success in recent years, enabling a wide range of applications, including intelligent assistants that support users' daily life and work. A critical factor in building such assistants is personalizing LLMs, as user preferences and needs vary widely. Activation steering, which directly leverages directions representing user preference in the LLM activation space to adjust its behavior, offers a cost-effective way to align the model's outputs with individual users. However, existing methods rely on all historical data to compute the steering vector, ignoring that not all content reflects true user preferences, which undermines the personalization signal. To address this, we propose SteerX, a disentangled steering method that isolates preference-driven components from preference-agnostic components. Grounded in causal inference theory, SteerX estimates token-level causal effects to identify preference-driven tokens, transforms these discrete signals into a coherent description, and then leverages them to steer personalized LLM generation. By focusing on the truly preference-driven information, SteerX produces more accurate activation steering vectors and enhances personalization. Experiments on two representative steering backbone methods across real-world datasets demonstrate that SteerX consistently enhances steering vector quality, offering a practical solution for more effective LLM personalization.

computational linguistic, large language model, machine learning, (16 more...)

2510.22256

Country:

Asia (0.67)
Europe > Austria (0.28)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)