Overview
Stochastic forest transition model dynamics and parameter estimation via deep learning
Kumabe, Satoshi, Song, Tianyu, Ta, Ton Viet
Forest transitions, characterized by dynamic shifts between forest, agricultural, and abandoned lands, are complex phenomena. This study developed a stochastic differential equation model to capture the intricate dynamics of these transitions. We established the existence of global positive solutions for the model and conducted numerical analyses to assess the impact of model parameters on deforestation incentives. To address the challenge of parameter estimation, we proposed a novel deep learning approach that estimates all model parameters from a single sample containing time-series observations of forest and agricultural land proportions. This innovative approach enables us to understand forest transition dynamics and deforestation trends at any future time.
Quantum Geometry of Data
Abanov, Alexander G., Candelori, Luca, Steinacker, Harold C., Wells, Martin T., Busemeyer, Jerome R., Hogan, Cameron J., Kirakosyan, Vahagn, Marzari, Nicola, Pinnamaneni, Sunil, Villani, Dario, Xu, Mengjia, Musaelian, Kharen
We demonstrate how Quantum Cognition Machine Learning (QCML) encodes data as quantum geometry. In QCML, features of the data are represented by learned Hermitian matrices, and data points are mapped to states in Hilbert space. The quantum geometry description endows the dataset with rich geometric and topological structure - including intrinsic dimension, quantum metric, and Berry curvature - derived directly from the data. QCML captures global properties of data, while avoiding the curse of dimensionality inherent in local methods. We illustrate this on a number of synthetic and real-world examples. Quantum geometric representation of QCML could advance our understanding of cognitive phenomena within the framework of quantum cognition.
Fine-Tuning Code Language Models to Detect Cross-Language Bugs
Li, Zengyang, Li, Yimeng, Huang, Binbin, Liang, Peng, Mo, Ran, Liu, Hui, Ma, Yutao
Multilingual programming, which involves using multiple programming languages (PLs) in a single project, is increasingly common due to its benefits. However, it introduces cross-language bugs (CLBs), which arise from interactions between different PLs and are difficult to detect by single-language bug detection tools. This paper investigates the potential of pre-trained code language models (CodeLMs) in CLB detection. We developed CLCFinder, a cross-language code identification tool, and constructed a CLB dataset involving three PL combinations (Python-C/C++, Java-C/C++, and Python-Java) with nine interaction types. We fine-tuned 13 CodeLMs on this dataset and evaluated their performance, analyzing the effects of dataset size, token sequence length, and code comments. Results show that all CodeLMs performed poorly before fine-tuning, but exhibited varying degrees of performance improvement after fine-tuning, with UniXcoder-base achieving the best F1 score (0.7407). Notably, small fine-tuned CodeLMs tended to performe better than large ones. CodeLMs fine-tuned on single-language bug datasets performed poorly on CLB detection, demonstrating the distinction between CLBs and single-language bugs. Additionally, increasing the fine-tuning dataset size significantly improved performance, while longer token sequences did not necessarily improve the model performance. The impact of code comments varied across models. Some fine-tuned CodeLMs' performance was improved, while others showed degraded performance.
Evaluation and Benchmarking of LLM Agents: A Survey
Mohammadi, Mahmoud, Li, Yipeng, Lo, Jane, Yip, Wendy
The rise of LLM-based agents has opened new frontiers in AI applications, yet evaluating these agents remains a complex and underdeveloped area. This survey provides an in-depth overview of the emerging field of LLM agent evaluation, introducing a two-dimensional taxonomy that organizes existing work along (1) evaluation objectives -- what to evaluate, such as agent behavior, capabilities, reliability, and safety -- and (2) evaluation process -- how to evaluate, including interaction modes, datasets and benchmarks, metric computation methods, and tooling. In addition to taxonomy, we highlight enterprise-specific challenges, such as role-based access to data, the need for reliability guarantees, dynamic and long-horizon interactions, and compliance, which are often overlooked in current research. We also identify future research directions, including holistic, more realistic, and scalable evaluation. This work aims to bring clarity to the fragmented landscape of agent evaluation and provide a framework for systematic assessment, enabling researchers and practitioners to evaluate LLM agents for real-world deployment.
Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning
Bao, Wenxuan, Deng, Ruxi, Qiu, Ruizhong, Wei, Tianxin, Tong, Hanghang, He, Jingrui
Test-time adaptation with pre-trained vision-language models has gained increasing attention for addressing distribution shifts during testing. Among these approaches, memory-based algorithms stand out due to their training-free nature and ability to leverage historical test data. However, existing test-time adaptation methods are typically designed for a single domain with abundant data. In decentralized settings such as federated learning, applying these methods individually to each client suffers from limited test data, while directly sharing a single global memory via the server prevents proper personalization to each client's unique distribution. To address this, we propose Latte, a novel framework where each client maintains a local memory to store embeddings from its own historical test data and an external memory to store class prototypes from other relevant clients. During communication, each client retrieves prototypes from similar clients under the server's coordination to expand its memory. For local adaptation, Latte utilizes both embedding similarity and uncertainty to enhance model performance. Our theoretical analysis shows that Latte effectively leverages in-distribution clients while remaining robust to out-of-distribution clients. Extensive experiments on domain adaptation and corruption benchmarks validate that Latte achieves superior performance in decentralized settings, while introducing only negligible communication and computation costs. Our code is available at https://github.com/baowenxuan/Latte .
Evo-DKD: Dual-Knowledge Decoding for Autonomous Ontology Evolution in Large Language Models
Raman, Vishal, R, Vijai Aravindh
Ontologies and knowledge graphs require continuous evolution to remain comprehensive and accurate, but manual curation is labor intensive. Large Language Models (LLMs) possess vast unstructured knowledge but struggle with maintaining structured consistency. We propose Evo-DKD, a novel dual-decoder framework for autonomous ontology evolution that combines structured ontology traversal with unstructured text reasoning. Evo-DKD introduces two parallel decoding streams within an LLM: one decoder generates candidate ontology edits (e.g., new concepts or relations) while the other produces natural-language justifications. A dynamic attention-based gating mechanism coordinates the two streams, deciding at each step how to blend structured and unstructured knowledge. Due to GPU constraints, we simulate the dual-decoder behavior using prompt-based mode control to approximate coordinated decoding in a single-stream mode. The system operates in a closed reasoning loop: proposed ontology edits are validated (via consistency checks and cross-verification with the text explanations) and then injected into the knowledge base, which in turn informs subsequent reasoning. We demonstrate Evo-DKD's effectiveness on use cases including healthcare ontology refinement, semantic search improvement, and cultural heritage timeline modeling. Experiments show that Evo-DKD outperforms baselines using structured-only or unstructured-only decoding in both precision of ontology updates and downstream task performance. We present quantitative metrics and qualitative examples, confirming the contributions of the dual-decoder design and gating router. Evo-DKD offers a new paradigm for LLM-driven knowledge base maintenance, combining the strengths of symbolic and neural reasoning for sustainable ontology evolution.
Evaluating Deep Learning Models for African Wildlife Image Classification: From DenseNet to Vision Transformers
Aliyu, Lukman Jibril, Muhammad, Umar Sani, Ismail, Bilqisu, Muhammad, Nasiru, Wakili, Almustapha A, Yimam, Seid Muhie, Muhammad, Shamsuddeen Hassan, Abdullahi, Mustapha
In response, image classification using deep learning has emerged as a promising tool for biodiversity monitoring and conservation. This paper presents a comparative study of deep learning models for automatically classifying African wildlife images, focusing on transfer learning with frozen feature extractors. Using a public dataset of four species: buffalo, elephant, rhinoceros, and zebra; we evaluate the performance of DenseNet-201, ResNet-152, EfficientNet-B4, and Vision Transformer ViT - H/14. DenseNet-201 achieved the best performance among convo-lutional networks (67% accuracy), while ViT -H/14 achieved the highest overall accuracy (99%), but with significantly higher computational cost, raising deployment concerns. Our experiments highlight the trade-offs between accuracy, resource requirements, and deployability. The best-performing CNN (DenseNet-201) was integrated into a Hugging Face Gradio Space for real-time field use, demonstrating the feasibility of deploying lightweight models in conservation settings. This work contributes to African-grounded AI research by offering practical insights into model selection, dataset preparation, and responsible deployment of deep learning tools for wildlife conservation.
Agentic Web: Weaving the Next Web with AI Agents
Yang, Yingxuan, Ma, Mulei, Huang, Yuxuan, Chai, Huacan, Gong, Chenyu, Geng, Haoran, Zhou, Yuanjian, Wen, Ying, Fang, Meng, Chen, Muhao, Gu, Shangding, Jin, Ming, Spanos, Costas, Yang, Yang, Abbeel, Pieter, Song, Dawn, Zhang, Weinan, Wang, Jun
The emergence of AI agents powered by large language models (LLMs) marks a pivotal shift toward the Agentic Web, a new phase of the internet defined by autonomous, goal-driven interactions. In this paradigm, agents interact directly with one another to plan, coordinate, and execute complex tasks on behalf of users. This transition from human-driven to machine-to-machine interaction allows intent to be delegated, relieving users from routine digital operations and enabling a more interactive, automated web experience. In this paper, we present a structured framework for understanding and building the Agentic Web. We trace its evolution from the PC and Mobile Web eras and identify the core technological foundations that support this shift. Central to our framework is a conceptual model consisting of three key dimensions: intelligence, interaction, and economics. These dimensions collectively enable the capabilities of AI agents, such as retrieval, recommendation, planning, and collaboration. We analyze the architectural and infrastructural challenges involved in creating scalable agentic systems, including communication protocols, orchestration strategies, and emerging paradigms such as the Agent Attention Economy. We conclude by discussing the potential applications, societal risks, and governance issues posed by agentic systems, and outline research directions for developing open, secure, and intelligent ecosystems shaped by both human intent and autonomous agent behavior. A continuously updated collection of relevant studies for agentic web is available at: https://github.com/SafeRL-Lab/agentic-web.
Interpretable Anomaly-Based DDoS Detection in AI-RAN with XAI and LLMs
Chatzimiltis, Sotiris, Shojafar, Mohammad, Mashhadi, Mahdi Boloursaz, Tafazolli, Rahim
Next generation Radio Access Networks (RANs) introduce programmability, intelligence, and near real-time control through intelligent controllers, enabling enhanced security within the RAN and across broader 5G/6G infrastructures. This paper presents a comprehensive survey highlighting opportunities, challenges, and research gaps for Large Language Models (LLMs)-assisted explainable (XAI) intrusion detection (IDS) for secure future RAN environments. Motivated by this, we propose an LLM interpretable anomaly-based detection system for distributed denial-of-service (DDoS) attacks using multivariate time series key performance measures (KPMs), extracted from E2 nodes, within the Near Real-Time RAN Intelligent Controller (Near-RT RIC). An LSTM-based model is trained to identify malicious User Equipment (UE) behavior based on these KPMs. To enhance transparency, we apply post-hoc local explainability methods such as LIME and SHAP to interpret individual predictions. Furthermore, LLMs are employed to convert technical explanations into natural-language insights accessible to non-expert users. Experimental results on real 5G network KPMs demonstrate that our framework achieves high detection accuracy (F1-score > 0.96) while delivering actionable and interpretable outputs.
Handling Out-of-Distribution Data: A Survey
Tamang, Lakpa, Bouadjenek, Mohamed Reda, Dazeley, Richard, Aryal, Sunil
In the field of Machine Learning (ML) and data-driven applications, one of the significant challenge is the change in data distribution between the training and deployment stages, commonly known as distribution shift. This paper outlines different mechanisms for handling two main types of distribution shifts: (i) Covariate shift: where the value of features or covariates change between train and test data, and (ii) Concept/Semantic-shift: where model experiences shift in the concept learned during training due to emergence of novel classes in the test phase. We sum up our contributions in three folds. First, we formalize distribution shifts, recite on how the conventional method fails to handle them adequately and urge for a model that can simultaneously perform better in all types of distribution shifts. Second, we discuss why handling distribution shifts is important and provide an extensive review of the methods and techniques that have been developed to detect, measure, and mitigate the effects of these shifts. Third, we discuss the current state of distribution shift handling mechanisms and propose future research directions in this area. Overall, we provide a retrospective synopsis of the literature in the distribution shift, focusing on OOD data that had been overlooked in the existing surveys.