Goto

Collaborating Authors

 Overview


Generative models for crystalline materials

arXiv.org Artificial Intelligence

Understanding structure-property relationships in materials is fundamental in condensed matter physics and materials science. Over the past few years, machine learning (ML) has emerged as a powerful tool for advancing this understanding and accelerating materials discovery. Early ML approaches primarily focused on constructing and screening large material spaces to identify promising candidates for various applications. More recently, research efforts have increasingly shifted toward generating crystal structures using end-to-end generative models. This review analyzes the current state of generative modeling for crystal structure prediction and \textit{de novo} generation. It examines crystal representations, outlines the generative models used to design crystal structures, and evaluates their respective strengths and limitations. Furthermore, the review highlights experimental considerations for evaluating generated structures and provides recommendations for suitable existing software tools. Emerging topics, such as modeling disorder and defects, integration in advanced characterization, and incorporating synthetic feasibility constraints, are explored. Ultimately, this work aims to inform both experimental scientists looking to adapt suitable ML models to their specific circumstances and ML specialists seeking to understand the unique challenges related to inverse materials design and discovery.


Federated Learning Survey: A Multi-Level Taxonomy of Aggregation Techniques, Experimental Insights, and Future Frontiers

arXiv.org Artificial Intelligence

The integration of IoT and AI has unlocked innovation across industries, but growing privacy concerns and data isolation hinder progress. Traditional centralized ML struggles to overcome these challenges, which has led to the rise of Federated Learning (FL), a decentralized paradigm that enables collaborative model training without sharing local raw data. FL ensures data privacy, reduces communication overhead, and supports scalability, yet its heterogeneity adds complexity compared to centralized approaches. This survey focuses on three main FL research directions: personalization, optimization, and robustness, offering a structured classification through a hybrid methodology that combines bibliometric analysis with systematic review to identify the most influential works. We examine challenges and techniques related to heterogeneity, efficiency, security, and privacy, and provide a comprehensive overview of aggregation strategies, including architectures, synchronization methods, and diverse federation objectives. To complement this, we discuss practical evaluation approaches and present experiments comparing aggregation methods under IID and non-IID data distributions. Finally, we outline promising research directions to advance FL, aiming to guide future innovation in this rapidly evolving field.


MATCH: Engineering Transparent and Controllable Conversational XAI Systems through Composable Building Blocks

arXiv.org Artificial Intelligence

While the increased integration of AI technologies into interactive systems enables them to solve an increasing number of tasks, the black-box problem of AI models continues to spread throughout the interactive system as a whole. Explainable AI (XAI) techniques can make AI models more accessible by employing post-hoc methods or transitioning to inherently interpretable models. While this makes individual AI models clearer, the overarching system architecture remains opaque. This challenge not only pertains to standard XAI techniques but also to human examination and conversational XAI approaches that need access to model internals to interpret them correctly and completely. To this end, we propose conceptually representing such interactive systems as sequences of structural building blocks. These include the AI models themselves, as well as control mechanisms grounded in literature. The structural building blocks can then be explained through complementary explanatory building blocks, such as established XAI techniques like LIME and SHAP. The flow and APIs of the structural building blocks form an unambiguous overview of the underlying system, serving as a communication basis for both human and automated agents, thus aligning human and machine interpretability of the embedded AI models. In this paper, we present our flow-based approach and a selection of building blocks as MATCH: a framework for engineering Multi-Agent Transparent and Controllable Human-centered systems. This research contributes to the field of (conversational) XAI by facilitating the integration of interpretability into existing interactive systems.


On the Complexity of the Grounded Semantics for Infinite Argumentation Frameworks

arXiv.org Artificial Intelligence

Over the past three decades, formal argumentation has established itself as a prominent research area within Artificial Intelligence, owing to its versatility in addressing various reasoning tasks. These include nonmonotonic reasoning, multi-agent systems, rule-based systems, and the analysis of debates or dialogues. Formal argumentation provides a unifying framework for representing diverse reasoning approaches, ranging from highly skeptical to more permissive forms of inference (for a comprehensive introduction to this area, see the handbook [4]). At the heart of formal argumentation lies Dung's abstract argumentation frameworks (AFs) [15], which are modeled as directed graphs, where nodes correspond to arguments, and directed edges represent the attack relations between them. AFs serve as a common foundational core across various reasoning systems in formal argumentation, with many extensions and refinements, e.g.


RELiQ: Scalable Entanglement Routing via Reinforcement Learning in Quantum Networks

arXiv.org Artificial Intelligence

Quantum networks are becoming increasingly important because of advancements in quantum computing and quantum sensing, such as recent developments in distributed quantum computing and federated quantum machine learning. Routing entanglement in quantum networks poses several fundamental as well as technical challenges, including the high dynamicity of quantum network links and the probabilistic nature of quantum operations. Consequently, designing hand-crafted heuristics is difficult and often leads to suboptimal performance, especially if global network topology information is unavailable. In this paper, we propose RELiQ, a reinforcement learning-based approach to entanglement routing that only relies on local information and iterative message exchange. Utilizing a graph neural network, RELiQ learns graph representations and avoids overfitting to specific network topologies - a prevalent issue for learning-based approaches. Our approach, trained on random graphs, consistently outperforms existing local information heuristics and learning-based approaches when applied to random and real-world topologies. When compared to global information heuristics, our method achieves similar or superior performance because of its rapid response to topology changes.


DeXposure: A Dataset and Benchmarks for Inter-protocol Credit Exposure in Decentralized Financial Networks

arXiv.org Artificial Intelligence

A new measure, value-linked credit exposure between protocols, is defined as the inferred financial dependency relationships derived from changes in Total Value Locked (TVL). We develop a token-to-protocol model using DefiLlama metadata to infer inter-protocol credit exposure from the token's stock dynamics, as reported by the protocols. Based on the curated dataset, we develop three benchmarks for machine learning research with financial applications: (1) graph clustering for global network measurement, tracking the structural evolution of credit exposure networks, (2) vector autoregression for sector-level credit exposure dynamics during major shocks (Terra and FTX), and (3) temporal graph neural networks for dynamic link prediction on temporal graphs. From the analysis, we observe (1) a rapid growth of network volume, (2) a trend of concentration to key protocols, (3) a decline of network density (the ratio of actual connections to possible connections), and (4) distinct shock propagation across sectors, such as lending platforms, trading exchanges, and asset management protocols. The DeXposure dataset and code have been released publicly. We envision they will help with research and practice in machine learning as well as financial risk monitoring, policy analysis, DeFi market modeling, amongst others. The dataset also contributes to machine learning research by offering benchmarks for graph clustering, vector autoregres-sion, and temporal graph analysis.


Online Dynamic Pricing of Complementary Products

arXiv.org Artificial Intelligence

Traditional pricing paradigms, once dominated by static models and rule-based heuristics, are increasingly being replaced by dynamic, data-driven approaches powered by machine learning algorithms. Despite their growing sophistication, most dynamic pricing algorithms focus on optimizing the price of each product independently, disregarding potential interactions among items. By neglecting these interdependencies in consumer demand across related goods, sellers may fail to capture the full potential of coordinated pricing strategies. In this paper, we address this problem by exploring dynamic pricing mechanisms designed explicitly for complementary products, aiming to exploit their joint demand structure to maximize overall revenue. We present an online learning algorithm considering both positive and negative interactions between products' demands. The algorithm utilizes transaction data to identify advantageous complementary relationships through an integer programming problem between different items, and then optimizes pricing strategies using data-driven and computationally efficient multi-armed bandit solutions based on heteroscedastic Gaussian processes. We validate our solution in a simulated environment, and we demonstrate that our solution improves the revenue w.r.t. a comparable learning algorithm ignoring such interactions.


C$^2$DLM: Causal Concept-Guided Diffusion Large Language Models

arXiv.org Artificial Intelligence

Autoregressive (AR) language models and Diffusion Language Models (DLMs) constitute the two principal paradigms of large language models. However, both paradigms suffer from insufficient reasoning capabilities. Human reasoning inherently relies on causal knowledge and thought, which are reflected in natural language. But in the AR paradigm, language is modeled as next token prediction (a strictly left-to-right, token-by-token order), whereas natural language itself exhibits more flexible causal structures. In the DLM paradigm, the attention mechanism is fully connected, which entirely disregards causal order. To fill this gap, we propose a \underline{\textbf{C}}ausal \underline{\textbf{C}}oncept-Guided \underline{\textbf{D}}iffusion \underline{\textbf{L}}anguage \underline{\textbf{M}}odel (C$^2$DLM). Starting from DLM's fully connected attention, C$^2$DLM first obtains a concept-level causal graph from the teacher model, and then explicitly guides attention to learn causal relationships between concepts. By focusing on causal relationships and avoiding interference from difficult subgoals involving causal inversion, C$^2$DLM improves 12\% with about 3.2 times training speedup in the COT-OrderPerturb task, and achieves an average gain of 1.31\% across six downstream reasoning tasks. More details in the repository ~\href{https://github.com/Kairong-Han/C-2-DLM}{here}.


Energy Efficient Sleep Mode Optimization in 5G mmWave Networks via Multi Agent Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Dynamic sleep mode optimization (SMO) in millimeter-wave (mmWave) networks is essential for maximizing energy efficiency (EE) under stringent quality-of-service (QoS) constraints. However, existing optimization and reinforcement learning (RL)-based approaches rely on aggregated, static base station (BS) traffic models that fail to capture non-stationary traffic dynamics and suffer from prohibitively large state-action spaces, limiting their real-world deployment. To address these challenges, this paper proposes a Multi-Agent Deep Reinforcement Learning (MARL) framework employing a Double Deep Q-Network (DDQN), referred to as MARL-DDQN, for adaptive SMO in a 3D urban environment using a time-varying and community-based user equipment (UE) mobility model. Unlike conventional single-agent RL, the proposed MARL-DDQN enables scalable, distributed decision-making with minimal signaling overhead. A realistic BS power consumption model and beamforming are integrated to accurately quantify EE, while QoS is uniquely defined in terms of throughput. The proposed method adaptively learns SMO policies to maximize EE while mitigating inter-cell interference and ensuring throughput fairness. Extensive simulations demonstrate that MARL-DDQN consistently outperforms state-of-the-art SM strategies, including the All On, iterative QoS-aware load-based (IT-QoS-LB), MARL-DDPG, and MARL-PPO, achieving up to 0. 60 Mbit/Joule EE, 8. 5 Mbps 10th-percentile throughput, and satisfying QoS constraints 95 % of the time under dynamic network scenarios. I. Introduction The exponential growth in mobile data demand has necessitated increased spectrum availability and accelerated the expansion of cellular network infrastructure. To address the limitations of the sub-6 GHz spectrum, millimeter wave (mmWave) communications, operating within the 30-300 GHz band, have emerged as a key enabler in fifth-generation (5G) networks. With significantly larger bandwidth availability, mmWave technology presents a viable solution to spectrum scarcity challenges [1]. However, mmWave signals suffer from high propagation loss, atmospheric absorption, and susceptibility to blockages, which severely limit coverage and reliability. To address coverage and growing capacity demands, 5G networks rely on densification, deploying numerous low-power mmWave BSs with inter-site distances of a few hundred meters [1]. These BSs utilize large antenna arrays to enable beamforming and spatial multiplexing, often relying on hybrid analog-digital precoding to reduce hardware complexity [2]. However, the RF chain remains a major source of power consumption, particularly the Analog-to-digital converters (ADCs) and digital-to-analog converters (DACs), whose power scales with sampling rate. Due to the higher frequencies and wider bandwidths of mmWave systems, these components require significantly higher sampling rates than sub-6 GHz systems [3], resulting in substantial energy demands.


Exploring Fusion Strategies for Multimodal Vision-Language Systems

arXiv.org Artificial Intelligence

Modern machine learning models often combine multiple input streams of data to more accurately capture the information that informs their decisions. In multimodal machine learning, choosing the strategy for fusing data together requires careful consideration of the application's accuracy and latency requirements, as fusing the data at earlier or later stages in the model architecture can lead to performance changes in accuracy and latency. T o demonstrate this trade-off, we investigate different fusion strategies using a hybrid BERT and vision network framework that integrates image and text data. W e explore two different vision networks: MobileNetV2 and ViT. W e propose three models for each vision network, which fuse data at late, intermediate, and early stages in the architecture. W e evaluate the proposed models on the CMU-MOSI dataset and benchmark their latency on an NVIDIA Jetson Orin AGX. Our experimental results demonstrate that while late fusion yields the highest accuracy, early fusion offers the lowest inference latency. W e describe the three proposed model architectures and discuss the accuracy and latency trade-offs, concluding that data fusion earlier in the model architecture results in faster inference times at the cost of accuracy.