Overview
Robust Federated Learning in Unreliable Wireless Networks: A Client Selection Approach
Wang, Yanmeng, Ji, Wenkai, Zhou, Jian, Xiao, Fu, Chang, Tsung-Hui
Federated learning (FL) has emerged as a promising distributed learning paradigm for training deep neural networks (DNNs) at the wireless edge, but its performance can be severely hindered by unreliable wireless transmission and inherent data heterogeneity among clients. Existing solutions primarily address these challenges by incorporating wireless resource optimization strategies, often focusing on uplink resource allocation across clients under the assumption of homogeneous client-server network standards. However, these approaches overlooked the fact that mobile clients may connect to the server via diverse network standards (e.g., 4G, 5G, Wi-Fi) with customized configurations, limiting the flexibility of server-side modifications and restricting applicability in real-world commercial networks. This paper presents a novel theoretical analysis about how transmission failures in unreliable networks distort the effective label distributions of local samples, causing deviations from the global data distribution and introducing convergence bias in FL. Our analysis reveals that a carefully designed client selection strategy can mitigate biases induced by network unreliability and data heterogeneity . Motivated by this insight, we propose FedCote, a client selection approach that optimizes client selection probabilities without relying on wireless resource scheduling. Experimental results demonstrate the robustness of FedCote in DNN-based classification tasks under unreliable networks with frequent transmission failures. With rapid advancements in mobile communications and artificial intelligence (AI), edge AI, which leverages locally generated data to train deep neural networks (DNNs) at the wireless edge, has gained significant attention from both academia and industry [1], [2], [3], [4]. A prominent approach in this domain is federated learning (FL), where an edge server coordinates mobile clients in collaboratively training a shared DNN model while ensuring client privacy [5], [6], [7]. However, FL faces a critical challenge due to ubiquitous data heterogeneity across clients, where training data are distributed in a non-i.i.d. and unbalanced manner. If not addressed, data heterogeneity can severely degrade FL performance [8], [9], [10], [11], [12]. Numerous FL algorithms have been proposed to mitigate this issue. For example, FedProx [13] introduced a regularization term in the local objective function to control model divergence, while SCAFFOLD [14] employed control variates to correct local model drift. HFMDS [15] learned essential class-relevant features of real samples to generate an auxiliary synthetic dataset, which was shared among clients for local training, helping to alleviate data heterogeneity .
Inverse Materials Design by Large Language Model-Assisted Generative Framework
Hao, Yun, Fan, Che, Ye, Beilin, Lu, Wenhao, Lu, Zhen, Zhao, Peilin, Gao, Zhifeng, Wu, Qingyao, Liu, Yanhui, Wen, Tongqi
These authors contributed equally: Y un Hao, Che Fan. Here, we introduce AlloyGAN, a closed-loop framework that integrates Large Language Model (LLM)-assisted text mining with Conditional Generative Adversarial Networks (CGANs) to enhance data diversity and improve inverse design. For metallic glasses, the framework predicts thermodynamic properties with discrepancies of less than 8% from experiments, demonstrating its robustness. By bridging generative AI with domain knowledge and validation workflows, AlloyGAN offers a scalable approach to accelerate the discovery of materials with tailored properties, paving the way for broader applications in materials science. Materials design typically involves two fundamental problems: forward and inverse problems. The forward problem focuses on understanding the relationship between composition, processing conditions, and material properties. This understanding enables researchers to optimize alloy compositions and processing conditions to achieve enhanced performance. Conversely, the inverse problem is more prevalent in material design and poses the question: "Given the desired material properties, what composition and processing conditions are required to achieve them?" The inverse problem is particularly challenging for multi-component materials due to the vast composition space and complex interactions among components. Traditional "trial-and-error" experimental approaches are often prohibitively time-consuming and cost-ineffective [1] for such problems. Addressing these challenges thus requires innovative approaches to efficiently navigate the composition space and identify optimal solutions for materials design.
Recurrent Neural Networks for Dynamic VWAP Execution: Adaptive Trading Strategies with Temporal Kolmogorov-Arnold Networks
The execution of Volume Weighted Average Price (VWAP) orders remains a critical challenge in modern financial markets, particularly as trading volumes and market complexity continue to increase. In my previous work arXiv:2502.13722, I introduced a novel deep learning approach that demonstrated significant improvements over traditional VWAP execution methods by directly optimizing the execution problem rather than relying on volume curve predictions. However, that model was static because it employed the fully linear approach described in arXiv:2410.21448, which is not designed for dynamic adjustment. This paper extends that foundation by developing a dynamic neural VWAP framework that adapts to evolving market conditions in real time. We introduce two key innovations: first, the integration of recurrent neural networks to capture complex temporal dependencies in market dynamics, and second, a sophisticated dynamic adjustment mechanism that continuously optimizes execution decisions based on market feedback. The empirical analysis, conducted across five major cryptocurrency markets, demonstrates that this dynamic approach achieves substantial improvements over both traditional methods and our previous static implementation, with execution performance gains of 10 to 15% in liquid markets and consistent outperformance across varying conditions. These results suggest that adaptive neural architectures can effectively address the challenges of modern VWAP execution while maintaining computational efficiency suitable for practical deployment.
Medical Hallucinations in Foundation Models and Their Impact on Healthcare
Kim, Yubin, Jeong, Hyewon, Chen, Shan, Li, Shuyue Stella, Lu, Mingyu, Alhamoud, Kumail, Mun, Jimin, Grau, Cristina, Jung, Minseok, Gameiro, Rodrigo, Fan, Lizhou, Park, Eugene, Lin, Tristan, Yoon, Joonsik, Yoon, Wonjin, Sap, Maarten, Tsvetkov, Yulia, Liang, Paul, Xu, Xuhai, Liu, Xin, McDuff, Daniel, Lee, Hyeonhoon, Park, Hae Won, Tulebaev, Samir, Breazeal, Cynthia
Foundation Models that are capable of processing and generating multi-modal data have transformed AI's role in medicine. However, a key limitation of their reliability is hallucination, where inaccurate or fabricated information can impact clinical decisions and patient safety. We define medical hallucination as any instance in which a model generates misleading medical content. This paper examines the unique characteristics, causes, and implications of medical hallucinations, with a particular focus on how these errors manifest themselves in real-world clinical scenarios. Our contributions include (1) a taxonomy for understanding and addressing medical hallucinations, (2) benchmarking models using medical hallucination dataset and physician-annotated LLM responses to real medical cases, providing direct insight into the clinical impact of hallucinations, and (3) a multi-national clinician survey on their experiences with medical hallucinations. Our results reveal that inference techniques such as Chain-of-Thought (CoT) and Search Augmented Generation can effectively reduce hallucination rates. However, despite these improvements, non-trivial levels of hallucination persist. These findings underscore the ethical and practical imperative for robust detection and mitigation strategies, establishing a foundation for regulatory policies that prioritize patient safety and maintain clinical integrity as AI becomes more integrated into healthcare. The feedback from clinicians highlights the urgent need for not only technical advances but also for clearer ethical and regulatory guidelines to ensure patient safety. A repository organizing the paper resources, summaries, and additional information is available at https://github.com/mitmedialab/medical hallucination.
Synthetic Categorical Restructuring large Or How AIs Gradually Extract Efficient Regularities from Their Experience of the World
Pichat, Michael, Pogrund, William, Pichat, Paloma, Gasparian, Armanouche, Demarchi, Samuel, Corbet, Martin, Georgeon, Alois, Dasilva, Theo, Veillet-Guillem, Michael
How do language models segment their internal experience of the world of words to progressively learn to interact with it more efficiently? This study in the neuropsychology of artificial intelligence investigates the phenomenon of synthetic categorical restructuring, a process through which each successive perceptron neural layer abstracts and combines relevant categorical sub-dimensions from the thought categories of its previous layer. This process shapes new, even more efficient categories for analyzing and processing the synthetic system's own experience of the linguistic external world to which it is exposed. Our genetic neuron viewer, associated with this study, allows visualization of the synthetic categorical restructuring phenomenon occurring during the transition from perceptron layer 0 to 1 in GPT2-XL.
GeoJEPA: Towards Eliminating Augmentation- and Sampling Bias in Multimodal Geospatial Learning
Lundqvist, Theodor, Delvret, Ludvig
Existing methods for self-supervised representation learning of geospatial regions and map entities rely extensively on the design of pretext tasks, often involving augmentations or heuristic sampling of positive and negative pairs based on spatial proximity. This reliance introduces biases and limits the representations' expressiveness and generalisability. Consequently, the literature has expressed a pressing need to explore different methods for modelling geospatial data. To address the key difficulties of such methods, namely multimodality, heterogeneity, and the choice of pretext tasks, we present GeoJEPA, a versatile multimodal fusion model for geospatial data built on the self-supervised Joint-Embedding Predictive Architecture. With GeoJEPA, we aim to eliminate the widely accepted augmentation- and sampling biases found in self-supervised geospatial representation learning. GeoJEPA uses self-supervised pretraining on a large dataset of OpenStreetMap attributes, geometries and aerial images. The results are multimodal semantic representations of urban regions and map entities that we evaluate both quantitatively and qualitatively. Through this work, we uncover several key insights into JEPA's ability to handle multimodal data.
Between Innovation and Oversight: A Cross-Regional Study of AI Risk Management Frameworks in the EU, U.S., UK, and China
As artificial intelligence (AI) technologies increasingly enter important sectors like healthcare, transportation, and finance, the development of effective governance frameworks is crucial for dealing with ethical, security, and societal risks. This paper conducts a comparative analysis of AI risk management strategies across the European Union (EU), United States (U.S.), United Kingdom (UK), and China. A multi-method qualitative approach, including comparative policy analysis, thematic analysis, and case studies, investigates how these regions classify AI risks, implement compliance measures, structure oversight, prioritize transparency, and respond to emerging innovations. Examples from high-risk contexts like healthcare diagnostics, autonomous vehicles, fintech, and facial recognition demonstrate the advantages and limitations of different regulatory models. The findings show that the EU implements a structured, risk-based framework that prioritizes transparency and conformity assessments, while the U.S. uses decentralized, sector-specific regulations that promote innovation but may lead to fragmented enforcement. The flexible, sector-specific strategy of the UK facilitates agile responses but may lead to inconsistent coverage across domains. China's centralized directives allow rapid large-scale implementation while constraining public transparency and external oversight. These insights show the necessity for AI regulation that is globally informed yet context-sensitive, aiming to balance effective risk management with technological progress. The paper concludes with policy recommendations and suggestions for future research aimed at enhancing effective, adaptive, and inclusive AI governance globally.
Complex Networks for Pattern-Based Data Classification
Chire, Josimar, Mahmood, Khalid, Liang, Zhao
Data classification techniques partition the data or feature space into smaller sub-spaces, each corresponding to a specific class. To classify into subspaces, physical features e.g., distance and distributions are utilized. This approach is challenging for the characterization of complex patterns that are embedded in the dataset. However, complex networks remain a powerful technique for capturing internal relationships and class structures, enabling High-Level Classification. Although several complex network-based classification techniques have been proposed, high-level classification by leveraging pattern formation to classify data has not been utilized. In this work, we present two network-based classification techniques utilizing unique measures derived from the Minimum Spanning Tree and Single Source Shortest Path. These network measures are evaluated from the data patterns represented by the inherent network constructed from each class. We have applied our proposed techniques to several data classification scenarios including synthetic and real-world datasets. Compared to the existing classic high-level and machine-learning classification techniques, we have observed promising numerical results for our proposed approaches. Furthermore, the proposed models demonstrate the following distinguished features in comparison to the previous high-level classification techniques: (1) A single network measure is introduced to characterize the data pattern, eliminating the need to determine weight parameters among network measures. Therefore, the model is largely simplified, while obtaining better classification results. (2) The metrics proposed are sensitive and used for classification with competitive results.
Generative Artificial Intelligence: Evolving Technology, Growing Societal Impact, and Opportunities for Information Systems Research
Storey, Veda C., Yue, Wei Thoo, Zhao, J. Leon, Lukyanenko, Roman
The continuing, explosive developments in generative artificial intelligence (GenAI), built on large language models and related algorithms, has led to much excitement and speculation about the potential impact of this new technology. Claims include AI being poised to revolutionize business and society and dramatically change personal life. However, it remains unclear exactly how this technology, with its significantly distinct features from past AI technologies, has transformative potential. Nor is it clear how researchers in information systems (IS) should respond. In this paper, we consider the evolving and emerging trends of AI in order to examine its present and predict its future impacts. Many existing papers on GenAI are either too technical for most IS researchers or lack the depth needed to appreciate the potential impacts of GenAI. We, therefore, attempt to bridge the technical and organizational communities of GenAI from a system-oriented sociotechnical perspective. Specifically, we explore the unique features of GenAI, which are rooted in the continued change from symbolism to connectionism, and the deep systemic and inherent properties of human-AI ecosystems. We retrace the evolution of AI that proceeded the level of adoption, adaption, and use found today, in order to propose future research on various impacts of GenAI in both business and society within the context of information systems research. Our efforts are intended to contribute to the creation of a well-structured research agenda in the IS community to support innovative strategies and operations enabled by this new wave of AI.
A Collection of Innovations in Medical AI for patient records in 2024
The field of Artificial Intelligence in healthcare is evolving at an unprecedented pace, driven by rapid advancements in machine learning and the recent breakthroughs in large language models. While these innovations hold immense potential to transform clinical decision making, diagnostics, and patient care, the accelerating speed of AI development has outpaced traditional academic publishing cycles. As a result, many scholarly contributions quickly become outdated, failing to capture the latest state of the art methodologies and their real world implications. This paper advocates for a new category of academic publications an annualized citation framework that prioritizes the most recent AI driven healthcare innovations. By systematically referencing the breakthroughs of the year, such papers would ensure that research remains current, fostering a more adaptive and informed discourse. This approach not only enhances the relevance of AI research in healthcare but also provides a more accurate reflection of the fields ongoing evolution.