Goto

Collaborating Authors

 Learning Graphical Models


Computational frame analysis revisited: On LLMs for studying news coverage

arXiv.org Artificial Intelligence

Computational approaches have previously shown various promises and pitfalls when it comes to the reliable identification of media frames. Generative LLMs like GPT and Claude are increasingly being used as content analytical tools, but how effective are they for frame analysis? We address this question by systematically evaluating them against their computational predecessors: bag-of-words models and encoder-only transformers; and traditional manual coding procedures. Our analysis rests on a novel gold standard dataset that we inductively and iteratively developed through the study, investigating six months of news coverage of the US Mpox epidemic of 2022. While we discover some potential applications for generative LLMs, we demonstrate that they were consistently outperformed by manual coders, and in some instances, by smaller language models. Some form of human validation was always necessary to determine appropriate model choice. Additionally, by examining how the suitability of various approaches depended on the nature of different tasks that were part of our frame analytical workflow, we provide insights as to how researchers may leverage the complementarity of these approaches to use them in tandem. We conclude by endorsing a methodologically pluralistic approach and put forth a roadmap for computational frame analysis for researchers going forward.


Q-Learning-Based Time-Critical Data Aggregation Scheduling in IoT

arXiv.org Artificial Intelligence

Time-critical data aggregation in Internet of Things (IoT) networks demands efficient, collision-free scheduling to minimize latency for applications like smart cities and industrial automation. Traditional heuristic methods, with two-phase tree construction and scheduling, often suffer from high computational overhead and suboptimal delays due to their static nature. To address this, we propose a novel Q-learning framework that unifies aggregation tree construction and scheduling, modeling the process as a Markov Decision Process (MDP) with hashed states for scalability. By leveraging a reward function that promotes large, interference-free batch transmissions, our approach dynamically learns optimal scheduling policies. Simulations on static networks with up to 300 nodes demonstrate up to 10.87% lower latency compared to a state-of-the-art heuristic algorithm, highlighting its robustness for delay-sensitive IoT applications. This framework enables timely insights in IoT environments, paving the way for scalable, low-latency data aggregation.


AURA: Adaptive Unified Reasoning and Automation with LLM-Guided MARL for NextG Cellular Networks

arXiv.org Artificial Intelligence

Next-generation (NextG) cellular networks are expected to manage dynamic traffic while sustaining high performance. Large language models (LLMs) provide strategic reasoning for 6G planning, but their computational cost and latency limit real-time use. Multi-agent reinforcement learning (MARL) supports localized adaptation, yet coordination at scale remains challenging. We present AURA, a framework that integrates cloud-based LLMs for high-level planning with base stations modeled as MARL agents for local decision-making. The LLM generates objectives and subgoals from its understanding of the environment and reasoning capabilities, while agents at base stations execute these objectives autonomously, guided by a trust mechanism that balances local learning with external input. To reduce latency, AURA employs batched communication so that agents update the LLM's view of the environment and receive improved feedback. In a simulated 6G scenario, AURA improves resilience, reducing dropped handoff requests by more than half under normal and high traffic and lowering system failures. Agents use LLM input in fewer than 60\% of cases, showing that guidance augments rather than replaces local adaptability, thereby mitigating latency and hallucination risks. These results highlight the promise of combining LLM reasoning with MARL adaptability for scalable, real-time NextG network management.


Feasibility of Embodied Dynamics Based Bayesian Learning for Continuous Pursuit Motion Control of Assistive Mobile Robots in the Built Environment

arXiv.org Artificial Intelligence

Non-invasive electroencephalography (EEG)-based brain-computer interfaces (BCIs) offer an intuitive means for individuals with severe motor impairments to independently operate assistive robotic wheelchairs and navigate built environments. Despite considerable progress in BCI research, most current motion control systems are limited to discrete commands, rather than supporting continuous pursuit, where users can freely adjust speed and direction in real time. Such natural mobility control is, however, essential for wheelchair users to navigate complex public spaces, such as transit stations, airports, hospitals, and indoor corridors, to interact socially with the dynamic populations with agility, and to move flexibly and comfortably as autonomous driving is refined to allow movement at will. In this study, we address the gap of continuous pursuit motion control in BCIs by proposing and validating a brain-inspired Bayesian inference framework, where embodied dynamics in acceleration-based motor representations are decoded. This approach contrasts with conventional kinematics-level decoding and deep learning-based methods. Using a public dataset with sixteen hours of EEG from four subjects performing motor imagery-based target-following, we demonstrate that our method, utilizing Automatic Relevance Determination for feature selection and continual online learning, reduces the normalized mean squared error between predicted and true velocities by 72% compared to autoregressive and EEGNet-based methods in a session-accumulative transfer learning setting. Theoretically, these findings empirically support embodied cognition theory and reveal the brain's intrinsic motor control dynamics in an embodied and predictive nature. Practically, grounding EEG decoding in the same dynamical principles that govern biological motion offers a promising path toward more stable and intuitive BCI control.


ReBaPL: Repulsive Bayesian Prompt Learning

arXiv.org Artificial Intelligence

Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt tuning methods are prone to overfitting and can struggle with out-of-distribution generalization. To address these limitations, Bayesian prompt learning has been proposed, which frames prompt optimization as a Bayesian inference problem to enhance robustness. This paper introduces Repulsive Bayesian Prompt Learning (ReBaPL), a novel method for Bayesian prompt learning, designed to efficiently explore the complex and often multimodal posterior landscape of prompts. Our method integrates a cyclical step-size schedule with a stochastic gradient Hamiltonian Monte Carlo (SGHMC) algorithm, enabling alternating phases of exploration to discover new modes, and exploitation to refine existing modes. Furthermore, we introduce a repulsive force derived from a potential function over probability metrics (including Maximum Mean Discrepancy and Wasserstein distance) computed on the distributions of representations produced by different prompts. This representation-space repulsion diversifies exploration and prevents premature collapse to a single mode. Our approach allows for a more comprehensive characterization of the prompt posterior distribution, leading to improved generalization. In contrast to prior Bayesian prompt learning methods, our method provides a modular plug-and-play Bayesian extension of any existing prompt learning method based on maximum likelihood estimation. We demonstrate the efficacy of ReBaPL on several benchmark datasets, showing superior performance over state-of-the-art methods for prompt learning.


Patient-level Information Extraction by Consistent Integration of Textual and Tabular Evidence with Bayesian Networks

arXiv.org Artificial Intelligence

Electronic health records (EHRs) form an invaluable resource for training clinical decision support systems. To leverage the potential of such systems in high-risk applications, we need large, structured tabular datasets on which we can build transparent feature-based models. While part of the EHR already contains structured information (e.g. diagnosis codes, medications, and lab results), much of the information is contained within unstructured text (e.g. discharge summaries and nursing notes). In this work, we propose a method for multi-modal patient-level information extraction that leverages both the tabular features available in the patient's EHR (using an expert-informed Bayesian network) as well as clinical notes describing the patient's symptoms (using neural text classifiers). We propose the use of virtual evidence augmented with a consistency node to provide an interpretable, probabilistic fusion of the models' predictions. The consistency node improves the calibration of the final predictions compared to virtual evidence alone, allowing the Bayesian network to better adjust the neural classifier's output to handle missing information and resolve contradictions between the tabular and text data. We show the potential of our method on the SimSUM dataset, a simulated benchmark linking tabular EHRs with clinical notes through expert knowledge.


Comparing verbal, visual and combined explanations for Bayesian Network inferences

arXiv.org Artificial Intelligence

Bayesian Networks (BNs) are an important tool for assisting probabilistic reasoning, but despite being considered transparent models, people have trouble understanding them. Further, current User Interfaces (UIs) still do not clarify the reasoning of BNs. To address this problem, we have designed verbal and visual extensions to the standard BN UI, which can guide users through common inference patterns. We conducted a user study to compare our verbal, visual and combined UI extensions, and a baseline UI. Our main findings are: (1) users did better with all three types of extensions than with the baseline UI for questions about the impact of an observation, the paths that enable this impact, and the way in which an observation influences the impact of other observations; and (2) using verbal and visual modalities together is better than using either modality alone for some of these question types.


Hybrid Differential Reward: Combining Temporal Difference and Action Gradients for Efficient Multi-Agent Reinforcement Learning in Cooperative Driving

arXiv.org Artificial Intelligence

In multi-vehicle cooperative driving tasks involving high-frequency continuous control, traditional state-based reward functions suffer from the issue of vanishing reward differences. This phenomenon results in a low signal-to-noise ratio (SNR) for policy gradients, significantly hindering algorithm convergence and performance improvement. To address this challenge, this paper proposes a novel Hybrid Differential Reward (HDR) mechanism. We first theoretically elucidate how the temporal quasi-steady nature of traffic states and the physical proximity of actions lead to the failure of traditional reward signals. Building on this analysis, the HDR framework innovatively integrates two complementary components: (1) a Temporal Difference Reward (TRD) based on a global potential function, which utilizes the evolutionary trend of potential energy to ensure optimal policy invariance and consistency with long-term objectives; and (2) an Action Gradient Reward (ARG), which directly measures the marginal utility of actions to provide a local guidance signal with a high SNR. Furthermore, we formulate the cooperative driving problem as a Multi-Agent Partially Observable Markov Game (POMDPG) with a time-varying agent set and provide a complete instantiation scheme for HDR within this framework. Extensive experiments conducted using both online planning (MCTS) and Multi-Agent Reinforcement Learning (QMIX, MAPPO, MADDPG) algorithms demonstrate that the HDR mechanism significantly improves convergence speed and policy stability. The results confirm that HDR guides agents to learn high-quality cooperative policies that effectively balance traffic efficiency and safety.


Predicting the Formation of Induction Heads

arXiv.org Artificial Intelligence

Arguably, specialized attention heads dubbed induction heads (IHs) underlie the remarkable in-context learning (ICL) capabilities of modern language models (LMs); yet, a precise characterization of their formation remains unclear. In this study, we investigate the relationship between statistical properties of training data (for both natural and synthetic data) and IH formation. We show that (1) a simple equation combining batch size and context size predicts the point at which IHs form; (2) surface bigram repetition frequency and reliability strongly affect the formation of IHs, and we find a precise Pareto frontier in terms of these two values; and (3) local dependency with high bigram repetition frequency and reliability is sufficient for IH formation, but when the frequency and reliability are low, categoriality and the shape of the marginal distribution matter.


Sex and age determination in European lobsters using AI-Enhanced bioacoustics

arXiv.org Artificial Intelligence

Monitoring aquatic species, especially elusive ones like lobsters, presents challenges. This study focuses on Homarus gammarus (European lobster), a key species for fisheries and aquaculture, and leverages non-invasive Passive Acoustic Monitoring (PAM). Understanding lobster habitats, welfare, reproduction, sex, and age is crucial for management and conservation. While bioacoustic emissions have classified various aquatic species using Artificial Intelligence (AI) models, this research specifically uses H. gammarus bioacoustics (buzzing/carapace vibrations) to classify lobsters by age (juvenile/adult) and sex (male/female). The dataset was collected at Johnshaven, Scotland, using hydrophones in concrete tanks. We explored the efficacy of Deep Learning (DL) models (1D-CNN, 1D-DCNN) and six Machine Learning (ML) models (SVM, k-NN, Naive Bayes, Random Forest, XGBoost, MLP). Mel-frequency cepstral coefficients (MFCCs) were used as features. For age classification (adult vs. juvenile), most models achieved over 97% accuracy (Naive Bayes: 91.31%). For sex classification, all models except Naive Bayes surpassed 93.23%. These strong results demonstrate the potential of supervised ML and DL to extract age- and sex-related features from lobster sounds. This research offers a promising non-invasive PAM approach for lobster conservation, detection, and management in aquaculture and fisheries, enabling real-world edge computing applications for underwater species.