Oceania
Agent-based Modeling meets the Capability Approach for Human Development: Simulating Homelessness Policy-making
Aguilera, Alba, Osman, Nardine, Curto, Georgina
The global rise in homelessness calls for urgent and alternative policy solutions. Non-profit and governmental organizations warn of the many challenges faced by people experiencing homelessness (PEH), which include not only the lack of shelter but also the lack of opportunities for personal development. In this context, the capability approach (CA), which underpins the United Nations Sustainable Development Goals (SDGs), provides a comprehensive framework to assess inequity in terms of real opportunities. This paper explores how the CA can be combined with agent-based modelling and reinforcement learning. The goals are: (1) implementing the CA as a Markov Decision Process (MDP), (2) building on this MDP to develop a rich decision-making model that accounts for more complex motivators of behaviour, such as values and needs, and (3) developing an agent-based simulation framework for assessing alternative policies that aim to expand or restore people's capabilities. The framework is developed in a real case study of health inequity and homelessness, working in collaboration with stakeholders, non-profits and domain experts. The ultimate goal of the project is to develop a novel agent-based simulation framework, rooted in the CA, which can be replicated in a diversity of social contexts to assess policies in a non-invasive way.
Natural Language Processing for Electronic Health Records in Scandinavian Languages: Norwegian, Swedish, and Danish
Woldaregay, Ashenafi Zebene, Lund, Jørgen Aarmo, Ngo, Phuong Dinh, Tayefi, Mariyam, Burman, Joel, Hansen, Stine, Sillesen, Martin Hylleholt, Dalianis, Hercules, Jenssen, Robert, Ole, Lindsetmo Rolf, Mikalsen, Karl Øyvind
Background: Clinical natural language processing (NLP) refers to the use of computational methods for extracting, processing, and analyzing unstructured clinical text data, and holds great potential to transform healthcare across various clinical tasks. Objective: The study aims to perform a systematic review to comprehensively assess and analyze the state-of-the-art NLP methods for the mainland Scandinavian clinical text. Method: A literature search was conducted in various online databases including PubMed, ScienceDirect, Google Scholar, ACM digital library, and IEEE Xplore between December 2022 and February 2024. Further, the reference lists of the included articles were also used to strengthen the search. The final pool includes articles that conducted clinical NLP in the mainland Scandinavian languages and were published in English between 2010 and 2024. Results: Out of the 113 articles, 18% (n=21) focus on Norwegian clinical text, 64% (n=72) on Swedish, 10% (n=11) on Danish, and 8% (n=9) focus on more than one language. Generally, the review identified positive developments across the region despite some observable gaps and disparities between the languages. There are substantial disparities in the level of adoption of transformer-based models. In essential tasks such as de-identification, there is significantly less research activity focusing on Norwegian and Danish compared to Swedish text. Further, the review identified low levels of resource sharing (data, experimentation code, and pre-trained models) and of adaptation and transfer learning in the region. Conclusion: The review presented a comprehensive assessment of the state-of-the-art clinical NLP for electronic health records (EHR) text in mainland Scandinavian languages and highlighted the potential barriers and challenges that hinder the rapid advancement of the field in the region.
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering
Yang, Shuo, Luo, Siwen, Han, Soyeon Caren, Hovy, Eduard
Visual Question Answering (VQA) requires reasoning across visual and textual modalities, yet Large Vision-Language Models (LVLMs) often lack integrated commonsense knowledge, limiting their robustness in real-world scenarios. To address this, we introduce MAGIC-VQA, a novel framework that enhances VQA by systematically integrating commonsense knowledge with LVLMs. MAGIC-VQA employs a three-stage process: (1) Explicit Knowledge Integration from external sources, (2) By-Type Post-Processing for contextual refinement, and (3) Implicit Knowledge Augmentation using a Graph Neural Network (GNN) for structured reasoning. GNNs bring greater depth to structured inference, enabling relational inference beyond what LVLMs provide alone. MAGIC-VQA bridges a key gap by unifying commonsense knowledge with LVLM-driven reasoning, eliminating the need for extensive pre-training or complex prompt tuning. Our framework achieves state-of-the-art performance on benchmark datasets, significantly improving commonsense reasoning in VQA.
Construction Identification and Disambiguation Using BERT: A Case Study of NPN
Scivetti, Wesley, Schneider, Nathan
Construction Grammar hypothesizes that knowledge of a language consists chiefly of knowledge of form-meaning pairs ("constructions") that include vocabulary, general grammar rules, and even idiosyncratic patterns. Recent work has shown that transformer language models represent at least some constructional patterns, including ones where the construction is rare overall. In this work, we probe BERT's representation of the form and meaning of a minor construction of English, the NPN (noun-preposition-noun) construction -- exhibited in such expressions as "face to face" and "day to day" -- which is known to be polysemous. We construct a benchmark dataset of semantically annotated corpus instances (including distractors that superficially resemble the construction). With this dataset, we train and evaluate probing classifiers. They achieve decent discrimination of the construction from distractors, as well as sense disambiguation among true instances of the construction, revealing that BERT embeddings carry indications of the construction's semantics. Moreover, artificially permuting the word order of true construction instances causes them to be rejected, indicating sensitivity to matters of form. We conclude that BERT does latently encode at least some knowledge of the NPN construction going beyond a surface syntactic pattern and lexical cues.
An Identity and Interaction Based Network Forensic Analysis
Clarke, Nathan, Alotibi, Gaseb, Joy, Dany, Li, Fudong, Furnell, Steven, Alshumrani, Ali, Mohammed, Hussan
In today's landscape of increasing electronic crime, network forensics plays a pivotal role in digital investigations. It helps investigators decide which systems to analyse and supplements evidence found through more traditional computer-based investigations. However, the nature and functionality of existing Network Forensic Analysis Tools (NFATs) fall short of File System Forensic Analysis Tools (FS FATs) in providing usable data. The analysis tends to focus upon IP addresses, which are not synonymous with user identities, a point of significant interest to investigators. This paper presents several experiments designed to create a novel NFAT approach that can identify users and understand how they are using network-based applications whilst the traffic remains encrypted. The experiments build upon the prior art and investigate how effective this approach is in classifying users and their actions. Utilising an in-house dataset composed of 50 million packets, the experiments are formed of three incremental developments that assist in improving performance. Building upon the successful experiments, a proposed NFAT interface is presented to illustrate the ease with which investigators would be able to ask relevant questions of user interactions. The experiments, profiled across 27 users, yielded an average 93.3% True Positive Identification Rate (TPIR), with 41% of users experiencing 100% TPIR. Skype, Wikipedia and Hotmail services achieved a notably high level of recognition performance. The study has developed and evaluated an approach to analyse encrypted network traffic more effectively through the modelling of network traffic and to subsequently visualise these interactions through a novel network forensic analysis tool.
Leveraging Large Language Models for Automated Causal Loop Diagram Generation: Enhancing System Dynamics Modeling through Curated Prompting Techniques
Liu, Ning-Yuan Georgia, Keith, David R.
Transforming a dynamic hypothesis into a causal loop diagram (CLD) is crucial for System Dynamics Modelling. Extracting key variables and causal relationships from text to build a CLD is often challenging and time-consuming for novice modelers, limiting SD tool adoption. This paper introduces and tests a method for automating the translation of dynamic hypotheses into CLDs using large language models (LLMs) with curated prompting techniques. We first describe how LLMs work and how they can make the inferences needed to build CLDs using a standard digraph structure. Next, we develop a set of simple dynamic hypotheses and corresponding CLDs from leading SD textbooks. We then compare the four different combinations of prompting techniques, evaluating their performance against CLDs labeled by expert modelers. Results show that for simple model structures and using curated prompting techniques, LLMs can generate CLDs of a similar quality to expert-built ones, accelerating CLD creation.
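As an illustration of what a digraph representation of a CLD can look like, the following is a minimal sketch and not the authors' implementation. It assumes each causal link is stored as a (cause, effect) pair with polarity +1 or -1, and applies the standard System Dynamics rule that a feedback loop is reinforcing when the product of its link polarities is positive and balancing otherwise. The variable names and the population example are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical CLD encoding: (cause, effect) -> polarity (+1 or -1).
Links = Dict[Tuple[str, str], int]


def find_loops(links: Links) -> List[List[str]]:
    """Enumerate simple feedback loops; each loop is found once, from its smallest node."""
    graph: Dict[str, List[str]] = {}
    for src, dst in links:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    loops: List[List[str]] = []

    def dfs(start: str, node: str, path: List[str]) -> None:
        for nxt in graph[node]:
            if nxt == start:
                loops.append(path[:])  # closed a loop back to the start node
            elif nxt > start and nxt not in path:
                path.append(nxt)
                dfs(start, nxt, path)
                path.pop()

    for start in sorted(graph):
        dfs(start, start, [start])
    return loops


def loop_polarity(loop: List[str], links: Links) -> str:
    """Multiply link polarities around the loop to classify it."""
    sign = 1
    for i, src in enumerate(loop):
        dst = loop[(i + 1) % len(loop)]
        sign *= links[(src, dst)]
    return "reinforcing" if sign > 0 else "balancing"


# Hypothetical example: a simple population model.
population_cld: Links = {
    ("births", "population"): +1,
    ("population", "births"): +1,
    ("population", "deaths"): +1,
    ("deaths", "population"): -1,
}

for loop in find_loops(population_cld):
    print(" -> ".join(loop), ":", loop_polarity(loop, population_cld))
```

Storing polarity on the edge keeps the structure close to what a text-extraction step would produce (one signed link per causal statement), so loop classification becomes a purely mechanical step.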
FedSKD: Aggregation-free Model-heterogeneous Federated Learning using Multi-dimensional Similarity Knowledge Distillation
Weng, Ziqiao, Cai, Weidong, Zhou, Bo
Federated learning (FL) enables privacy-preserving collaborative model training without direct data sharing. Model-heterogeneous FL (MHFL) extends this paradigm by allowing clients to train personalized models with heterogeneous architectures tailored to their computational resources and application-specific needs. However, existing MHFL methods predominantly rely on centralized aggregation, which introduces scalability and efficiency bottlenecks, or impose restrictions requiring partially identical model architectures across clients. While peer-to-peer (P2P) FL removes server dependence, it suffers from model drift and knowledge dilution, limiting its effectiveness in heterogeneous settings. To address these challenges, we propose FedSKD, a novel MHFL framework that facilitates direct knowledge exchange through round-robin model circulation, eliminating the need for centralized aggregation while allowing fully heterogeneous model architectures across clients. FedSKD's key innovation lies in multi-dimensional similarity knowledge distillation, which enables bidirectional cross-client knowledge transfer at batch, pixel/voxel, and region levels for heterogeneous models in FL. This approach mitigates catastrophic forgetting and model drift through progressive reinforcement and distribution alignment while preserving model heterogeneity. Extensive evaluations on fMRI-based autism spectrum disorder diagnosis and skin lesion classification demonstrate that FedSKD outperforms state-of-the-art heterogeneous and homogeneous FL baselines, achieving superior personalization (client-specific accuracy) and generalization (cross-institutional adaptability). These findings underscore FedSKD's potential as a scalable and robust solution for real-world medical federated learning applications.
Mapping Hymns and Organizing Concepts in the Rigveda: Quantitatively Connecting the Vedic Suktas
Bollineni, Venkatesh, Crk, Igor, Gultepe, Eren
Accessing and gaining insight into the Rigveda poses a non-trivial challenge due to its extremely ancient Sanskrit language, poetic structure, and large volume of text. By using NLP techniques, this study identified topics and semantic connections of hymns within the Rigveda that were corroborated by seven well-known groupings of hymns. The 1,028 suktas (hymns) from the modern English translation of the Rigveda by Jamison and Brereton were preprocessed, and sukta-level embeddings were obtained using (i) a novel adaptation of LSA, presented herein, (ii) SBERT, and (iii) Doc2Vec. Following a UMAP dimension reduction of the vectors, the network of suktas was formed using k-nearest neighbours. Then, community detection of topics in the sukta networks was performed with the Louvain, Leiden, and label propagation methods, and the statistical significance of the resulting topics was determined using an appropriate null distribution. Only the novel adaptation of LSA, using the Leiden method, detected sukta topic networks that were significant (z = 2.726, p < .01), with a modularity score of 0.944. Of the seven famous sukta groupings analyzed (e.g., creation, funeral, water), the LSA-derived network was successful in all seven cases, while Doc2Vec was not significant and failed to detect the relevant suktas. SBERT detected four of the famous suktas as separate groups, but mistakenly combined three of them into a single mixed group. Also, the SBERT network was not statistically significant.
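To make the network step concrete, here is a minimal sketch, not the study's pipeline, of building an undirected k-nearest-neighbour graph from embedding vectors and scoring a given partition with Newman's modularity, Q = sum over communities of (L_c / m - (d_c / 2m)^2), where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The toy 2-D points, the value of k, and the partition are hypothetical; community detection itself (Louvain, Leiden, label propagation) is not implemented here.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def knn_edges(points: Sequence[Sequence[float]], k: int) -> Set[Tuple[int, int]]:
    """Undirected kNN graph: link each point to its k nearest neighbours."""
    edges: Set[Tuple[int, int]] = set()
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges: Set[Tuple[int, int]], community: List[int]) -> float:
    """Newman modularity of a partition on an unweighted, undirected graph."""
    m = len(edges)
    internal: Dict[int, int] = {}  # edges with both endpoints in community c
    degree: Dict[int, int] = {}    # summed node degree per community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree[ca] = degree.get(ca, 0) + 1
        degree[cb] = degree.get(cb, 0) + 1
        if ca == cb:
            internal[ca] = internal.get(ca, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical toy embeddings: two well-separated groups of three points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_edges = knn_edges(toy_points, k=2)
toy_partition = [0, 0, 0, 1, 1, 1]
print(modularity(toy_edges, toy_partition))
```

On this toy input the graph is two disjoint triangles, so the two-group partition scores 0.5 while placing every node in one community scores 0; values near 1 indicate many well-separated communities.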
PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation
Zhong, Yiheng, Luo, Zihong, Liu, Chengzhi, Tang, Feilong, Peng, Zelin, Hu, Ming, Hu, Yingzhen, Su, Jionglong, Geand, Zongyuan, Razzak, Imran
Segment Anything Model (SAM) demonstrates powerful zero-shot capabilities; however, its accuracy and robustness significantly decrease when applied to medical image segmentation. Existing methods address this issue through modality fusion, integrating textual and image information to provide more detailed priors. In this study, we argue that the granularity of text and the domain gap affect the accuracy of the priors. Furthermore, the discrepancy between high-level abstract semantics and pixel-level boundary details in images can introduce noise into the fusion process. To address this, we propose Prior-Guided SAM (PG-SAM), which employs a fine-grained modality prior aligner to leverage specialized medical knowledge for better modality alignment. The core of our method lies in efficiently addressing the domain gap with fine-grained text from a medical LLM. Meanwhile, it also enhances the priors' quality after modality alignment, ensuring more accurate segmentation. In addition, our decoder enhances the model's expressive capabilities through multi-level feature fusion and iterative mask optimizer operations, supporting unprompted learning. We also propose a unified pipeline that effectively supplies high-quality semantic information to SAM. Extensive experiments on the Synapse dataset demonstrate that the proposed PG-SAM achieves state-of-the-art performance. Our anonymous code is released at https://github.com/logan-0623/PG-SAM.
DiffGED: Computing Graph Edit Distance via Diffusion-based Graph Matching
Huang, Wei, Wang, Hanchen, Wen, Dong, Zhang, Wenjie, Zhang, Ying, Lin, Xuemin
The Graph Edit Distance (GED) problem, which aims to compute the minimum number of edit operations required to transform one graph into another, is a fundamental challenge in graph analysis with wide-ranging applications. However, due to its NP-hard nature, traditional A* approaches often suffer from scalability issues, making them computationally intractable for large graphs. Many recent deep learning frameworks address GED by formulating it as a regression task, which, while efficient, fails to recover the edit path -- a central interest in GED. Furthermore, recent hybrid approaches that combine deep learning with traditional methods to recover the edit path often yield poor solution quality. These methods also struggle to generate candidate solutions in parallel, resulting in increased running times. In this paper, we present a novel approach, DiffGED, that leverages a generative diffusion model to solve GED and recover the corresponding edit path. Specifically, we first generate multiple diverse node matching matrices in parallel through a diffusion-based graph matching model. Next, node mappings are extracted from each generated matching matrix in parallel, and each extracted node mapping can be simply transformed into an edit path. Benefiting from the generative diversity provided by the diffusion model, DiffGED is less likely to fall into local sub-optimal solutions, thereby achieving superior overall solution quality close to the exact solution. Experimental results on real-world datasets demonstrate that DiffGED can generate multiple diverse edit paths with exceptionally high accuracy comparable to exact solutions while maintaining a running time shorter than most hybrid approaches.
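The step from a node mapping to an edit path can be sketched as follows. This is a minimal illustration under stated assumptions, not DiffGED itself: graphs are undirected with labeled nodes, all edit operations have unit cost, the first graph has no more nodes than the second, and the candidate mappings are supplied by hand rather than generated by a diffusion model. The function name and the example graphs are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def _norm(a: int, b: int) -> Edge:
    """Canonical form for an undirected edge."""
    return (a, b) if a <= b else (b, a)


def edit_path_from_mapping(
    labels1: Dict[int, str],
    edges1: Set[Edge],
    labels2: Dict[int, str],
    edges2: Set[Edge],
    mapping: Dict[int, int],
) -> List[tuple]:
    """List the unit-cost edits that turn graph 1 into graph 2 under `mapping`.

    `mapping` sends every node of graph 1 to a distinct node of graph 2.
    """
    ops: List[tuple] = []
    # Matched nodes whose labels differ need a relabel.
    for u, v in sorted(mapping.items()):
        if labels1[u] != labels2[v]:
            ops.append(("relabel_node", u, labels2[v]))
    # Graph-2 nodes with no preimage must be inserted.
    matched = set(mapping.values())
    for v in sorted(labels2):
        if v not in matched:
            ops.append(("insert_node", v, labels2[v]))
    # Compare edges after translating graph-1 edges into graph-2 node ids.
    mapped = {_norm(mapping[a], mapping[b]) for a, b in edges1}
    target = {_norm(a, b) for a, b in edges2}
    ops += [("delete_edge", e) for e in sorted(mapped - target)]
    ops += [("insert_edge", e) for e in sorted(target - mapped)]
    return ops


# Hypothetical example: a labeled triangle versus a labeled path.
g1_labels = {0: "C", 1: "C", 2: "O"}
g1_edges = {(0, 1), (1, 2), (0, 2)}
g2_labels = {0: "C", 1: "C", 2: "N"}
g2_edges = {(0, 1), (1, 2)}

candidates = [{0: 0, 1: 1, 2: 2}, {0: 0, 1: 2, 2: 1}]
paths = [
    edit_path_from_mapping(g1_labels, g1_edges, g2_labels, g2_edges, m)
    for m in candidates
]
best = min(paths, key=len)  # shortest path is the best upper bound on GED
print(len(best), best)
```

Every valid mapping yields a feasible edit path, so its length is an upper bound on the true GED; evaluating several diverse mappings and keeping the shortest path is what makes candidate diversity useful.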