Goto

Collaborating Authors

 Uncertainty


Structure and Destructure: Dual Forces in the Making of Knowledge Engines

arXiv.org Artificial Intelligence

The making of knowledge engines in natural language processing has been shaped by two seemingly distinct paradigms: one grounded in structure, the other driven by massively available unstructured data. The structured paradigm leverages predefined symbolic interactions, such as knowledge graphs, as priors and designs models to capture them. In contrast, the unstructured paradigm centers on scaling transformer architectures with increasingly vast data and model sizes, as seen in modern large language models. Despite their divergence, this thesis seeks to establish conceptual connections bridging these paradigms. Two complementary forces, structure and destructure, emerge across both paradigms: structure organizes seen symbolic interactions, while destructure, through periodic embedding resets, improves model plasticity and generalization to unseen scenarios. These connections form a new recipe for developing general knowledge engines that can support transparent, controllable, and adaptable intelligent systems.


Financial Decision Making using Reinforcement Learning with Dirichlet Priors and Quantum-Inspired Genetic Optimization

arXiv.org Artificial Intelligence

Traditional budget allocation models struggle with the stochastic and nonlinear nature of real-world financial data. This study proposes a hybrid reinforcement learning (RL) framework for dynamic budget allocation, enhanced with Dirichlet-inspired stochasticity and quantum mutation-based genetic optimization. Using Apple Inc. quarterly financial data (2009 to 2025), the RL agent learns to allocate budgets between Research and Development and Selling, General and Administrative to maximize profitability while adhering to historical spending patterns, with L2 penalties discouraging unrealistic deviations. A Dirichlet distribution governs state evolution to simulate shifting financial contexts. To escape local minima and improve generalization, the trained policy is refined using genetic algorithms with quantum mutation via parameterized qubit rotation circuits. Generation-wise rewards and penalties are logged to visualize convergence and policy behavior. On unseen fiscal data, the model achieves high alignment with actual allocations (cosine similarity 0.9990, KL divergence 0.0023), demonstrating the promise of combining deep RL, stochastic modeling, and quantum-inspired heuristics for adaptive enterprise budgeting.


Data-driven Discovery of Digital Twins in Biomedical Research

arXiv.org Artificial Intelligence

Recent technological advances have expanded the availability of high-throughput biological datasets, enabling the reliable design of digital twins of biomedical systems or patients. Such computational tools represent key reaction networks driving perturbation or drug response and can guide drug discovery and personalized therapeutics. Yet, their development still relies on laborious data integration by the human modeler, so that automated approaches are critically needed. The success of data-driven system discovery in Physics, rooted in clean datasets and well-defined governing laws, has fueled interest in applying similar techniques in Biology, which presents unique challenges. Here, we reviewed methodologies for automatically inferring digital twins from biological time series, which mostly involve symbolic or sparse regression. We evaluate algorithms according to eight biological and methodological challenges, associated to noisy/incomplete data, multiple conditions, prior knowledge integration, latent variables, high dimensionality, unobserved variable derivatives, candidate library design, and uncertainty quantification. Upon these criteria, sparse regression generally outperformed symbolic regression, particularly when using Bayesian frameworks. We further highlight the emerging role of deep learning and large language models, which enable innovative prior knowledge integration, though the reliability and consistency of such approaches must be improved. While no single method addresses all challenges, we argue that progress in learning digital twins will come from hybrid and modular frameworks combining chemical reaction network-based mechanistic grounding, Bayesian uncertainty quantification, and the generative and knowledge integration capacities of deep learning. To support their development, we further propose a benchmarking framework to evaluate methods across all challenges.


Structuring GUI Elements through Vision Language Models: Towards Action Space Generation

arXiv.org Artificial Intelligence

Multimodal large language models (MLLMs) have emerged as pivotal tools in enhancing human-computer interaction. In this paper we focus on the application of MLLMs in the field of graphical user interface (GUI) elements structuring, where they assist in processing user instructions based on screen contents. Despite the promise of MLLMs, their performance in precisely generating UI element coordinates, a critical aspect of GUI understanding, is hindered by the nature of next-token prediction training. This challenge arises from the semantic void surrounding numerical UI coordinates in language representation spaces, necessitating a substantial and diverse dataset to bolster visual module capabilities. T o address these limitations, we introduce an IoU-Augmented Maximum Likelihood (IAML) training paradigm. Specifically, our approach involves a novel pipeline for IoU-based coordinate sampling to augment the training data, which considers the proximity to ground truth coordinates. This data augmentation strategy is then employed to fine-tune MLLMs under the IAML paradigm, which is designed to mitigate the exposure bias problem inherent in traditional maximum likelihood estimation. Through extensive experiments, we demonstrate the superior performance of our IAML training approach over traditional training paradigms.


Flow-Modulated Scoring for Semantic-Aware Knowledge Graph Completion

arXiv.org Artificial Intelligence

Y et, prevailing methods, which rely on static scoring functions over learned embeddings, struggling to simultaneously capture rich semantic context and the dynamic nature of relations. T o overcome this limitation, we propose the Flow-Modulated Scoring (FMS) framework, conceptualizing a relation as a dynamic evolutionary process governed by its static semantic environment. FMS operates in two stages: it first learns context-aware entity embeddings via a Semantic Context Learning module, and then models a dynamic flow between them using a Conditional Flow-Matching module. This learned flow dynamically modulates a base static score for the entity pair. By unifying context-rich static representations with a conditioned dynamic flow, FMS achieves a more comprehensive understanding of relational semantics. Extensive experiments demonstrate that FMS establishes a new state of the art across both canonical knowledge graph completion tasks: relation prediction and entity prediction. On the standard relation prediction benchmark FB15k-237, FMS achieves a near-perfect MRR of 99.8% and Hits@1 of 99.7% using a mere 0.35M parameters, while also attaining a 99.9% MRR on WN18RR. Its dominance extends to entity prediction, where it secures a 25.2% relative MRR gain in the transductive setting and substantially outperforms all baselines in challenging inductive settings. By unifying a dynamic flow mechanism with rich static contexts, FMS offers a highly effective and parameter-efficient new paradigm for knowledge graph completion.


Privacy Auditing Synthetic Data Release through Local Likelihood Attacks

arXiv.org Machine Learning

Auditing the privacy leakage of synthetic data is an important but unresolved problem. Most existing privacy auditing frameworks for synthetic data rely on heuristics and unreasonable assumptions to attack the failure modes of generative models, exhibiting limited capability to describe and detect the privacy exposure of training data through synthetic data release. In this paper, we study designing Membership Inference Attacks (MIAs) that specifically exploit the observation that tabular generative models tend to significantly overfit to certain regions of the training distribution. Here, we propose Generative Likelihood Ratio Attack (Gen-LRA), a novel, computationally efficient No-Box MIA that, with no assumption of model knowledge or access, formulates its attack by evaluating the influence a test observation has in a surrogate model's estimation of a local likelihood ratio over the synthetic data. Assessed over a comprehensive benchmark spanning diverse datasets, model architectures, and attack parameters, we find that Gen-LRA consistently dominates other MIAs for generative models across multiple performance metrics. These results underscore Gen-LRA's effectiveness as a privacy auditing tool for the release of synthetic data, highlighting the significant privacy risks posed by generative model overfitting in real-world applications.


Comprehensive Signal Quality Evaluation of a Wearable Textile ECG Garment: A Sex-Balanced Study

arXiv.org Artificial Intelligence

--We introduce a novel wearable textile-garment featuring an innovative electrode placement aimed at minimizing noise and motion artifacts, thereby enhancing signal fidelity in Electrocardiography (ECG) recordings. We present a comprehensive, sex-balanced evaluation involving 15 healthy males and 15 healthy female participants to ensure the device's suitability across anatomical and physiological variations. The assessment framework encompasses distinct evaluation approaches: quantitative signal quality indices to objectively benchmark device performance; rhythm-based analyzes of physiological parameters such as heart rate (HR) and heart rate variability (HRV); machine learning classification tasks to assess application-relevant predictive utility; morphological analysis of ECG features including amplitude and interval parameters; and investigations of the effects of electrode projection angle given by the textile / body shape, with all analyzes stratified by sex to elucidate sex-specific influences. Evaluations were conducted across various activity phases representing real-world conditions. The results demonstrate that the textile system achieves signal quality highly concordant with reference devices in both rhythm and morphological analyses, exhibits robust classification performance, and enables identification of key sex-specific determinants affecting signal acquisition. These findings underscore the practical viability of textile-based ECG garments for physiological monitoring as well as psychophysiological state detection. Moreover, we identify the importance of incorporating sex-specific design considerations to ensure equitable and reliable cardiac diagnostics in wearable health technologies. NTRODUCTION This is a preprint of a manuscript submitted for publication. It has not yet been peer-reviewed, and the final version may differ . The authors acknowledge the funding by the EU TEF-Health project which is part of the Digital Europe Program of the EU (DIGIT AL-2022-CLOUD-AI-02-TEFHEAL TH). LECTROCARDIOGRAPHIC recordings serve as a fundamental diagnostic tool in modern medicine, providing invaluable noninvasive insights into the electrical activity of the heart and therefore the health of the cardiovascular system. Introduced by Willem Einthoven in the early 20th century, Electrocardiography (ECG) remains a cornerstone in clinical cardiology. Einthoven's pioneering work laid the foundation for understanding the principles underlying ECG acquisition and interpretation [1], [2]. ECG signals are acquired through electrodes placed on the skin, capturing the electrical impulses generated by cardiac muscle de-and repolarization. In modern medicine, ECG is used in applications ranging from diagnosing cardiac arrhythmias [4] and ischemic heart disease [5] to monitoring patients during surgery [6] and assessing the effects of pharmacological interventions [7], [8].


Modeling Wise Decision Making: A Z-Number Fuzzy Framework Inspired by Phronesis

arXiv.org Artificial Intelligence

Background: Wisdom is a superordinate construct that embraces perspective taking, reflectiveness, prosocial orientation, reflective empathetic action, and intellectual humility. Unlike conventional models of reasoning that are rigidly bound by binary thinking, wisdom unfolds in shades of ambiguity, requiring both graded evaluation and self-reflective humility. Current measures depend on self-reports and seldom reflect the humility and uncertainty inherent in wise reasoning. A computational framework that takes into account both multidimensionality and confidence has the potential to improve psychological science and allow humane AI. Method: We present a fuzzy inference system with Z numbers, each of the decisions being expressed in terms of a wisdom score (restriction) and confidence score (certainty). As part of this study, participants (N = 100) were exposed to culturally neutral pictorial moral dilemma tasks to which they generated think-aloud linguistic responses, which were mapped into five theoretically based components of wisdom. The scores of each individual component were combined using a base of 21 rules, with membership functions tuned via Gaussian kernel density estimation. Results: In a proof of concept study, the system produced dual attribute wisdom representations that correlated modestly but significantly with established scales while showing negligible relations with unrelated traits, supporting convergent and divergent validity. Contribution: The contribution is to formalize wisdom as a multidimensional, uncertainty-conscious construct, operationalized in the form of Z-numbers. In addition to progressing measurement in psychology, it calculates how fuzzy Z numbers can provide AI systems with interpretable, confidence-sensitive reasoning that affords a safe, middle ground between rigorous computation and human-like judgment.


Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning

arXiv.org Artificial Intelligence

Uncertainty quantification in reinforcement learning can greatly improve exploration and robustness. Approximate Bayesian approaches have recently been popularized to quantify uncertainty in model-free algorithms. However, so far the focus has been on improving the accuracy of the posterior approximation, instead of studying the accuracy of the prior and likelihood assumptions underlying the posterior. In this work, we demonstrate that there is a cold posterior effect in Bayesian deep Q-learning, where contrary to theory, performance increases when reducing the temperature of the posterior. To identify and overcome likely causes, we challenge common assumptions made on the likelihood and priors in Bayesian model-free algorithms. We empirically study prior distributions and show through statistical tests that the common Gaussian likelihood assumption is frequently violated. We argue that developing more suitable likelihoods and priors should be a key focus in future Bayesian reinforcement learning research and we offer simple, implementable solutions for better priors in deep Q-learning that lead to more performant Bayesian algorithms.


Controllable 3D Molecular Generation for Structure-Based Drug Design Through Bayesian Flow Networks and Gradient Integration

arXiv.org Artificial Intelligence

Recent advances in Structure-based Drug Design (SBDD) have leveraged generative models for 3D molecular generation, predominantly evaluating model performance by binding affinity to target proteins. However, practical drug discovery necessitates high binding affinity along with synthetic feasibility and selectivity, critical properties that were largely neglected in previous evaluations. To address this gap, we identify fundamental limitations of conventional diffusion-based generative models in effectively guiding molecule generation toward these diverse pharmacological properties. We propose CByG, a novel framework extending Bayesian Flow Network into a gradient-based conditional generative model that robustly integrates property-specific guidance. Additionally, we introduce a comprehensive evaluation scheme incorporating practical benchmarks for binding affinity, synthetic feasibility, and selectivity, overcoming the limitations of conventional evaluation methods. Extensive experiments demonstrate that our proposed CByG framework significantly outperforms baseline models across multiple essential evaluation criteria, highlighting its effectiveness and practicality for real-world drug discovery applications.