Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion

Neural Information Processing Systems

Notation and conventions: $[n] = \{1, \dots, n\}$; $|S|$ is the size of set $S$; $\lceil x \rceil$ is the smallest integer greater than $x$; $\lfloor x \rfloor$ is the largest integer smaller than $x$; $\|X\|_*$ is the nuclear norm of matrix $X$, i.e., the sum of the singular values of $X$; $\mathbb{Z}_+$ is the set of positive integers; $\mathbb{Z}_{>i}$ is the set of integers greater than $i$; given sets $S_1$ and $S_2$, the reduction of $S_1$ by $S_2$ is denoted $S_1 \setminus S_2 = \{i \in S_1 : i \notin S_2\}$; finally, $A(n) \sim B(n)$ means $A(n)/B(n) \to 1$ as $n \to \infty$.
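
As a rough illustration of the notation above, the following Python sketch evaluates each convention on small examples. The helper names (`index_set`, `nuclear_norm_2x2`, `ratio`) and all sample values are hypothetical, and the closed form used for the 2x2 nuclear norm is a convenience chosen for this sketch, not something taken from the paper.

```python
import math


def index_set(n):
    """Return [n] = {1, ..., n} as a Python set."""
    return set(range(1, n + 1))


def nuclear_norm_2x2(X):
    """Sum of the singular values of a 2x2 matrix.

    For singular values s1, s2: s1^2 + s2^2 = ||X||_F^2 and
    s1 * s2 = |det X|, hence (s1 + s2)^2 = ||X||_F^2 + 2 |det X|.
    (Assumption: only the 2x2 case is handled in this sketch.)
    """
    (a, b), (c, d) = X
    frob_sq = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return math.sqrt(frob_sq + 2.0 * abs(det))


# Set size and the reduction S1 \ S2 = {i in S1 : i not in S2}.
S1, S2 = index_set(5), {2, 4, 6}
reduction = S1 - S2

# Ceiling and floor, shown on a non-integer x.
x = 2.3
ceil_x, floor_x = math.ceil(x), math.floor(x)

# Rank-one example X = u v^T: its only nonzero singular value is
# ||u|| * ||v||, so the nuclear norm equals that product.
u, v = (1.0, 2.0), (3.0, 4.0)
X = [[ui * vj for vj in v] for ui in u]
nuc = nuclear_norm_2x2(X)


def ratio(n):
    """A(n) / B(n) for the hypothetical pair A(n) = n^2 + n, B(n) = n^2."""
    return (n * n + n) / (n * n)


print(len(S1), reduction, ceil_x, floor_x, nuc, ratio(10**6))
```

The ratio printed last moves toward 1 as `n` grows, which is what the asymptotic-equivalence convention expresses.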


Ensembling geophysical models with Bayesian Neural Networks

Neural Information Processing Systems

Ensembles of geophysical models improve prediction accuracy and express uncertainties. We develop a novel data-driven ensembling strategy for combining geophysical models using Bayesian Neural Networks, which infers spatiotemporally varying model weights and bias, while accounting for heteroscedastic uncertainties in the observations. This produces more accurate and uncertainty-aware predictions without sacrificing interpretability.




Beyond Black-Box Predictions: Identifying Marginal Feature Effects in Tabular Transformer Networks

arXiv.org Machine Learning

In recent years, deep neural networks have showcased their predictive power across a variety of tasks. Beyond natural language processing, the transformer architecture has proven efficient in addressing tabular data problems and challenges the previously dominant gradient-based decision trees in these areas. However, this predictive power comes at the cost of intelligibility: Marginal feature effects are almost completely lost in the black-box nature of deep tabular transformer networks. Alternative architectures that use the additivity constraints of classical statistical regression models can maintain intelligible marginal feature effects, but often fall short in predictive power compared to their more complex counterparts. To bridge the gap between intelligibility and performance, we propose an adaptation of tabular transformer networks designed to identify marginal feature effects. We provide theoretical justifications that marginal feature effects can be accurately identified, and our ablation study demonstrates that the proposed model efficiently detects these effects, even amidst complex feature interactions. To demonstrate the model's predictive capabilities, we compare it to several interpretable as well as black-box models and find that it can match black-box performances while maintaining intelligibility. The source code is available at https://github.com/OpenTabular/NAMpy.


Personality-Driven Decision-Making in LLM-Based Autonomous Agents

arXiv.org Artificial Intelligence

The embedding of Large Language Models (LLMs) into autonomous agents is a rapidly developing field which enables dynamic, configurable behaviours without the need for extensive domain-specific training. In our previous work, we introduced SANDMAN, a Deceptive Agent architecture leveraging the Five-Factor OCEAN personality model, demonstrating that personality induction significantly influences agent task planning. Building on these findings, this study presents a novel method for measuring and evaluating how induced personality traits affect task selection processes - specifically planning, scheduling, and decision-making - in LLM-based agents. Our results reveal distinct task-selection patterns aligned with induced OCEAN attributes, underscoring the feasibility of designing highly plausible Deceptive Agents for proactive cyber defense strategies.


When Discourse Stalls: Moving Past Five Semantic Stopsigns about Generative AI in Design Research

arXiv.org Artificial Intelligence

It has been roughly three years since the open-source release of Stable Diffusion ignited a Generative AI (GenAI) boom [Bengesi et al., 2023]. The proliferation of these technologies has since reshaped design practice and research. From early ideation to final implementation, these developments have significantly altered how design work is conceived, conducted, and evaluated [Hou et al., 2024]. This essay examines the critical juncture at which the design research community finds itself, seeking to understand and shape these developments while grappling with their implications for creative practice, design education, and professional identities. Popular discourse around GenAI often centers on simplified unequivocal narratives: AI as a threat to humanity, as a solution to global challenges, as a force of disruption, or as a replacement for humans [Gilardi et al., 2024]. While these narratives have sparked debate and interest, they can function as "semantic stopsigns"--conceptual framings that oversimplify complex issues, providing an illusion of resolution that hinders deeper inquiry [LessWrong Community, n.d., Lifton, 1961]. For instance, claims like "AI is unreliable" can lead to outright dismissal of its potential,


REFLEX Dataset: A Multimodal Dataset of Human Reactions to Robot Failures and Explanations

arXiv.org Artificial Intelligence

This work presents REFLEX: Robotic Explanations to FaiLures and Human EXpressions, a comprehensive multimodal dataset capturing human reactions to robot failures and subsequent explanations in collaborative settings. It aims to facilitate research into human-robot interaction dynamics, addressing the need to study reactions to both initial failures and explanations, as well as the evolution of these reactions in long-term interactions. By providing rich, annotated data on human responses to different types of failures, explanation levels, and explanation varying strategies, the dataset contributes to the development of more robust, adaptive, and satisfying robotic systems capable of maintaining positive relationships with human collaborators, even during challenges like repeated failures.

INTRODUCTION

As robots become increasingly integrated into our everyday lives, from homes and workplaces to public spaces, the need to understand and improve human-robot interaction (HRI) has never been more critical. Despite significant advancements in robotics, robots are still prone to failures, ranging from minor glitches to serious malfunctions.


ACCESS: A Benchmark for Abstract Causal Event Discovery and Reasoning

arXiv.org Artificial Intelligence

Identifying cause-and-effect relationships is critical to understanding real-world dynamics and ultimately causal reasoning. Existing methods for identifying event causality in NLP, including those based on Large Language Models (LLMs), exhibit difficulties in out-of-distribution settings due to the limited scale and heavy reliance on lexical cues within available benchmarks. Modern benchmarks, inspired by probabilistic causal inference, have attempted to construct causal graphs of events as a robust representation of causal knowledge; CRAB (Romanou et al., 2023) is one recent benchmark along this line. In this paper, we introduce ACCESS, a benchmark designed for discovery and reasoning over abstract causal events. Unlike existing resources, ACCESS focuses on the causality of everyday life events at the abstraction level. We propose a pipeline for identifying abstractions for event generalizations from GLUCOSE (Mostafazadeh et al., 2020), a large-scale dataset of implicit commonsense causal knowledge, from which we subsequently extract 1.4K causal pairs. Our experiments highlight the ongoing challenges of using statistical methods and/or LLMs for automatic abstraction identification and causal discovery in NLP. Nonetheless, we demonstrate that the abstract causal knowledge provided in ACCESS can be leveraged for enhancing QA reasoning performance in LLMs.