Goto

Collaborating Authors

 South America


Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking

arXiv.org Artificial Intelligence

The pervasiveness of large language models and generative AI in online media has amplified the need for effective automated fact-checking to assist fact-checkers in tackling the increasing volume and sophistication of misinformation. The complex nature of fact-checking demands that automated fact-checking systems provide explanations that enable fact-checkers to scrutinise their outputs. However, it is unclear how these explanations should align with the decision-making and reasoning processes of fact-checkers to be effectively integrated into their workflows. Through semi-structured interviews with fact-checking professionals, we bridge this gap by: (i) providing an account of how fact-checkers assess evidence, make decisions, and explain their processes; (ii) examining how fact-checkers use automated tools in practice; and (iii) identifying fact-checker explanation requirements for automated fact-checking tools. The findings show unmet explanation needs and identify important criteria for replicable fact-checking explanations that trace the model's reasoning path, reference specific evidence, and highlight uncertainty and information gaps.


A Scoresheet for Explainable AI

arXiv.org Artificial Intelligence

Explainability is important for the transparency of autonomous and intelligent systems and for helping to support the development of appropriate levels of trust. There has been considerable work on developing approaches for explaining systems and there are standards that specify requirements for transparency. However, there is a gap: the standards are too high-level and do not adequately specify requirements for explainability. This paper develops a scoresheet that can be used to specify explainability requirements or to assess the explainability aspects provided for particular applications. The scoresheet is developed by considering the requirements of a range of stakeholders and is applicable to Multiagent Systems as well as other AI technologies. We also provide guidance for how to use the scoresheet and illustrate its generality and usefulness by applying it to a range of applications.


How Users Who are Blind or Low Vision Play Mobile Games: Perceptions, Challenges, and Strategies

arXiv.org Artificial Intelligence

As blind and low-vision (BLV) players engage more deeply with games, accessibility features have become essential. While some research has explored tools and strategies to enhance game accessibility, the specific experiences of these players with mobile games remain underexamined. This study addresses this gap by investigating how BLV users experience mobile games with varying accessibility levels. Through interviews with 32 experienced BLV mobile players, we explore their perceptions, challenges, and strategies for engaging with mobile games. Our findings reveal that BLV players turn to mobile games to alleviate boredom, achieve a sense of accomplishment, and build social connections, but face barriers depending on the game's accessibility level. We also compare mobile games to other forms of gaming, highlighting the relative advantages of mobile games, such as the inherent accessibility of smartphones. This study contributes to understanding BLV mobile gaming experiences and provides insights for enhancing accessible mobile game design.


Scalable First-order Method for Certifying Optimal k-Sparse GLMs

arXiv.org Artificial Intelligence

This paper investigates the problem of certifying optimality for sparse generalized linear models (GLMs), where sparsity is enforced through an $\ell_0$ cardinality constraint. While branch-and-bound (BnB) frameworks can certify optimality by pruning nodes using dual bounds, existing methods for computing these bounds are either computationally intensive or exhibit slow convergence, limiting their scalability to large-scale problems. To address this challenge, we propose a first-order proximal gradient algorithm designed to solve the perspective relaxation of the problem within a BnB framework. Specifically, we formulate the relaxed problem as a composite optimization problem and demonstrate that the proximal operator of the non-smooth component can be computed exactly in log-linear time complexity, eliminating the need to solve a computationally expensive second-order cone program. Furthermore, we introduce a simple restart strategy that enhances convergence speed while maintaining low per-iteration complexity. Extensive experiments on synthetic and real-world datasets show that our approach significantly accelerates dual bound computations and is highly effective in providing optimality certificates for large-scale problems.


Abduction of Domain Relationships from Data for VQA

arXiv.org Artificial Intelligence

Visual Question Answering (VQA) is an AI task designed to reason about images. Commonly, the image is transformed into a "scene graph" that enables the deployment of more formal reasoning tools. For example, in recent work, both the scene graph and associated query were represented as an ASP Program [2, 1]; however, notably the scene graph itself only contains information about the scene, but lacks commonsense knowledge - in particular, knowledge about the domains of attributes identified by the scene. Existing work to address this shortcoming relies on leveraging large commonsense knowledge graphs for obtaining domain knowledge [5, 6, 7]. However, such approaches require the ability to accurately align the language of the knowledge graph with the language of the scene graph. Further, for some applications, this does not guarantee that the aligned knowledge graph will necessarily improve VQA performance (e.g., if domain knowledge relevant to the queries is not possessed in the knowledge graph). In this paper, we provide an orthogonal and complementary approach that leverages logical representations of the scene graph and query to abduce domain relationships that can improve query answering performance. We frame the abduction problem and provide a simple algorithm that provides a valid solution. We also provide an implementation and show on a standard dataset that we can improve question answering accuracy from 59.98% to 81.01%, and provide comparable results with few historical examples.


Object-Centric Latent Action Learning

arXiv.org Artificial Intelligence

Leveraging vast amounts of internet video data for Embodied AI is currently bottle-necked by the lack of action annotations and the presence of actioncorrelated distractors. We propose a novel object-centric latent action learning approach, based on VideoSaur and LAPO, that employs self-supervised decomposition of scenes into object representations and annotates video data with proxyaction labels. This method effectively disentangles causal agent-object interactions from irrelevant background noise and reduces the performance degradation of latent action learning approaches caused by distractors. Our preliminary experiments with the Distracting Control Suite show that latent action pretraining based on object decompositions improve the quality of inferred latent actions by x2.7 and efficiency of downstream fine-tuning with a small set of labeled actions, increasing return by x2.6 on average. In recent years, the scaling of model and data sizes has led to the creation of powerful and general foundation models (Bommasani et al., 2021) that have enabled many breakthroughs in understanding and generation of natural language (Achiam et al., 2023; Brown et al., 2020) and images (Dehghani et al., 2023; Radford et al., 2021). On the other hand, the field of embodied AI has generally remained behind in terms of generalization and emergent abilities (Guruprasad et al., 2024), being mostly limited by the lack of diverse data for pre-training (Lin et al., 2024). The vast amount of video data on the Internet, covering a wide variety of human-related activities, can potentially fulfill the current data needs (McCarthy et al., 2024).


Exact Leader Estimation: A New Approach for Distributed Differentiation

arXiv.org Artificial Intelligence

A novel strategy aimed at cooperatively differentiating a signal among multiple interacting agents is introduced, where none of the agents needs to know which agent is the leader, i.e. the one producing the signal to be differentiated. Every agent communicates only a scalar variable to its neighbors; except for the leader, all agents execute the same algorithm. The proposed strategy can effectively obtain derivatives up to arbitrary $m$-th order in a finite time under the assumption that the $(m+1)$-th derivative is bounded. The strategy borrows some of its structure from the celebrated homogeneous robust exact differentiator by A. Levant, inheriting its exact differentiation capability and robustness to measurement noise. Hence, the proposed strategy can be said to perform robust exact distributed differentiation. In addition, and for the first time in the distributed leader-observer literature, sampled-data communication and bounded measurement noise are considered, and corresponding steady-state worst-case accuracy bounds are derived. The effectiveness of the proposed strategy is verified numerically for second- and fourth-order systems, i.e., for estimating derivatives of up to first and third order, respectively.


Perch like a bird: bio-inspired optimal maneuvers and nonlinear control for Flapping-Wing Unmanned Aerial Vehicles

arXiv.org Artificial Intelligence

This research endeavors to design the perching maneuver and control in ornithopter robots. By analyzing the dynamic interplay between the robot's flight dynamics, feedback loops, and the environmental constraints, we aim to advance our understanding of the perching maneuver, drawing parallels to biological systems. Inspired by the elegant control strategies observed in avian flight, we develop an optimal maneuver and a corresponding controller to achieve stable perching. The maneuver consists of a deceleration and a rapid pitch-up (vertical turn), which arises from analytically solving the optimization problem of minimal velocity at perch, subject to kinematic and dynamic constraints. The controller for the flapping frequency and tail symmetric deflection is nonlinear and adaptive, ensuring robustly stable perching. Indeed, such adaptive behavior in a sense incorporates homeostatic principles of cybernetics into the control system, enhancing the robot's ability to adapt to unexpected disturbances and maintain a stable posture during the perching maneuver. The resulting autonomous perching maneuvers -- closed-loop descent and turn -- , have been verified and validated, demonstrating excellent agreement with real bird perching trajectories reported in the literature. These findings lay the theoretical groundwork for the development of future prototypes that better imitate the skillful perching maneuvers of birds.


INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages

arXiv.org Artificial Intelligence

Slot-filling and intent detection are well-established tasks in Conversational AI. However, current large-scale benchmarks for these tasks often exclude evaluations of low-resource languages and rely on translations from English benchmarks, thereby predominantly reflecting Western-centric concepts. In this paper, we introduce Injongo -- a multicultural, open-source benchmark dataset for 16 African languages with utterances generated by native speakers across diverse domains, including banking, travel, home, and dining. Through extensive experiments, we benchmark the fine-tuning multilingual transformer models and the prompting large language models (LLMs), and show the advantage of leveraging African-cultural utterances over Western-centric utterances for improving cross-lingual transfer from the English language. Experimental results reveal that current LLMs struggle with the slot-filling task, with GPT-4o achieving an average performance of 26 F1-score. In contrast, intent detection performance is notably better, with an average accuracy of 70.6%, though it still falls behind the fine-tuning baselines. Compared to the English language, GPT-4o and fine-tuning baselines perform similarly on intent detection, achieving an accuracy of approximately 81%. Our findings suggest that the performance of LLMs is still behind for many low-resource African languages, and more work is needed to further improve their downstream performance.


Relating Answer Set Programming and Many-sorted Logics for Formal Verification

arXiv.org Artificial Intelligence

Answer Set Programming (ASP) is an important logic programming paradigm within the field of Knowledge Representation and Reasoning. As a concise, human-readable, declarative language, ASP is an excellent tool for developing trustworthy (especially, artificially intelligent) software systems. However, formally verifying ASP programs offers some unique challenges, such as 1. a lack of modularity (the meanings of rules are difficult to define in isolation from the enclosing program), 2. the ground-and-solve semantics (the meanings of rules are dependent on the input data with which the program is grounded), and 3. limitations of existing tools. My research agenda has been focused on addressing these three issues with the intention of making ASP verification an accessible, routine task that is regularly performed alongside program development. In this vein, I have investigated alternative semantics for ASP based on translations into the logic of here-and-there and many-sorted first-order logic. These semantics promote a modular understanding of logic programs, bypass grounding, and enable us to use automated theorem provers to automatically verify properties of programs.