Overview
A Survey on Knowledge Graph Structure and Knowledge Graph Embeddings
Sardina, Jeffrey, Kelleher, John D., O'Sullivan, Declan
Knowledge Graphs (KGs) and their machine learning counterpart, Knowledge Graph Embedding Models (KGEMs), have seen ever-increasing use in a wide variety of academic and applied settings. In particular, KGEMs are typically applied to KGs to solve the link prediction task; i.e. to predict new facts in the domain of a KG based on existing, observed facts. While this approach has been shown substantial power in many end-use cases, it remains incompletely characterised in terms of how KGEMs react differently to KG structure. This is of particular concern in light of recent studies showing that KG structure can be a significant source of bias as well as partially determinant of overall KGEM performance. This paper seeks to address this gap in the state-of-the-art. This paper provides, to the authors' knowledge, the first comprehensive survey exploring established relationships of Knowledge Graph Embedding Models and Graph structure in the literature. It is the hope of the authors that this work will inspire further studies in this area, and contribute to a more holistic understanding of KGs, KGEMs, and the link prediction task.
AI in the Cosmos
Artificial intelligence (AI) is revolutionizing research by enabling the efficient analysis of large datasets and the discovery of hidden patterns. In astrophysics, AI has become essential, transforming the classification of celestial sources, data modeling, and the interpretation of observations. In this review, I highlight examples of AI applications in astrophysics, including source classification, spectral energy distribution modeling, and discuss the advancements achievable through generative AI. However, the use of AI introduces challenges, including biases, errors, and the "black box" nature of AI models, which must be resolved before their application. These issues can be addressed through the concept of Human-Guided AI (HG-AI), which integrates human expertise and domain-specific knowledge into AI applications. This approach aims to ensure that AI is applied in a robust, interpretable, and ethical manner, leading to deeper insights and fostering scientific excellence.
A Review of Fairness and A Practical Guide to Selecting Context-Appropriate Fairness Metrics in Machine Learning
Barr, Caleb J. S., Erdelyi, Olivia, Docherty, Paul D., Grace, Randolph C.
Recent regulatory proposals for artificial intelligence emphasize fairness requirements for machine learning models. However, precisely defining the appropriate measure of fairness is challenging due to philosophical, cultural and political contexts. Biases can infiltrate machine learning models in complex ways depending on the model's context, rendering a single common metric of fairness insufficient. This ambiguity highlights the need for criteria to guide the selection of context-aware measures, an issue of increasing importance given the proliferation of ever tighter regulatory requirements. To address this, we developed a flowchart to guide the selection of contextually appropriate fairness measures. Twelve criteria were used to formulate the flowchart. This included consideration of model assessment criteria, model selection criteria, and data bias. We also review fairness literature in the context of machine learning and link it to core regulatory instruments to assist policymakers, AI developers, researchers, and other stakeholders in appropriately addressing fairness concerns and complying with relevant regulatory requirements.
MalMixer: Few-Shot Malware Classification with Retrieval-Augmented Semi-Supervised Learning
Li, Jiliang, Zhang, Yifan, Huang, Yu, Leach, Kevin
Recent growth and proliferation of malware has tested practitioners' ability to promptly classify new samples according to malware families. In contrast to labor-intensive reverse engineering efforts, machine learning approaches have demonstrated increased speed and accuracy. However, most existing deep-learning malware family classifiers must be calibrated using a large number of samples that are painstakingly manually analyzed before training. Furthermore, as novel malware samples arise that are beyond the scope of the training set, additional reverse engineering effort must be employed to update the training set. The sheer volume of new samples found in the wild creates substantial pressure on practitioners' ability to reverse engineer enough malware to adequately train modern classifiers. In this paper, we present MalMixer, a malware family classifier using semi-supervised learning that achieves high accuracy with sparse training data. We present a novel domain-knowledge-aware technique for augmenting malware feature representations, enhancing few-shot performance of semi-supervised malware family classification. We show that MalMixer achieves state-of-the-art performance in few-shot malware family classification settings. Our research confirms the feasibility and effectiveness of lightweight, domain-knowledge-aware feature augmentation methods and highlights the capabilities of similar semi-supervised classifiers in addressing malware classification issues.
Solving Robust Markov Decision Processes: Generic, Reliable, Efficient
Meggendorfer, Tobias, Weininger, Maximilian, Wienhöft, Patrick
Markov decision processes (MDP) are a well-established model for sequential decision-making in the presence of probabilities. In robust MDP (RMDP), every action is associated with an uncertainty set of probability distributions, modelling that transition probabilities are not known precisely. Based on the known theoretical connection to stochastic games, we provide a framework for solving RMDPs that is generic, reliable, and efficient. It is *generic* both with respect to the model, allowing for a wide range of uncertainty sets, including but not limited to intervals, $L^1$- or $L^2$-balls, and polytopes; and with respect to the objective, including long-run average reward, undiscounted total reward, and stochastic shortest path. It is *reliable*, as our approach not only converges in the limit, but provides precision guarantees at any time during the computation. It is *efficient* because -- in contrast to state-of-the-art approaches -- it avoids explicitly constructing the underlying stochastic game. Consequently, our prototype implementation outperforms existing tools by several orders of magnitude and can solve RMDPs with a million states in under a minute.
A systematic review of norm emergence in multi-agent systems
Cordova, Carmengelys, Taverner, Joaquin, Del Val, Elena, Argente, Estefania
Multi-agent systems (MAS) have gained relevance in the field of artificial intelligence by offering tools for modelling complex environments where autonomous agents interact to achieve common or individual goals. In these systems, norms emerge as a fundamental component to regulate the behaviour of agents, promoting cooperation, coordination and conflict resolution. This article presents a systematic review, following the PRISMA method, on the emergence of norms in MAS, exploring the main mechanisms and factors that influence this process. Sociological, structural, emotional and cognitive aspects that facilitate the creation, propagation and reinforcement of norms are addressed. The findings highlight the crucial role of social network topology, as well as the importance of emotions and shared values in the adoption and maintenance of norms. Furthermore, opportunities are identified for future research that more explicitly integrates emotional and ethical dynamics in the design of adaptive normative systems. This work provides a comprehensive overview of the current state of research on norm emergence in MAS, serving as a basis for advancing the development of more efficient and flexible systems in artificial and real-world contexts.
Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities
Pistillo, Matteo, Stix, Charlotte
There is a growing consensus that information is the "lifeblood of good governance" (Kolt et al., 2024) and that information sharing should be one of the "natural initial target[s]" of AI governance (Bommasani et al., 2024). Up-to-date and reliable information about AI systems' capabilities and how capabilities will develop in the future can help developers, governments, and researchers advance safety evaluations (Frontier Model Forum, 2024), develop best practices (UK DSIT, 2023), and respond effectively to the new risks posed by frontier AI (Kolt et al., 2024). Information sharing also supports regulatory visibility (Anderljung et al., 2023) and can thus enable better-informed AI governance (O'Brien et al., 2024). Further, access to knowledge about AI systems' potential risks allows AI systems claims to be scrutinized more effectively (Brundage et al., 2020). By contrast, information asymmetries could lead regulators to miscalibrated over-regulation--or under-regulation--of AI (Ball & Kokotajlo, 2024) and could contribute to the "pacing problem," a situation in which government oversight consistently lags behind technology development (Marchant et al., 2011). In short, there is a strong case for information sharing being one "key to making AI go well" (Ball & Kokotajlo, 2024). The Frontier AI Safety Commitments ("FAISC") are an important step towards more comprehensive information sharing by AI developers.
Physics Instrument Design with Reinforcement Learning
Qasim, Shah Rukh, Owen, Patrick, Serra, Nicola
We present a case for the use of Reinforcement Learning (RL) for the design of physics instrument as an alternative to gradient-based instrument-optimization methods. It's applicability is demonstrated using two empirical studies. One is longitudinal segmentation of calorimeters and the second is both transverse segmentation as well longitudinal placement of trackers in a spectrometer. Based on these experiments, we propose an alternative approach that offers unique advantages over differentiable programming and surrogate-based differentiable design optimization methods. First, Reinforcement Learning (RL) algorithms possess inherent exploratory capabilities, which help mitigate the risk of convergence to local optima. Second, this approach eliminates the necessity of constraining the design to a predefined detector model with fixed parameters. Instead, it allows for the flexible placement of a variable number of detector components and facilitates discrete decision-making. We then discuss the road map of how this idea can be extended into designing very complex instruments. The presented study sets the stage for a novel framework in physics instrument design, offering a scalable and efficient framework that can be pivotal for future projects such as the Future Circular Collider (FCC), where most optimized detectors are essential for exploring physics at unprecedented energy scales.
You Name It, I Run It: An LLM Agent to Execute Tests of Arbitrary Projects
Bouzenia, Islem, Pradel, Michael
The ability to execute the test suite of a project is essential in many scenarios, e.g., to assess code quality and code coverage, to validate code changes made by developers or automated tools, and to ensure compatibility with dependencies. Despite its importance, executing the test suite of a project can be challenging in practice because different projects use different programming languages, software ecosystems, build systems, testing frameworks, and other tools. These challenges make it difficult to create a reliable, universal test execution method that works across different projects. This paper presents ExecutionAgent, an automated technique that installs arbitrary projects, configures them to run test cases, and produces project-specific scripts to reproduce the setup. Inspired by the way a human developer would address this task, our approach is a large language model-based agent that autonomously executes commands and interacts with the host system. The agent uses meta-prompting to gather guidelines on the latest technologies related to the given project, and it iteratively refines its process based on feedback from the previous steps. Our evaluation applies ExecutionAgent to 50 open-source projects that use 14 different programming languages and many different build and testing tools. The approach successfully executes the test suites of 33/55 projects, while matching the test results of ground truth test suite executions with a deviation of only 7.5\%. These results improve over the best previously available technique by 6.6x. The costs imposed by the approach are reasonable, with an execution time of 74 minutes and LLM costs of 0.16 dollars, on average per project. We envision ExecutionAgent to serve as a valuable tool for developers, automated programming tools, and researchers that need to execute tests across a wide variety of projects.
Advances in Transformers for Robotic Applications: A Review
Sanghai, Nikunj, Brown, Nik Bear
The introduction of Transformers architecture has brought about significant breakthroughs in Deep Learning (DL), particularly within Natural Language Processing (NLP). Since their inception, Transformers have outperformed many traditional neural network architectures due to their "self-attention" mechanism and their scalability across various applications. In this paper, we cover the use of Transformers in Robotics. We go through recent advances and trends in Transformer architectures and examine their integration into robotic perception, planning, and control for autonomous systems. Furthermore, we review past work and recent research on use of Transformers in Robotics as pre-trained foundation models and integration of Transformers with Deep Reinforcement Learning (DRL) for autonomous systems. We discuss how different Transformer variants are being adapted in robotics for reliable planning and perception, increasing human-robot interaction, long-horizon decision-making, and generalization. Finally, we address limitations and challenges, offering insight and suggestions for future research directions.