Goto

Collaborating Authors

 Search


Structure and Reduction of MCTS for Explainable-AI

arXiv.org Artificial Intelligence

Complex sequential decision-making planning problems, covering infinite states' space have been shown to be solvable by AlphaZero type of algorithms. Such an approach that trains a neural model while simulating projection of futures with a Monte Carlo Tree Search algorithm were shown to be applicable to real life planning problems. As such, engineers and users interacting with the resulting policy of behavior might benefit from obtaining automated explanations about these planners' decisions offline or online. This paper focuses on the information within the Monte Carlo Tree Search data structure. Given its construction, this information contains much of the reasoning of the sequential decision-making algorithm and is essential for its explainability. We show novel methods using information theoretic tools for the simplification and reduction of the Monte Carlo Tree Search and the extraction of information. Such information can be directly used for the construction of human understandable explanations. We show that basic explainability quantities can be calculated with limited additional computational cost, as an integrated part of the Monte Carlo Tree Search construction process. We focus on the theoretical and algorithmic aspects and provide examples of how the methods presented here can be used in the construction of human understandable explanations.


Eliminating Backdoors in Neural Code Models via Trigger Inversion

arXiv.org Artificial Intelligence

Neural code models (NCMs) have been widely used for addressing various code understanding tasks, such as defect detection and clone detection. However, numerous recent studies reveal that such models are vulnerable to backdoor attacks. Backdoored NCMs function normally on normal code snippets, but exhibit adversary-expected behavior on poisoned code snippets injected with the adversary-crafted trigger. It poses a significant security threat. For example, a backdoored defect detection model may misclassify user-submitted defective code as non-defective. If this insecure code is then integrated into critical systems, like autonomous driving systems, it could lead to life safety. However, there is an urgent need for effective defenses against backdoor attacks targeting NCMs. To address this issue, in this paper, we innovatively propose a backdoor defense technique based on trigger inversion, called EliBadCode. EliBadCode first filters the model vocabulary for trigger tokens to reduce the search space for trigger inversion, thereby enhancing the efficiency of the trigger inversion. Then, EliBadCode introduces a sample-specific trigger position identification method, which can reduce the interference of adversarial perturbations for subsequent trigger inversion, thereby producing effective inverted triggers efficiently. Subsequently, EliBadCode employs a Greedy Coordinate Gradient algorithm to optimize the inverted trigger and designs a trigger anchoring method to purify the inverted trigger. Finally, EliBadCode eliminates backdoors through model unlearning. We evaluate the effectiveness of EliBadCode in eliminating backdoor attacks against multiple NCMs used for three safety-critical code understanding tasks. The results demonstrate that EliBadCode can effectively eliminate backdoors while having minimal adverse effects on the normal functionality of the model.


Decomposition Strategies and Multi-shot ASP Solving for Job-shop Scheduling

arXiv.org Artificial Intelligence

The Job-shop Scheduling Problem (JSP) is a well-known and challenging combinatorial optimization problem in which tasks sharing a machine are to be arranged in a sequence such that encompassing jobs can be completed as early as possible. In this paper, we investigate problem decomposition into time windows whose operations can be successively scheduled and optimized by means of multi-shot Answer Set Programming (ASP) solving. From a computational perspective, decomposition aims to split highly complex scheduling tasks into better manageable subproblems with a balanced number of operations such that good-quality or even optimal partial solutions can be reliably found in a small fraction of runtime. We devise and investigate a variety of decomposition strategies in terms of the number and size of time windows as well as heuristics for choosing their operations. Moreover, we incorporate time window overlapping and compression techniques into the iterative scheduling process to counteract optimization limitations due to the restriction to window-wise partial schedules. Our experiments on different JSP benchmark sets show that successive optimization by multi-shot ASP solving leads to substantially better schedules within tight runtime limits than single-shot optimization on the full problem. In particular, we find that decomposing initial solutions obtained with proficient heuristic methods into time windows leads to improved solution quality.


Hierarchical Neural Constructive Solver for Real-world TSP Scenarios

arXiv.org Artificial Intelligence

Existing neural constructive solvers for routing problems have predominantly employed transformer architectures, conceptualizing the route construction as a set-to-sequence learning task. However, their efficacy has primarily been demonstrated on entirely random problem instances that inadequately capture real-world scenarios. In this paper, we introduce realistic Traveling Salesman Problem (TSP) scenarios relevant to industrial settings and derive the following insights: (1) The optimal next node (or city) to visit often lies within proximity to the current node, suggesting the potential benefits of biasing choices based on current locations. (2) Effectively solving the TSP requires robust tracking of unvisited nodes and warrants succinct grouping strategies. Building upon these insights, we propose integrating a learnable choice layer inspired by Hypernetworks to prioritize choices based on the current location, and a learnable approximate clustering algorithm inspired by the Expectation-Maximization algorithm to facilitate grouping the unvisited cities. Together, these two contributions form a hierarchical approach towards solving the realistic TSP by considering both immediate local neighbourhoods and learning an intermediate set of node representations. Our hierarchical approach yields superior performance compared to both classical and recent transformer models, showcasing the efficacy of the key designs.


Convergence Conditions for Stochastic Line Search Based Optimization of Over-parametrized Models

arXiv.org Artificial Intelligence

In this paper, we deal with algorithms to solve the finite-sum problems related to fitting over-parametrized models, that typically satisfy the interpolation condition. In particular, we focus on approaches based on stochastic line searches and employing general search directions. We define conditions on the sequence of search directions that guarantee finite termination and bounds for the backtracking procedure. Moreover, we shed light on the additional property of directions needed to prove fast (linear) convergence of the general class of algorithms when applied to PL functions in the interpolation regime. From the point of view of algorithms design, the proposed analysis identifies safeguarding conditions that could be employed in relevant algorithmic framework. In particular, it could be of interest to integrate stochastic line searches within momentum, conjugate gradient or adaptive preconditioning methods.


Reinforcement learning-based architecture search for quantum machine learning

arXiv.org Artificial Intelligence

Quantum machine learning models use encoding circuits to map data into a quantum Hilbert space. While it is well known that the architecture of these circuits significantly influences core properties of the resulting model, they are often chosen heuristically. In this work, we present a novel approach using reinforcement learning techniques to generate problem-specific encoding circuits to improve the performance of quantum machine learning models. By specifically using a model-based reinforcement learning algorithm, we reduce the number of necessary circuit evaluations during the search, providing a sample-efficient framework. In contrast to previous search algorithms, our method uses a layered circuit structure that significantly reduces the search space. Additionally, our approach can account for multiple objectives such as solution quality, hardware restrictions and circuit depth. We benchmark our tailored circuits against various reference models, including models with problem-agnostic circuits and classical models. Our results highlight the effectiveness of problem-specific encoding circuits in enhancing QML model performance.


Perfect Information Monte Carlo with Postponing Reasoning

arXiv.org Artificial Intelligence

Imperfect information games, such as Bridge and Skat, present challenges due to state-space explosion and hidden information, posing formidable obstacles for search algorithms. Determinization-based algorithms offer a resolution by sampling hidden information and solving the game in a perfect information setting, facilitating rapid and effective action estimation. However, transitioning to perfect information introduces challenges, notably one called strategy fusion.This research introduces `Extended Perfect Information Monte Carlo' (EPIMC), an online algorithm inspired by the state-of-the-art determinization-based approach Perfect Information Monte Carlo (PIMC). EPIMC enhances the capabilities of PIMC by postponing the perfect information resolution, reducing alleviating issues related to strategy fusion. However, the decision to postpone the leaf evaluator introduces novel considerations, such as the interplay between prior levels of reasoning and the newly deferred resolution. In our empirical analysis, we investigate the performance of EPIMC across a range of games, with a particular focus on those characterized by varying degrees of strategy fusion. Our results demonstrate notable performance enhancements, particularly in games where strategy fusion significantly impacts gameplay. Furthermore, our research contributes to the theoretical foundation of determinization-based algorithms addressing challenges associated with strategy fusion.%, thereby enhancing our understanding of these algorithms within the context of imperfect information game scenarios.


Optimal and efficient text counterfactuals using Graph Neural Networks

arXiv.org Artificial Intelligence

As NLP models become increasingly integral to decision-making processes, the need for explainability and interpretability has become paramount. In this work, we propose a framework that achieves the aforementioned by generating semantically edited inputs, known as counterfactual interventions, which change the model prediction, thus providing a form of counterfactual explanations for the model. We test our framework on two NLP tasks - binary sentiment classification and topic classification - and show that the generated edits are contrastive, fluent and minimal, while the whole process remains significantly faster that other state-of-the-art counterfactual editors.


MARCO: A Memory-Augmented Reinforcement Framework for Combinatorial Optimization

arXiv.org Artificial Intelligence

Neural Combinatorial Optimization (NCO) is an emerging domain where deep learning techniques are employed to address combinatorial optimization problems as a standalone solver. Despite their potential, existing NCO methods often suffer from inefficient search space exploration, frequently leading to local optima entrapment or redundant exploration of previously visited states. This paper introduces a versatile framework, referred to as Memory-Augmented Reinforcement for Combinatorial Optimization (MARCO), that can be used to enhance both constructive and improvement methods in NCO through an innovative memory module. MARCO stores data collected throughout the optimization trajectory and retrieves contextually relevant information at each state. This way, the search is guided by two competing criteria: making the best decision in terms of the quality of the solution and avoiding revisiting already explored solutions. This approach promotes a more efficient use of the available optimization budget. Moreover, thanks to the parallel nature of NCO models, several search threads can run simultaneously, all sharing the same memory module, enabling an efficient collaborative exploration. Empirical evaluations, carried out on the maximum cut, maximum independent set and travelling salesman problems, reveal that the memory module effectively increases the exploration, enabling the model to discover diverse, higher-quality solutions. MARCO achieves good performance in a low computational cost, establishing a promising new direction in the field of NCO.


Jacta: A Versatile Planner for Learning Dexterous and Whole-body Manipulation

arXiv.org Artificial Intelligence

Robotic manipulation is challenging due to discontinuous dynamics, as well as high-dimensional state and action spaces. Data-driven approaches that succeed in manipulation tasks require large amounts of data and expert demonstrations, typically from humans. Existing manipulation planners are restricted to specific systems and often depend on specialized algorithms for using demonstration. Therefore, we introduce a flexible motion planner tailored to dexterous and whole-body manipulation tasks. Our planner creates readily usable demonstrations for reinforcement learning algorithms, eliminating the need for additional training pipeline complexities. With this approach, we can efficiently learn policies for complex manipulation tasks, where traditional reinforcement learning alone only makes little progress. Furthermore, we demonstrate that learned policies are transferable to real robotic systems for solving complex dexterous manipulation tasks.