Search
On the Detection of Reviewer-Author Collusion Rings From Paper Bidding
Jecmen, Steven, Shah, Nihar B., Fang, Fei, Akoglu, Leman
A major threat to the peer-review systems of computer science conferences is the existence of "collusion rings" between reviewers. In such collusion rings, reviewers who have also submitted their own papers to the conference work together to manipulate the conference's paper assignment, with the aim of being assigned to review each other's papers. The most straightforward way that colluding reviewers can manipulate the paper assignment is by indicating their interest in each other's papers through strategic paper bidding. One potential approach to solve this important problem would be to detect the colluding reviewers from their manipulated bids, after which the conference can take appropriate action. While prior work has has developed effective techniques to detect other kinds of fraud, no research has yet established that detecting collusion rings is even possible. In this work, we tackle the question of whether it is feasible to detect collusion rings from the paper bidding. To answer this question, we conduct empirical analysis of two realistic conference bidding datasets, including evaluations of existing algorithms for fraud detection in other applications. We find that collusion rings can achieve considerable success at manipulating the paper assignment while remaining hidden from detection: for example, in one dataset, undetected colluders are able to achieve assignment to up to 30% of the papers authored by other colluders. In addition, when 10 colluders bid on all of each other's papers, no detection algorithm outputs a group of reviewers with more than 31% overlap with the true colluders. These results suggest that collusion cannot be effectively detected from the bidding, demonstrating the need to develop more complex detection algorithms that leverage additional metadata.
Understanding fitness landscapes in morpho-evolution via local optima networks
Thomson, Sarah L., Goff, Léni K. Le, Hart, Emma, Buchanan, Edgar
Morpho-evolution (ME) refers to the simultaneous optimisation of a robot's design and controller to maximise performance given a task and environment. Many genetic encodings have been proposed which are capable of representing design and control. Previous research has provided empirical comparisons between encodings in terms of their performance with respect to an objective function and the diversity of designs that are evaluated, however there has been no attempt to explain the observed findings. We address this by applying Local Optima Network (LON) analysis to investigate the structure of the fitness landscapes induced by three different encodings when evolving a robot for a locomotion task, shedding new light on the ease by which different fitness landscapes can be traversed by a search process. This is the first time LON analysis has been applied in the field of ME despite its popularity in combinatorial optimisation domains; the findings will facilitate design of new algorithms or operators that are customised to ME landscapes in the future.
On the Transit Obfuscation Problem
Takahashi, Hideaki, Fukunaga, Alex
Concealing an intermediate point on a route or visible from a route is an important goal in some transportation and surveillance scenarios. This paper studies the Transit Obfuscation Problem, the problem of traveling from some start location to an end location while "covering" a specific transit point that needs to be concealed from adversaries. We propose the notion of transit anonymity, a quantitative guarantee of the anonymity of a specific transit point, even with a powerful adversary with full knowledge of the path planning algorithm. We propose and evaluate planning/search algorithms that satisfy this anonymity criterion.
Bootstrapping Developmental AIs: From Simple Competences to Intelligent Human-Compatible AIs
Developmental AI is a bootstrapping approach where embodied AIs start with innate competences and learn by interacting with the world. They develop abilities in small steps along a bio-inspired trajectory. However, developmental AIs have not yet reached the abilities of young children. In contrast, mainstream approaches for creating AIs have led to valuable AI systems and impressive feats. These approaches include deep learning and generative approaches (e.g., large language models) and manually constructed symbolic approaches. Manually constructed AIs are brittle even in circumscribed domains. Generative AIs are helpful on average, but they can make strange mistakes and not notice them. They sometimes lack common sense and social alignment. This position paper lays out prospects, gaps, and challenges for augmenting AI mainstream approaches with developmental AI. The ambition is to create data-rich experientially based foundation models and human-compatible, resilient, and trustworthy AIs. This research aims to produce AIs that learn to communicate, establish common ground, read critically, consider the provenance of information, test hypotheses, and collaborate. A virtuous multidisciplinary research cycle has led to developmental AIs with capabilities for multimodal perception, object recognition, and manipulation. Computational models for hierarchical planning, abstraction discovery, curiosity, and language acquisition exist but need to be adapted to an embodied learning approach. They need to bridge competence gaps involving nonverbal communication, speech, reading, and writing. Aspirationally, developmental AIs would learn, share what they learn, and collaborate to achieve high standards. The approach would make the creation of AIs more democratic, enabling more people to train, test, build on, and replicate AIs.
Rethinking the Capacity of Graph Neural Networks for Branching Strategy
Chen, Ziang, Liu, Jialin, Chen, Xiaohan, Wang, Xinshang, Yin, Wotao
Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB) scores that provide an efficient strategy in the branch-and-bound algorithm. Although message-passing GNN (MP-GNN), as the simplest GNN structure, is frequently employed in the existing literature to learn SB scores, we prove a fundamental limitation in its expressive power -- there exist two MILP instances with different SB scores that cannot be distinguished by any MP-GNN, regardless of the number of parameters. In addition, we establish a universal approximation theorem for another GNN structure called the second-order folklore GNN (2-FGNN). We show that for any data distribution over MILPs, there always exists a 2-FGNN that can approximate the SB score with arbitrarily high accuracy and arbitrarily high probability. A small-scale numerical experiment is conducted to directly validate our theoretical findings.
A Functional Analysis Approach to Symbolic Regression
Antonov, Kirill, Kalkreuth, Roman, Yang, Kaifeng, Bäck, Thomas, van Stein, Niki, Kononova, Anna V
Symbolic regression (SR) poses a significant challenge for randomized search heuristics due to its reliance on the synthesis of expressions for input-output mappings. Although traditional genetic programming (GP) algorithms have achieved success in various domains, they exhibit limited performance when tree-based representations are used for SR. To address these limitations, we introduce a novel SR approach called Fourier Tree Growing (FTG) that draws insights from functional analysis. This new perspective enables us to perform optimization directly in a different space, thus avoiding intricate symbolic expressions. Our proposed algorithm exhibits significant performance improvements over traditional GP methods on a range of classical one-dimensional benchmarking problems. To identify and explain limiting factors of GP and FTG, we perform experiments on a large-scale polynomials benchmark with high-order polynomials up to degree 100. To the best of the authors' knowledge, this work represents the pioneering application of functional analysis in addressing SR problems. The superior performance of the proposed algorithm and insights into the limitations of GP open the way for further advancing GP for SR and related areas of explainable machine learning.
Adaptive multi-gradient methods for quasiconvex vector optimization and applications to multi-task learning
Minh, Nguyen Anh, Muu, Le Dung, Thang, Tran Ngoc
We present an adaptive step-size method, which does not include line-search techniques, for solving a wide class of nonconvex multiobjective programming problems on an unbounded constraint set. We also prove convergence of a general approach under modest assumptions. More specifically, the convexity criterion might not be satisfied by the objective function. Unlike descent line-search algorithms, it does not require an initial step-size to be determined by a previously determined Lipschitz constant. The process's primary characteristic is its gradual step-size reduction up until a predetermined condition is met. It can be specifically applied to offer an innovative multi-gradient projection method for unbounded constrained optimization issues. Preliminary findings from a few computational examples confirm the accuracy of the strategy. We apply the proposed technique to some multi-task learning experiments to show its efficacy for large-scale challenges.
Moco: A Learnable Meta Optimizer for Combinatorial Optimization
Dernedde, Tim, Thyssens, Daniela, Dittrich, Sören, Stubbemann, Maximilian, Schmidt-Thieme, Lars
Relevant combinatorial optimization problems (COPs) are often NP-hard. While they have been tackled mainly via handcrafted heuristics in the past, advances in neural networks have motivated the development of general methods to learn heuristics from data. Many approaches utilize a neural network to directly construct a solution, but are limited in further improving based on already constructed solutions at inference time. Our approach, Moco, learns a graph neural network that updates the solution construction procedure based on features extracted from the current search state. This meta training procedure targets the overall best solution found during the search procedure given information such as the search budget. This allows Moco to adapt to varying circumstances such as different computational budgets. Moco is a fully learnable meta optimizer that does not utilize any problem specific local search or decomposition. We test Moco on the Traveling Salesman Problem (TSP) and Maximum Independent Set (MIS) and show that it outperforms other approaches on MIS and is overall competitive on the TSP, especially outperforming related approaches, partially even if they use additional local search.
How Much is Unseen Depends Chiefly on Information About the Seen
It might seem counter-intuitive at first: We find that, in expectation, the proportion of data points in an unknown population-that belong to classes that do not appear in the training data-is almost entirely determined by the number $f_k$ of classes that do appear in the training data the same number of times. While in theory we show that the difference of the induced estimator decays exponentially in the size of the sample, in practice the high variance prevents us from using it directly for an estimator of the sample coverage. However, our precise characterization of the dependency between $f_k$'s induces a large search space of different representations of the expected value, which can be deterministically instantiated as estimators. Hence, we turn to optimization and develop a genetic algorithm that, given only the sample, searches for an estimator with minimal mean-squared error (MSE). In our experiments, our genetic algorithm discovers estimators that have a substantially smaller MSE than the state-of-the-art Good-Turing estimator. This holds for over 96% of runs when there are at least as many samples as classes. Our estimators' MSE is roughly 80% of the Good-Turing estimator's.
Machine Learning Augmented Branch and Bound for Mixed Integer Linear Programming
Scavuzzo, Lara, Aardal, Karen, Lodi, Andrea, Yorke-Smith, Neil
Mixed Integer Linear Programming (MILP) is a pillar of mathematical optimization that offers a powerful modeling language for a wide range of applications. During the past decades, enormous algorithmic progress has been made in solving MILPs, and many commercial and academic software packages exist. Nevertheless, the availability of data, both from problem instances and from solvers, and the desire to solve new problems and larger (real-life) instances, trigger the need for continuing algorithmic development. MILP solvers use branch and bound as their main component. In recent years, there has been an explosive development in the use of machine learning algorithms for enhancing all main tasks involved in the branch-and-bound algorithm, such as primal heuristics, branching, cutting planes, node selection and solver configuration decisions. This paper presents a survey of such approaches, addressing the vision of integration of machine learning and mathematical optimization as complementary technologies, and how this integration can benefit MILP solving. In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency. We also address how to represent MILPs in the context of applying learning algorithms, MILP benchmarks and software.