Search
Reviews: Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods
This paper considers graph-structured signal denoising problem. The particular structure enforced involves a total variation type penalty involving higher order discrete derivatives. Optimal rates for penalized least-squares estimator is established and minimax lower bound is established. For the grid filtering case the upper bounds are established under an assumed conjecture. The paper is well written.
Reviews: Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search
This paper proposes a method combining graph neural networks and guided tree search to tackle combinatorial optimization problems. The authors focus on the Maximum Independent Set (MIS), the Minimum Vertex Cover (MVC), Maximal Clique (MC) and Satisfiability which are all reduced to MIS. Given a MIS instance represented by a graph, the method consists in using the predictions of a graph convolutional neural (GCN) network, which is trained with binary cross-entropy to predict whether a vertex is in the maximum independent set, as input for a greedy search algorithm. The greedy search algorithm considers vertices in descending order of the GCN outputs, adding them to the independent set if doing so doesn't violate the independence assumption. Once this isn't possible, the process is repeated on the residual graph until termination. The method is further augmented with diversity, allowing the GCN to output different probability maps, and a tree search (partial solutions are added to a queue from which they will be randomly sampled to be further constructed).
Reviews: Dynamic Importance Sampling for Anytime Bounds of the Partition Function
The authors present a method for estimating the partition function that alternates between performing heuristic search and importance sampling. The estimated value of the partition function is confidence bounded and improves with additional computation time. Experimental evaluation is performed on problems from 2006 and 2008 UAI competitions by comparing confidence bounds of the proposed method against previous work that uses only sampling [15] or search [16]. The proposed method significantly outperforms sampling on certain problems and search on others, while maintaining performance roughly comparable to or better than either sampling or search across all problems. The originality of the work is a bit limited, as it is a fairly straightforward (but novel as far as I know) combination of two recent papers, references [15] and [16].
Reviews: Minimax Estimation of Bandable Precision Matrices
Summary The paper establishes the first theoretical guarantees for statistical estimation of bandable precision matrices. The authors propose an estimator for the precision matrix given a set of independent observations, and show that this estimator is minimax optimal. Interestingly the minimax rate for estimating banded precision matrices is shown to be equal to the corresponding rate for estimating banded covariance matrices. The upper bound follows by studying the behavior of inverses of small blocks along the diagonal of the covariance matrix together with classical random matrix theory results. The lower bound follows by constructing two specially defined subparameter spaces and applying testing arguments.
Reviews: Single-Agent Policy Tree Search With Guarantees
The paper presents two planning algorithms based on tree search. The novelty of these algorithms is the use of a policy based criterion instead of a standard heuristic to guide the search. The first algorithm (LevinTS) uses a cost function d(n)/\pi(n) which ensures nodes are expanded in a best-first order. This allows the authors to derive an upper bound to the number of expansions performed before to reach a target node. The second algorithm (LubyTS) is a randomized algorithm based on the sampling of trajectories.
Reviews: Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates
In the traditional setting of global optimization, the algorithmic goal is to design an adaptive procedure to find an approximate (in this case, global) optimum of an unknown function f, given access to noisy evaluations at feasible points. This notion of complexity might be too rough to understand the difficulty of the problem, so it is proposed to study the local minimax complexity, when the algorithm may have additional information on how close is the objective to a fixed function f_0. In practice, the algorithm doesn't need to know f_0, but this parameterization serves as an instance-dependent notion of complexity. This paper additionally considers how the regularity of the function (given by its Holder continuity) improves the complexity, as it is well-known that for Lipschitz functions adaptivity does not help. The main contributions of the paper can be summarized as follows: 1.
Reviews: A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem
This paper is about contextual linear bandits. The authors are trying to understand the phenomenon observed in practice that exploration seems much less important than the theory usually suggests. To make a start understanding this they analyze the greedy policy that chooses the arm for which the inner product with the least squares estimator and the context is largest. With no further assumptions this leads to linear regret. The authors make a'perturbation' assumption where the contexts are first chosen by an adversary and then perturbed using zero-mean noise. Under carefully chosen assumptions on the noise they show that the greedy algorithm now enjoys O(sqrt(T)) regret. The two standard settings are studied.
Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search
Hartmann, Jakob, He, Guoliang, Yoneki, Eiko
The real-world effectiveness of deep neural networks often depends on their latency, thereby necessitating optimization techniques that can reduce a model's inference time while preserving its performance. One popular approach is to sequentially rewrite the input computation graph into an equivalent but faster one by replacing individual subgraphs. This approach gives rise to the so-called phase-ordering problem in which the application of one rewrite rule can eliminate the possibility to apply an even better one later on. Recent work has shown that equality saturation, a technique from compiler optimization, can mitigate this issue by first building an intermediate representation (IR) that efficiently stores multiple optimized versions of the input program before extracting the best solution in a second step. In practice, however, memory constraints prevent the IR from capturing all optimized versions and thus reintroduce the phase-ordering problem in the construction phase. In this paper, we present a tensor graph rewriting approach that uses Monte Carlo tree search to build superior IRs by identifying the most promising rewrite rules. We also introduce a novel extraction algorithm that can provide fast and accurate runtime estimates of tensor programs represented in an IR. Our approach improves the inference speedup of neural networks by up to 11% compared to existing methods.
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search
Tang, Wei, Duan, Yiheng, Kharkov, Yaroslav, Fakoor, Rasool, Kessler, Eric, Shi, Yunong
Quantum computers have the potential to outperform classical computers in important tasks such as optimization and number factoring. They are characterized by limited connectivity, which necessitates the routing of their computational bits, known as qubits, to specific locations during program execution to carry out quantum operations. Traditionally, the NP-hard optimization problem of minimizing the routing overhead has been addressed through sub-optimal rule-based routing techniques with inherent human biases embedded within the cost function design. This paper introduces a solution that integrates Monte Carlo Tree Search (MCTS) with Reinforcement Learning (RL). Our RL-based router, called AlphaRouter, outperforms the current state-of-the-art routing methods and generates quantum programs with up to $20\%$ less routing overhead, thus significantly enhancing the overall efficiency and feasibility of quantum computing.
Collaboration! Towards Robust Neural Methods for Routing Problems
Zhou, Jianan, Wu, Yaoxin, Cao, Zhiguang, Song, Wen, Zhang, Jie, Shen, Zhiqi
Despite enjoying desirable efficiency and reduced reliance on domain expertise, existing neural methods for vehicle routing problems (VRPs) suffer from severe robustness issues -- their performance significantly deteriorates on clean instances with crafted perturbations. To enhance robustness, we propose an ensemble-based Collaborative Neural Framework (CNF) w.r.t. the defense of neural VRP methods, which is crucial yet underexplored in the literature. Given a neural VRP method, we adversarially train multiple models in a collaborative manner to synergistically promote robustness against attacks, while boosting standard generalization on clean instances. A neural router is designed to adeptly distribute training instances among models, enhancing overall load balancing and collaborative efficacy. Extensive experiments verify the effectiveness and versatility of CNF in defending against various attacks across different neural VRP methods. Notably, our approach also achieves impressive out-of-distribution generalization on benchmark instances.