Search
Nonconvex Matrix Factorization is Geodesically Convex: Global Landscape Analysis for Fixed-rank Matrix Optimization From a Riemannian Perspective
Luo, Yuetian, Trillos, Nicolas Garcia
We study a general matrix optimization problem with a fixed-rank positive semidefinite (PSD) constraint. We perform the Burer-Monteiro factorization and consider a particular Riemannian quotient geometry in a search space that has a total space equipped with the Euclidean metric. When the original objective f satisfies standard restricted strong convexity and smoothness properties, we characterize the global landscape of the factorized objective under the Riemannian quotient geometry. We show the entire search space can be divided into three regions: (R1) the region near the target parameter of interest, where the factorized objective is geodesically strongly convex and smooth; (R2) the region containing neighborhoods of all strict saddle points; (R3) the remaining regions, where the factorized objective has a large gradient. To our best knowledge, this is the first global landscape analysis of the Burer-Monteiro factorized objective under the Riemannian quotient geometry. Our results provide a fully geometric explanation for the superior performance of vanilla gradient descent under the Burer-Monteiro factorization. When f satisfies a weaker restricted strict convexity property, we show there exists a neighborhood near local minimizers such that the factorized objective is geodesically convex. To prove our results we provide a comprehensive landscape analysis of a matrix factorization problem with a least squares objective, which serves as a critical bridge. Our conclusions are also based on a result of independent interest stating that the geodesic ball centered at Y with a radius 1/3 of the least singular value of Y is a geodesically convex set under the Riemannian quotient geometry, which as a corollary, also implies a quantitative bound of the convexity radius in the Bures-Wasserstein space. The convexity radius obtained is sharp up to constants.
Incorporating Multi-armed Bandit with Local Search for MaxSAT
Zheng, Jiongzhi, He, Kun, Zhou, Jianrong, Jin, Yan, Li, Chu-Min, Manyà, Felip
As an optimization extension of the famous Boolean Satisfiability (SAT) decision problem, the Maximum Satisfiability (MaxSAT) problem aims at finding a complete assignment of the Boolean variables to satisfy as many clauses as possible in a given propositional formula in Conjunctive Normal Form (CNF) [1]. Partial MaxSAT (PMS) is a variant of MaxSAT where the clauses are divided into hard and soft. PMS aims at maximizing the number of satisfied soft clauses with the constraint that all the hard clauses must be satisfied. Associating a positive weight to each soft clause in PMS results in Weighted PMS (WPMS), whose goal is to maximize the total weight of satisfied soft clauses with the same constraint of PMS that all the hard clauses must be satisfied. Both PMS and WPMS, denoted as (W)PMS, have many practical applications such as planning [2], combinatorial testing [3], group testing [4], timetabling [5], etc. Existing solvers for (W)PMS can be divided into complete and incomplete according to whether their solutions have optimality guarantees.
Encoder-Decoder Model for Suffix Prediction in Predictive Monitoring
Rama-Maneiro, Efrén, Monteagudo-Lago, Pablo, Vidal, Juan C., Lama, Manuel
Predictive monitoring is a subfield of process mining that aims to predict how a running case will unfold in the future. One of its main challenges is forecasting the sequence of activities that will occur from a given point in time -- suffix prediction -- . Most approaches to the suffix prediction problem learn to predict the suffix by learning how to predict the next activity only, not learning from the whole suffix during the training phase. This paper proposes a novel architecture based on an encoder-decoder model with an attention mechanism that decouples the representation learning of the prefixes from the inference phase, predicting only the activities of the suffix. During the inference phase, this architecture is extended with a heuristic search algorithm that improves the selection of the activity for each index of the suffix. Our approach has been tested using 12 public event logs against 6 different state-of-the-art proposals, showing that it significantly outperforms these proposals.
Peano: Learning Formal Mathematical Reasoning
Poesia, Gabriel, Goodman, Noah D.
General mathematical reasoning is computationally undecidable, but humans routinely solve new problems. Moreover, discoveries developed over centuries are taught to subsequent generations quickly. What structure enables this, and how might that inform automated mathematical reasoning? We posit that central to both puzzles is the structure of procedural abstractions underlying mathematics. We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform. To define a computational foundation, we introduce Peano, a theorem-proving environment where the set of valid actions at any point is finite. We use Peano to formalize introductory algebra problems and axioms, obtaining well-defined search problems. We observe existing reinforcement learning methods for symbolic reasoning to be insufficient to solve harder problems. Adding the ability to induce reusable abstractions ("tactics") from its own solutions allows an agent to make steady progress, solving all problems. Furthermore, these abstractions induce an order to the problems, seen at random during training. The recovered order has significant agreement with the expert-designed Khan Academy curriculum, and second-generation agents trained on the recovered curriculum learn significantly faster. These results illustrate the synergistic role of abstractions and curricula in the cultural transmission of mathematics.
Minimax AUC Fairness: Efficient Algorithm with Provable Convergence
Yang, Zhenhuan, Ko, Yan Lok, Varshney, Kush R., Ying, Yiming
The use of machine learning models in consequential decision making often exacerbates societal inequity, in particular yielding disparate impact on members of marginalized groups defined by race and gender. The area under the ROC curve (AUC) is widely used to evaluate the performance of a scoring function in machine learning, but is studied in algorithmic fairness less than other performance metrics. Due to the pairwise nature of the AUC, defining an AUC-based group fairness metric is pairwise-dependent and may involve both \emph{intra-group} and \emph{inter-group} AUCs. Importantly, considering only one category of AUCs is not sufficient to mitigate unfairness in AUC optimization. In this paper, we propose a minimax learning and bias mitigation framework that incorporates both intra-group and inter-group AUCs while maintaining utility. Based on this Rawlsian framework, we design an efficient stochastic optimization algorithm and prove its convergence to the minimum group-level AUC. We conduct numerical experiments on both synthetic and real-world datasets to validate the effectiveness of the minimax framework and the proposed optimization algorithm.
GraphPNAS: Learning Distribution of Good Neural Architectures via Deep Graph Generative Models
Li, Muchen, Liu, Jeffrey Yunfan, Sigal, Leonid, Liao, Renjie
Neural architectures can be naturally viewed as computational graphs. Motivated by this perspective, we, in this paper, study neural architecture search (NAS) through the lens of learning random graph models. In contrast to existing NAS methods which largely focus on searching for a single best architecture, i.e, point estimation, we propose GraphPNAS a deep graph generative model that learns a distribution of well-performing architectures. Relying on graph neural networks (GNNs), our GraphPNAS can better capture topologies of good neural architectures and relations between operators therein. Moreover, our graph generator leads to a learnable probabilistic search method that is more flexible and efficient than the commonly used RNN generator and random search methods. Finally, we learn our generator via an efficient reinforcement learning formulation for NAS. To assess the effectiveness of our GraphPNAS, we conduct extensive experiments on three search spaces, including the challenging RandWire on TinyImageNet, ENAS on CIFAR10, and NAS-Bench-101/201. The complexity of RandWire is significantly larger than other search spaces in the literature. We show that our proposed graph generator consistently outperforms RNN-based one and achieves better or comparable performances than state-of-the-art NAS methods.
Learning to design without prior data: Discovering generalizable design strategies using deep learning and tree search
Raina, Ayush, Cagan, Jonathan, McComb, Christopher
ABSTRACT Building an AI agent that can design on its own has been a goal since the 1980s. Recently, deep learning has shown the ability to learn from large-scale data, enabling significant advances in data-driven design. However, learning over prior data limits us only to solve problems that have been solved before and biases data-driven learning towards existing solutions. The ultimate goal for a design agent is the ability to learn generalizable design behavior in a problem space without having seen it before. We introduce a self-learning agent framework in this work that achieves this goal. This framework integrates a deep policy network with a novel tree search algorithm, where the tree search explores the problem space, and the deep policy network leverages self-generated experience to guide the search further. This framework first demonstrates an ability to discover high-performing generative strategies without any prior data, and second, it illustrates a zero-shot generalization of generative strategies across various unseen boundary conditions. This work evaluates the effectiveness and versatility of the framework by solving multiple versions of two engineering design problems without retraining. Overall, this paper presents a methodology to self-learn high-performing and generalizable problem-solving behavior in an arbitrary problem space, circumventing the needs for expert data, existing solutions, and problem-specific learning. Published in ASME Journal of Mechanical Design. Published online November 11 2022. INTRODUCTION: Solving design problems is one of the most ubiquitous processes in engineering and arguably the most challenging [1,2]. The design automation research paradigm aims to augment the continually evolving design solving process by enabling machines to engage in design. Despite decades of research in the area, modern-day automated design synthesis is still heavily guided by handcrafted rules and prior expert data, making it susceptible to non-generalizability and errors resulting from human bias [3,4]. Developing a design agent that can learn from scratch is still a long-standing challenge. This paper addresses this challenge by introducing a generalizable design agent framework that integrates newly developed tree search and deep learning methods. The tree search enables exploration and information gathering, while the deep learning representation helps the agent leverage self-generated experience. Together, these methods provide a symbiotic integration of decision-making methods to effectively explore and learn in unknown design problem spaces. Learning problem-solving strategies from scratch has been achieved in multiple domains [5-8]. Some of these methods use a dual-process decision-making framework which is often compared to the slow and fast thinking ideology [9]s.
Maximizing Submodular or Monotone Approximately Submodular Functions by Multi-objective Evolutionary Algorithms
Qian, Chao, Yu, Yang, Tang, Ke, Yao, Xin, Zhou, Zhi-Hua
Evolutionary algorithms (EAs) are a kind of nature-inspired general-purpose optimization algorithm, and have shown empirically good performance in solving various real-word optimization problems. During the past two decades, promising results on the running time analysis (one essential theoretical aspect) of EAs have been obtained, while most of them focused on isolated combinatorial optimization problems, which do not reflect the general-purpose nature of EAs. To provide a general theoretical explanation of the behavior of EAs, it is desirable to study their performance on general classes of combinatorial optimization problems. To the best of our knowledge, the only result towards this direction is the provably good approximation guarantees of EAs for the problem class of maximizing monotone submodular functions with matroid constraints. The aim of this work is to contribute to this line of research. Considering that many combinatorial optimization problems involve non-monotone or non-submodular objective functions, we study the general problem classes, maximizing submodular functions with/without a size constraint and maximizing monotone approximately submodular functions with a size constraint. We prove that a simple multi-objective EA called GSEMO-C can generally achieve good approximation guarantees in polynomial expected running time.
How Binary Search Trees work part2(Advanced Algorithms)
Abstract: Motivated by recent developments in optical switching and reconfigurable network design, we study dynamic binary search trees (BSTs) in the matching model. In the classical dynamic BST model, the cost of both link traversal and basic reconfiguration (rotation) is O(1). However, in the matching model, the BST is defined by two optical switches (that represent two matchings in an abstract way), and each switch (or matching) reconfiguration cost is α while a link traversal cost is still O(1). In this work, we propose Arithmetic BST (A-BST), a simple dynamic BST algorithm that is based on dynamic Shannon-Fano-Elias coding, and show that A-BST is statically optimal for sequences of length Ω(nαlogα) where n is the number of nodes (keys) in the tree. Abstract: The dynamic optimality conjecture, postulating the existence of an O(1)-competitive online algorithm for binary search trees (BSTs), is among the most fundamental open problems in dynamic data structures.
Knowledge Retrieval Using Functional Object-Oriented Networks
Robotic agents often perform tasks that transform sets of input objects into output objects through functional motions. This work describes the FOON knowledge representation model for robotic tasks. We define the structure and key components of FOON and describe the process we followed to create our universal FOON dataset. The paper describes various search algorithms and heuristic functions we used to search for objects within the FOON. We performed multiple searches on our universal FOON using these algorithms and discussed the effectiveness of each algorithm.