Search
Dynamic Surrogate Switching: Sample-Efficient Search for Factorization Machine Configurations in Online Recommendations
Škrlj, Blaž, Schwartz, Adi, Ferlež, Jure, Kopič, Davorin, Ziporin, Naama
Hyperparameter optimization is the process of identifying the appropriate hyperparameter configuration of a given machine learning model with regard to a given learning task. For smaller data sets, an exhaustive search is possible; However, when the data size and model complexity increase, the number of configuration evaluations becomes the main computational bottleneck. A promising paradigm for tackling this type of problem is surrogate-based optimization. The main idea underlying this paradigm considers an incrementally updated model of the relation between the hyperparameter space and the output (target) space; the data for this model are obtained by evaluating the main learning engine, which is, for example, a factorization machine-based model. By learning to approximate the hyperparameter-target relation, the surrogate (machine learning) model can be used to score large amounts of hyperparameter configurations, exploring parts of the configuration space beyond the reach of direct machine learning engine evaluation. Commonly, a surrogate is selected prior to optimization initialization and remains the same during the search. We investigated whether dynamic switching of surrogates during the optimization itself is a sensible idea of practical relevance for selecting the most appropriate factorization machine-based models for large-scale online recommendation. We conducted benchmarks on data sets containing hundreds of millions of instances against established baselines such as Random Forest- and Gaussian process-based surrogates. The results indicate that surrogate switching can offer good performance while considering fewer learning engine evaluations.
Dataset Summarization by K Principal Concepts
We propose the new task of K principal concept identification for dataset summarizarion. The objective is to find a set of K concepts that best explain the variation within the dataset. Concepts are high-level human interpretable terms such as "tiger", "kayaking" or "happy". The K concepts are selected from a (potentially long) input list of candidates, which we denote the concept-bank. The concept-bank may be taken from a generic dictionary or constructed by task-specific prior knowledge. An image-language embedding method (e.g. CLIP) is used to map the images and the concept-bank into a shared feature space. To select the K concepts that best explain the data, we formulate our problem as a K-uncapacitated facility location problem. An efficient optimization technique is used to scale the local search algorithm to very large concept-banks. The output of our method is a set of K principal concepts that summarize the dataset. Our approach provides a more explicit summary in comparison to selecting K representative images, which are often ambiguous. As a further application of our method, the K principal concepts can be used to classify the dataset into K groups. Extensive experiments demonstrate the efficacy of our approach.
Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search
Bai, Yunsheng, Xu, Derek, Sun, Yizhou, Wang, Wei
Recent advances have shown the success of using reinforcement learning and search to solve NP-hard graph-related tasks, such as Traveling Salesman Optimization, Graph Edit Distance computation, etc. However, it remains unclear how one can efficiently and accurately detect the occurrences of a small query graph in a large target graph, which is a core operation in graph database search, biomedical analysis, social group finding, etc. This task is called Subgraph Matching which essentially performs subgraph isomorphism check between a query graph and a large target graph. One promising approach to this classical problem is the "learning-to-search" paradigm, where a reinforcement learning (RL) agent is designed with a learned policy to guide a search algorithm to quickly find the solution without any solved instances for supervision. However, for the specific task of Subgraph Matching, though the query graph is usually small given by the user as input, the target graph is often orders-of-magnitude larger. It poses challenges to the neural network design and can lead to solution and reward sparsity. S with two innovations to tackle the challenges: (1) A novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs at each search state; (2) A novel look-ahead loss function for training the policy network. S can significantly improve the subgraph matching performance. With the growing amount of graph data that naturally arises in many domains, solving graph-related tasks via machine learning has gained increasing attention.
Ashish Patel on LinkedIn: #datascience #machinelearning #data
At the point when we stall out throughout everyday life, we attempt to foster a few standards to help us. Essentially, when a model of data scientists doesn't work as expected, they search for this sort of harmonization (Fine-Tuning Process). In my experience with data science, random searches, grid searches, and cross-validation procedures have been demonstrated to be the most successful methods of fine-tuning hyperparameters when I was a new bee and had very little experience with them at the time. I had very few techniques to work with. But now that things have changed, we have a wide range of methods to modify your model using the current framework support, such as Hyperopt, Optuna, NNI, and DEAP, that Python has built-in, so we will see the key ideas from the book that help you to tune your model with modern approaches.
SoftTreeMax: Policy Gradient with Tree Search
Dalal, Gal, Hallak, Assaf, Mannor, Shie, Chechik, Gal
Policy-gradient methods are widely used for learning control policies. They can be easily distributed to multiple workers and reach state-of-the-art results in many domains. Unfortunately, they exhibit large variance and subsequently suffer from high-sample complexity since they aggregate gradients over entire trajectories. At the other extreme, planning methods, like tree search, optimize the policy using single-step transitions that consider future lookahead. These approaches have been mainly considered for value-based algorithms. Planning-based algorithms require a forward model and are computationally intensive at each step, but are more sample efficient. In this work, we introduce SoftTreeMax, the first approach that integrates tree-search into policy gradient. Traditionally, gradients are computed for single state-action pairs. Instead, our tree-based policy structure leverages all gradients at the tree leaves in each environment step. This allows us to reduce the variance of gradients by three orders of magnitude and to benefit from better sample complexity compared with standard policy gradient. On Atari, SoftTreeMax demonstrates up to 5x better performance in faster run-time compared with distributed PPO.
Lazy Probabilistic Roadmaps Revisited
Ramirez, Miquel, Selvaratnam, Daniel, Manzie, Chris
This paper describes a revision of the classic Lazy Probabilistic Roadmaps algorithm (Lazy PRM), that results from pairing PRM and a novel Branch-and-Cut (BC) algorithm. Cuts are dynamically generated constraints that are imposed on minimum cost paths over the geometric graphs selected by PRM. Cuts eliminate paths that cannot be mapped into smooth plans that satisfy suitably defined kinematic constraints. We generate candidate smooth plans by fitting splines to vertices in minimum-cost path. Plans are validated with a recently proposed algorithm that maps them into finite traces, without need to choose a fixed discretization step. Trace elements exactly describe when plans cross constraint boundaries modulo arithmetic precision. We evaluate several planners using our methods over the recently proposed BARN benchmark, and we report evidence of the scalability of our approach.
MARS: A Motif-based Autoregressive Model for Retrosynthesis Prediction
Liu, Jiahan, Yan, Chaochao, Yu, Yang, Lu, Chan, Huang, Junzhou, Ou-Yang, Le, Zhao, Peilin
Retrosynthesis is a major task for drug discovery. It is formulated as a graph-generating problem by many existing approaches. Specifically, these methods firstly identify the reaction center, and break target molecule accordingly to generate synthons. Reactants are generated by either adding atoms sequentially to synthon graphs or directly adding proper leaving groups. However, both two strategies suffer since adding atoms results in a long prediction sequence which increases generation difficulty, while adding leaving groups can only consider the ones in the training set which results in poor generalization. In this paper, we propose a novel end-to-end graph generation model for retrosynthesis prediction, which sequentially identifies the reaction center, generates the synthons, and adds motifs to the synthons to generate reactants. Since chemically meaningful motifs are bigger than atoms and smaller than leaving groups, our method enjoys lower prediction complexity than adding atoms and better generalization than adding leaving groups. Experiments on a benchmark dataset show that the proposed model significantly outperforms previous state-of-the-art algorithms.
Lossy compression of matrices by black-box optimisation of mixed integer nonlinear programming
Kadowaki, Tadashi, Ambai, Mitsuru
In edge computing, suppressing data size is a challenge for machine learning models that perform complex tasks such as autonomous driving, in which computational resources (speed, memory size and power) are limited. Efficient lossy compression of matrix data has been introduced by decomposing it into the product of an integer and real matrices. However, its optimisation is difficult as it requires simultaneous optimisation of an integer and real variables. In this paper, we improve this optimisation by utilising recently developed black-box optimisation (BBO) algorithms with an Ising solver for integer variables. In addition, the algorithm can be used to solve mixed-integer programming problems that are linear and non-linear in terms of real and integer variables, respectively. The differences between the choice of Ising solvers (simulated annealing, quantum annealing and simulated quenching) and the strategies of the BBO algorithms (BOCS, FMQA and their variations) are discussed for further development of the BBO techniques.
Efficient Non-Parametric Optimizer Search for Diverse Tasks
Wang, Ruochen, Xiong, Yuanhao, Cheng, Minhao, Hsieh, Cho-Jui
Efficient and automated design of optimizers plays a crucial role in full-stack AutoML systems. However, prior methods in optimizer search are often limited by their scalability, generability, or sample efficiency. With the goal of democratizing research and application of optimizer search, we present the first efficient, scalable and generalizable framework that can directly search on the tasks of interest. We first observe that optimizer updates are fundamentally mathematical expressions applied to the gradient. Inspired by the innate tree structure of the underlying math expressions, we re-arrange the space of optimizers into a super-tree, where each path encodes an optimizer. This way, optimizer search can be naturally formulated as a path-finding problem, allowing a variety of well-established tree traversal methods to be used as the search algorithm. We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection that leverage the characteristics of optimizer update rules to further boost the sample efficiency. We provide a diverse set of tasks to benchmark our algorithm and demonstrate that, with only 128 evaluations, the proposed framework can discover optimizers that surpass both human-designed counterparts and prior optimizer search methods.
Explainable Global Fairness Verification of Tree-Based Classifiers
Calzavara, Stefano, Cazzaro, Lorenzo, Lucchese, Claudio, Marcuzzi, Federico
We present a new approach to the global fairness verification of tree-based classifiers. Given a tree-based classifier and a set of sensitive features potentially leading to discrimination, our analysis synthesizes sufficient conditions for fairness, expressed as a set of traditional propositional logic formulas, which are readily understandable by human experts. The verified fairness guarantees are global, in that the formulas predicate over all the possible inputs of the classifier, rather than just a few specific test instances. Our analysis is formally proved both sound and complete. Experimental results on public datasets show that the analysis is precise, explainable to human experts and efficient enough for practical adoption.