Boy, 11, makes portrait of world leader from 1,764 Rubik's Cubes, sets sights on breaking world record
A schoolboy has completed one of his largest portraits yet by using over 1,500 Rubik's Cubes to resemble the prime minister of the United Kingdom. Henil Soni is an 11-year-old from Harwich, Essex, England, who began his infatuation with the handheld puzzle when he was just five years old, according to SWNS, the British news service. Soni, who can now solve the well-known puzzle in mere seconds, is taking his talents to a larger scale by making portraits out of the colors on the cube.
Fast and Efficient Local Search for Genetic Programming Based Loss Function Learning
Raymond, Christian, Chen, Qi, Xue, Bing, Zhang, Mengjie
In this paper, we build on the topic of loss function learning, an emergent meta-learning paradigm that aims to learn loss functions that significantly improve the performance of the models trained under them. Specifically, we propose a new meta-learning framework for task and model-agnostic loss function learning via a hybrid search approach. The framework first uses genetic programming to find a set of symbolic loss functions. Second, the set of learned loss functions is parameterized and optimized via unrolled differentiation. The versatility and performance of the proposed framework are empirically validated on a diverse set of supervised learning tasks. Results show that the learned loss functions bring improved convergence, sample efficiency, and inference performance on tabular, computer vision, and natural language processing problems, using a variety of task-specific neural network architectures.
Deep Confident Steps to New Pockets: Strategies for Docking Generalization
Corso, Gabriele, Deng, Arthur, Fry, Benjamin, Polizzi, Nicholas, Barzilay, Regina, Jaakkola, Tommi
Accurate blind docking has the potential to lead to new biological breakthroughs, but for this promise to be realized, docking methods must generalize well across the proteome. Existing benchmarks, however, fail to rigorously assess generalizability. We carefully analyze the scaling laws of ML-based docking and show that, by scaling data and model size, as well as integrating synthetic data strategies, we are able to significantly increase the generalization capacity and set new state-of-the-art performance across benchmarks. Understanding how small molecules and proteins interact, a task known as molecular docking, is at the heart of drug discovery. The conventional use of docking in the industry has led the field to focus on finding binding conformations when restricting the search to predefined pockets and evaluating these on a relatively limited set of protein families of commercial interest. Accurate blind docking, without a predefined pocket, would have much broader uses. For example, it would help us understand the mechanism of action of new drugs to accelerate their development [Schottlender et al., 2022], predict adverse side-effects of drugs before clinical trials [Luo et al., 2018], and discover the function of the vast number of enzymes and membrane proteins whose biology we do not yet know [Yi et al., 2015]. All these tasks critically require docking methods to generalize beyond the relatively small class of well-studied proteins for which we have many available structures. Existing docking benchmarks are largely built on collections of similar binding modes and fail to rigorously assess the ability of docking methods to generalize across the proteome. Gathering diverse data for protein-ligand interactions is challenging because binding pockets tend to be evolutionarily well-conserved due to their critical biological functions. Therefore, a large proportion of known interactions fall into a relatively small set of common binding modes.
The results show that increasing both data and model size can give significant generalization improvements.
Learning to Deliver: a Foundation Model for the Montreal Capacitated Vehicle Routing Problem
Chin, Samuel J. K., Winkenbach, Matthias, Srivastava, Akash
In this paper, we present the Foundation Model for the Montreal Capacitated Vehicle Routing Problem (FM-MCVRP), a novel Deep Learning (DL) model that approximates high-quality solutions to a variant of the Capacitated Vehicle Routing Problem (CVRP) that characterizes many real-world applications. The so-called Montreal Capacitated Vehicle Routing Problem (MCVRP), first formally described by Bengio et al. (2021), is defined on a fixed and finite graph, which is analogous to a city. Each MCVRP instance is essentially the sub-graph connecting a randomly sampled subset of the nodes in the fixed graph, which represent a set of potential addresses in a real-world delivery problem on a given day. Our work exploits this problem structure to frame the MCVRP as an analogous Natural Language Processing (NLP) task. Specifically, we leverage a Transformer architecture embedded in a Large Language Model (LLM) framework to train our model in a supervised manner on computationally inexpensive, sub-optimal MCVRP solutions obtained algorithmically. Through comprehensive computational experiments, we show that FM-MCVRP produces better MCVRP solutions than the training data and generalizes to larger problem instances not seen during training. Even when compared to near-optimal solutions from state-of-the-art heuristics, FM-MCVRP yields competitive results despite being trained on inferior data. For instance, for 400-customer problems, FM-MCVRP solutions on average fall within 2% of the benchmark. Our results further demonstrate that, unlike prior works in the literature, FM-MCVRP is a unified model that performs consistently and reliably on a range of problem instance sizes and parameter values such as the vehicle capacity.
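The problem structure described above can be illustrated with a minimal Python sketch. It assumes a hypothetical city of randomly placed nodes with node 0 as the depot. The helper names (`make_city`, `sample_instance`, `routes_to_tokens`) and the token encoding are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import Dict, List, Tuple

DEPOT = 0  # assumption: node 0 of the fixed graph acts as the depot


def make_city(num_nodes: int, seed: int) -> Dict[int, Tuple[float, float]]:
    """Build a hypothetical fixed graph: node id -> (x, y) coordinates."""
    rng = random.Random(seed)
    return {i: (rng.random(), rng.random()) for i in range(num_nodes)}


def sample_instance(city: Dict[int, Tuple[float, float]],
                    num_customers: int,
                    rng: random.Random) -> List[int]:
    """One instance = a random subset of the fixed graph's non-depot nodes."""
    candidates = [node for node in city if node != DEPOT]
    return sorted(rng.sample(candidates, num_customers))


def routes_to_tokens(routes: List[List[int]]) -> List[int]:
    """Flatten vehicle routes into one sequence, with the depot as separator.

    Because node ids come from a fixed, finite set, they can play the role
    of a vocabulary in a sequence model (an illustrative encoding only).
    """
    tokens = [DEPOT]
    for route in routes:
        tokens.extend(route)
        tokens.append(DEPOT)
    return tokens


city = make_city(num_nodes=50, seed=0)
instance = sample_instance(city, num_customers=10, rng=random.Random(1))
tokens = routes_to_tokens([[3, 7], [5]])  # two routes -> [0, 3, 7, 0, 5, 0]
```

Every instance draws its nodes from the same fixed graph, so the set of possible tokens stays the same from one day's delivery problem to the next.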
A Call for Clarity in Beam Search: How It Works and When It Stops
Kasai, Jungo, Sakaguchi, Keisuke, Bras, Ronan Le, Radev, Dragomir, Choi, Yejin, Smith, Noah A.
Text generation with beam search has proven successful in a wide range of applications. We point out that, though largely overlooked in the literature, the commonly-used implementation of beam decoding (e.g., Hugging Face Transformers and fairseq) uses a first come, first served heuristic: it keeps a set of already completed sequences over time steps and stops when the size of this set reaches the beam size. Based on this finding, we introduce a patience factor, a simple modification to this beam decoding implementation, that generalizes the stopping criterion and provides flexibility to the depth of search. Empirical results demonstrate that adjusting this patience factor improves decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs, with a negligible inference slowdown. Our approach only modifies one line of code and can thus be readily incorporated into any implementation. Further, we find that different versions of beam decoding result in large performance differences in summarization, demonstrating the need for clarity in specifying the beam search implementation in research work. Our code will be available upon publication.
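The stopping rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the Hugging Face or fairseq code: it omits length normalization, and `toy_model` is a hypothetical next-token distribution that depends only on prefix length. The only place the patience factor enters is the computation of `limit`.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"
Hypothesis = Tuple[float, Tuple[str, ...]]  # (sum of log-probs, tokens)


def beam_search(next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
                beam_size: int,
                patience: float = 1.0,
                max_steps: int = 20) -> List[Hypothesis]:
    """First come, first served beam search with a patience factor.

    Decoding stops once `beam_size * patience` sequences have finished;
    patience = 1.0 gives the plain first come, first served rule.
    """
    beam: List[Hypothesis] = [(0.0, ())]
    finished: List[Hypothesis] = []
    limit = max(1, math.ceil(beam_size * patience))  # the patience change
    for _ in range(max_steps):
        candidates = [(score + lp, prefix + (tok,))
                      for score, prefix in beam
                      for tok, lp in next_log_probs(prefix).items()]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = []
        for cand in candidates:
            if cand[1][-1] == EOS:
                finished.append(cand)
                if len(finished) >= limit:
                    return sorted(finished, key=lambda c: c[0], reverse=True)
            else:
                beam.append(cand)
            if len(beam) == beam_size:
                break
        if not beam:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical next-token distribution, chosen to expose the effect."""
    if len(prefix) == 0:
        probs = {"a": 0.85, EOS: 0.10, "b": 0.05}
    elif len(prefix) == 1:
        probs = {"a": 0.90, EOS: 0.06, "b": 0.04}
    else:
        probs = {EOS: 0.90, "a": 0.05, "b": 0.05}
    return {tok: math.log(p) for tok, p in probs.items()}


best_p1 = beam_search(toy_model, beam_size=2, patience=1.0)[0]
best_p2 = beam_search(toy_model, beam_size=2, patience=2.0)[0]
```

On this toy model with beam size 2, patience 1 stops after two low-probability sequences have finished and returns the one-token output `</s>`. Patience 2 searches one step deeper and returns the higher-scoring `a a </s>`.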
Automated Machine Learning for Multi-Label Classification
Automated machine learning (AutoML) aims to select and configure machine learning algorithms and combine them into machine learning pipelines tailored to a dataset at hand. For supervised learning tasks, most notably binary and multinomial classification, also known as single-label classification (SLC), such AutoML approaches have shown promising results. However, the task of multi-label classification (MLC), where data points are associated with a set of class labels instead of a single class label, has received much less attention so far. In the context of multi-label classification, the data-specific selection and configuration of multi-label classifiers are challenging even for experts in the field, as it is a high-dimensional optimization problem with multi-level hierarchical dependencies. While the space of machine learning pipelines is already huge for SLC, the MLC search space exceeds it by several orders of magnitude. In the first part of this thesis, we devise a novel AutoML approach for single-label classification tasks that optimizes pipelines of machine learning algorithms consisting of at most two algorithms. This approach is then extended, first to optimize pipelines of unlimited length and eventually to configure the complex hierarchical structures of multi-label classification methods. Furthermore, we investigate how well AutoML approaches that form the state of the art for single-label classification tasks scale with the increased problem complexity of AutoML for multi-label classification. In the second part, we explore how methods for SLC and MLC could be configured more flexibly to achieve better generalization performance and how to increase the efficiency of execution-based AutoML systems.
JCLEC-MO: a Java suite for solving many-objective optimization engineering problems
Ramírez, Aurora, Romero, José Raúl, García-Martínez, Carlos, Ventura, Sebastián
Hence, the use of efficient search methods has grown significantly in recent years, especially for engineering problems in which multiple objectives must be optimized simultaneously (Marler and Arora, 2004). A recurrent situation in engineering is the need to jointly optimize energy consumption, cost or time, among others. All these factors are of paramount concern to the expert and represent conflicting objectives, each one having a deep impact on the final solution (Marler and Arora, 2004). Initially applied to single-objective problems, metaheuristics like evolutionary algorithms (EAs) have been successfully applied to multi-objective problems (MOPs) in engineering, such as the design of efficient transport systems (Domínguez et al., 2014) or safe civil structures (Zavala et al., 2014). The presence of a large number of objectives has recently been pointed out as an intrinsic characteristic of engineering problems (Singh, 2016), for which the currently applied techniques might not be efficient enough. It is noteworthy that other communities are also demanding novel techniques to face increasingly complex problems, which has led to the appearance of the many-objective optimization approach (von Lücken et al., 2014; Li et al., 2015). This variant of the more general multi-objective optimization (MOO) is specifically devoted to overcoming the limits of existing algorithms on problems with 4 or more objectives, known as many-objective problems (MaOPs). Even though each metaheuristic follows different principles to conduct the search, their adaptations to deal with either MOPs or MaOPs share some similarities, such as the presence of new diversity preservation mechanisms or the use of indicators (Li et al., 2015; Mishra et al., 2015).
The resulting many-objective algorithms have proven successful in the engineering field too (Li and Hu, 2014; López-Jaimes and Coello Coello, 2014; Cheng et al., 2017), where specialized software tools have begun to appear (Hadka et al., 2015).
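The notion of conflicting objectives discussed above is usually formalized through Pareto dominance. The following Python sketch assumes all objectives are minimized and uses four made-up objective vectors. It is a generic illustration and does not reflect the JCLEC-MO API.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical designs scored on four objectives, e.g. energy, cost,
# time, and one further criterion; the values are illustrative only.
A = (1.0, 5.0, 3.0, 2.0)
B = (2.0, 4.0, 3.0, 2.0)
C = (2.0, 6.0, 4.0, 3.0)
D = (3.0, 3.0, 5.0, 1.0)
front = non_dominated([A, B, C, D])  # C is dominated by A and by B
```

With more objectives, a randomly chosen pair of solutions is less likely to be comparable under this relation, which is one reason dedicated diversity mechanisms and indicators are used for many-objective problems.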
Evolving machine learning workflows through interactive AutoML
Barbudo, Rafael, Ramírez, Aurora, Romero, José Raúl
Automatic workflow composition (AWC) is a relevant problem in automated machine learning (AutoML) that allows finding suitable sequences of preprocessing and prediction models together with their optimal hyperparameters. This problem can be solved using evolutionary algorithms and, in particular, grammar-guided genetic programming (G3P). Current G3P approaches to AWC define a fixed grammar that formally specifies how workflow elements can be combined and which algorithms can be included. In this paper we present an interactive G3P algorithm that allows users to dynamically modify the grammar to prune the search space and focus on their regions of interest. Our proposal is the first to combine the advantages of a G3P method with ideas from interactive optimisation and human-guided machine learning, an area little explored in the context of AutoML. To evaluate our approach, we present an experimental study in which 20 participants interact with the algorithm to evolve workflows according to their preferences. Our results confirm that the collaboration between the algorithm and humans allows us to find high-performance workflows in terms of accuracy that require less tuning time than those found without human intervention.
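To make the idea of pruning the search space by editing the grammar concrete, here is a small Python sketch. The grammar, the algorithm names and the helpers `prune` and `derive` are hypothetical and are not taken from the paper. The sketch shows only how removing productions restricts the workflows that can be derived, not the evolutionary search itself.

```python
import random
from typing import Dict, List, Set

Grammar = Dict[str, List[List[str]]]  # non-terminal -> list of productions

# Hypothetical workflow grammar: zero or more preprocessing steps
# followed by exactly one classifier.
GRAMMAR: Grammar = {
    "<workflow>": [["<classifier>"], ["<preprocess>", "<workflow>"]],
    "<preprocess>": [["scaler"], ["pca"], ["feature_selection"]],
    "<classifier>": [["decision_tree"], ["svm"], ["knn"]],
}


def prune(grammar: Grammar, banned: Set[str]) -> Grammar:
    """Drop every production that mentions a symbol the user has banned."""
    return {nt: [p for p in prods if not any(s in banned for s in p)]
            for nt, prods in grammar.items()}


def derive(grammar: Grammar, symbol: str, rng: random.Random,
           max_depth: int = 5) -> List[str]:
    """Randomly expand `symbol` into a sequence of terminals (a workflow)."""
    if symbol not in grammar:
        return [symbol]  # terminal: an algorithm name
    productions = grammar[symbol]
    if max_depth <= 0:
        # Prefer non-recursive productions so the derivation terminates.
        productions = [p for p in productions if symbol not in p] or productions
    chosen = rng.choice(productions)
    workflow: List[str] = []
    for s in chosen:
        workflow.extend(derive(grammar, s, rng, max_depth - 1))
    return workflow


pruned = prune(GRAMMAR, banned={"svm", "pca"})
workflows = [derive(pruned, "<workflow>", random.Random(seed))
             for seed in range(100)]
```

After pruning, no derived workflow can contain `svm` or `pca`, so a search guided by the grammar spends its evaluations only on the regions the user left open.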
Deep Sensitivity Analysis for Objective-Oriented Combinatorial Optimization
Gireesan, Ganga, Pillai, Nisha, Rothrock, Michael J, Nanduri, Bindu, Chen, Zhiqian, Ramkumar, Mahalingam
Pathogen control is a critical aspect of modern poultry farming, providing important benefits for both public health and productivity. Effective poultry management measures to reduce pathogen levels in poultry flocks promote food safety by lowering risks of food-borne illnesses. They also support animal health and welfare by preventing infectious diseases that can rapidly spread and impact flock growth, egg production, and overall health. This study frames the search for optimal management practices that minimize the presence of multiple pathogens as a combinatorial optimization problem. Specifically, we model the various possible combinations of management settings as a solution space that can be efficiently explored to identify configurations that optimally reduce pathogen levels. This design incorporates a neural network feedback-based method that combines feature explanations with global sensitivity analysis to ensure combinatorial optimization in multiobjective settings. Our preliminary experiments show promising results on two real-world agricultural datasets. While further validation is still needed, these early experimental findings demonstrate the potential of the model to derive targeted feature interactions that adaptively optimize pathogen control under varying real-world constraints.
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
Yasuda, Taisuke, Axiotis, Kyriakos, Fu, Gang, Bateni, MohammadHossein, Mirrokni, Vahab
Neural network pruning is a key technique towards engineering large yet scalable, interpretable, and generalizable models. Prior work on the subject has developed largely along two orthogonal directions: (1) differentiable pruning for efficiently and accurately scoring the importance of parameters, and (2) combinatorial optimization for efficiently searching over the space of sparse models. We unite the two approaches, both theoretically and empirically, to produce a coherent framework for structured neural network pruning in which differentiable pruning guides combinatorial optimization algorithms to select the most important sparse set of parameters. Theoretically, we show how many existing differentiable pruning techniques can be understood as nonconvex regularization for group sparse optimization, and prove that for a wide class of nonconvex regularizers, the global optimum is unique, group-sparse, and provably yields an approximate solution to a sparse convex optimization problem. The resulting algorithm that we propose, SequentialAttention++, advances the state of the art in large-scale neural network block-wise pruning tasks on the ImageNet and Criteo datasets.
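As a rough illustration of block-wise pruning, the Python sketch below scores each block of a weight matrix and keeps only the highest-scoring blocks. The score used here, the sum of squared weights, is a simple stand-in and is not the learned importance score of SequentialAttention++. The function names and the example matrix are hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def block_scores(W: Matrix, bs: int) -> Dict[Tuple[int, int], float]:
    """Score each bs-by-bs block by its sum of squared weights.

    This magnitude score is a stand-in for a learned importance score.
    """
    rows, cols = len(W), len(W[0])
    scores: Dict[Tuple[int, int], float] = {}
    for bi in range(0, rows, bs):
        for bj in range(0, cols, bs):
            scores[(bi // bs, bj // bs)] = sum(
                W[i][j] ** 2
                for i in range(bi, min(bi + bs, rows))
                for j in range(bj, min(bj + bs, cols)))
    return scores


def prune_blocks(W: Matrix, bs: int, keep: int) -> Matrix:
    """Zero out every block except the `keep` highest-scoring ones."""
    scores = block_scores(W, bs)
    kept = set(sorted(scores, key=lambda b: scores[b], reverse=True)[:keep])
    return [[W[i][j] if (i // bs, j // bs) in kept else 0.0
             for j in range(len(W[0]))]
            for i in range(len(W))]


W = [[3.0, 3.0, 0.1, 0.1],
     [3.0, 3.0, 0.1, 0.1],
     [0.2, 0.2, 2.0, 2.0],
     [0.2, 0.2, 2.0, 2.0]]
sparse_W = prune_blocks(W, bs=2, keep=2)  # keeps the two diagonal blocks
```

Selecting the best set of blocks under a sparsity budget is the combinatorial part of the problem. The abstract's approach lets scores obtained by differentiable pruning guide that selection, where this sketch simply takes the top blocks by magnitude.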