Optimization
A Survey of Optimization Methods from a Machine Learning Perspective
Sun, Shiliang, Cao, Zehui, Zhu, Han, Zhao, Jing
Machine learning is developing rapidly; it has produced many theoretical breakthroughs and is widely applied in various fields. Optimization, as an important part of machine learning, has attracted much attention from researchers. With the exponential growth of data volume and the increase in model complexity, optimization methods in machine learning face more and more challenges. Much work on solving optimization problems or improving optimization methods in machine learning has been proposed in succession. A systematic review and summary of optimization methods from the perspective of machine learning is of great significance, as it can offer guidance for developments in both optimization and machine learning research. In this paper, we first describe the optimization problems in machine learning. Then, we introduce the principles and progress of commonly used optimization methods. Next, we summarize the applications and developments of optimization methods in some popular machine learning fields. Finally, we explore and present some challenges and open problems for optimization in machine learning.
A gray-box approach for curriculum learning
Foglino, Francesco, Leonetti, Matteo, Sagratella, Simone, Seccia, Ruggiero
Curriculum learning is often employed in deep reinforcement learning to let the agent progress more quickly towards better behaviors. Numerical methods for curriculum learning in the literature provide only initial heuristic solutions, with little to no guarantee on their quality. We define a new gray-box function that, including a suitable scheduling problem, can be effectively used to reformulate the curriculum learning problem. We propose different efficient numerical methods to address this gray-box reformulation. Preliminary numerical results on a benchmark task in the curriculum learning literature show the viability of the proposed approach.
A Graph-Based Decoding Model for Incomplete Multi-Subject fMRI Functional Alignment
Li, Weida, Chen, Fang, Zhang, Daoqiang
Aggregating multi-subject fMRI data is indispensable for generating valid and general inferences from patterns distributed across human brains. The disparities in anatomical structures and functional topographies of human brains call for aligning fMRI data across subjects. However, the existing functional alignment methods cannot handle the variety of fMRI datasets available today, especially when they are incomplete, i.e., some of the subjects lack responses to some stimuli, or different subjects follow different sequences of stimuli. In this paper, a cross-subject graph that depicts the (dis)similarities between samples across subjects is taken as prior information for developing a more flexible framework that suits an assortment of fMRI datasets. However, the high dimension of fMRI data and the use of multiple subjects make the crude framework time-consuming or impractical. Therefore, we regularize the framework so that a feasible kernel-based optimization, which permits non-linear feature extraction, can be theoretically developed. Specifically, a low-dimension assumption is imposed on each new feature space to avoid overfitting caused by the high-spatial-low-temporal resolution of fMRI data. Empirical studies confirm that the proposed method under both incompleteness and completeness can achieve better performance than other state-of-the-art functional alignment methods under completeness.
Reinforcement Learning Driven Heuristic Optimization
Cai, Qingpeng, Hang, Will, Mirhoseini, Azalia, Tucker, George, Wang, Jingtao, Wei, Wei
Heuristic algorithms such as simulated annealing, Concorde, and METIS are effective and widely used approaches to find solutions to combinatorial optimization problems. However, they are limited by the high sample complexity required to reach a reasonable solution from a cold start. In this paper, we introduce a novel framework, named RLHO, that uses reinforcement learning (RL) to generate better initial solutions for heuristic algorithms. We augment the ability of heuristic algorithms to greedily improve upon an existing initial solution generated by RL, and demonstrate novel results where RL is able to leverage the performance of heuristics as a learning signal to generate better initializations. We apply this framework to Proximal Policy Optimization (PPO) and Simulated Annealing (SA). We conduct a series of experiments on the well-known NP-complete bin packing problem, and show that the RLHO method outperforms our baselines. We show that on the bin packing problem, RL can learn to help heuristics perform even better, allowing us to combine the best parts of both approaches.
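The warm-start idea in the abstract above can be illustrated with a minimal sketch: simulated annealing for bin packing that refines a given initial packing. This is not the RLHO implementation. The function names, the cost function, the move rule and the cooling schedule are all assumptions made for illustration, and a first-fit heuristic stands in for the RL policy that would supply the initial solution.

```python
import math
import random

def first_fit(items, capacity):
    """Stand-in for the learned initializer (assumption): first-fit packing."""
    bins = []
    for size in items:
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            bins.append([size])
    return bins

def cost(bins, capacity):
    """Bin count minus mean squared fill, so fewer bins always costs less."""
    fill = sum((sum(b) / capacity) ** 2 for b in bins) / len(bins)
    return len(bins) - fill

def anneal(initial_bins, capacity, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing that starts from a supplied packing (warm start)."""
    rng = random.Random(seed)
    current = [list(b) for b in initial_bins]
    best = [list(b) for b in current]
    temp = t0
    for _ in range(steps):
        temp *= cooling
        cand = [list(b) for b in current]
        src = rng.randrange(len(cand))
        item = cand[src].pop(rng.randrange(len(cand[src])))
        dst = rng.randrange(len(cand) + 1)  # index len(cand) opens a new bin
        if dst == len(cand):
            cand.append([item])
        elif dst != src and sum(cand[dst]) + item <= capacity:
            cand[dst].append(item)
        else:
            continue  # no-op or infeasible move, skip it
        cand = [b for b in cand if b]  # drop bins that became empty
        delta = cost(cand, capacity) - cost(current, capacity)
        # Metropolis rule: always accept improvements, sometimes accept worse
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if cost(current, capacity) < cost(best, capacity):
                best = [list(b) for b in current]
    return best

items = [4, 8, 1, 4, 2, 1, 7, 3, 6, 5]
start = first_fit(items, capacity=10)
packed = anneal(start, capacity=10)
print(len(start), len(packed))
```

Because the best packing seen so far is tracked separately from the current one, the returned packing never uses more bins than the initial one, which is why the quality of the starting point matters.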
Optimal Convergence for Stochastic Optimization with Multiple Expectation Constraints
In this paper, we focus on the problem of stochastic optimization where the objective function can be written as an expectation function over a closed convex set. We also consider multiple expectation constraints which restrict the domain of the problem. We extend the cooperative stochastic approximation algorithm from Lan and Zhou [2016] to solve the particular problem. We close the gaps in the previous analysis and provide a novel proof technique to show that our algorithm attains the optimal rate of convergence for both optimality gap and constraint violation when the functions are generally convex. We also compare our algorithm empirically to the state-of-the-art and show improved convergence in many situations.
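As a rough illustration of the setting in the abstract above, the following sketch implements a constraint-switching stochastic subgradient loop in the spirit of cooperative stochastic approximation: at each iteration it steps along a constraint subgradient if any noisy constraint estimate exceeds a tolerance, and along the stochastic objective gradient otherwise. It is not the paper's algorithm. The name `csa_sketch`, the step size and tolerance schedules, the averaging over the second half of the run, and the one-dimensional toy problem are all assumptions.

```python
import random

def csa_sketch(obj_grad, constraints, x0, lo, hi, n_iters, seed=0):
    """Constraint-switching stochastic subgradient loop on the interval [lo, hi].

    constraints is a list of (value_estimate, subgradient) function pairs,
    each called as f(x, rng). All schedules below are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    weighted_sum, weight_total = 0.0, 0.0
    for k in range(1, n_iters + 1):
        step = 0.5 / k ** 0.5  # assumed step size schedule
        tol = 1.0 / k ** 0.5   # assumed constraint tolerance schedule
        violated = None
        for value, subgrad in constraints:
            if value(x, rng) > tol:
                violated = subgrad
                break
        if violated is None:
            # Every estimate is within tolerance: record the iterate
            # (second half of the run only) and take an objective step.
            if k > n_iters // 2:
                weighted_sum += step * x
                weight_total += step
            direction = obj_grad(x, rng)
        else:
            direction = violated(x, rng)
        x = min(hi, max(lo, x - step * direction))  # projection onto [lo, hi]
    return weighted_sum / weight_total if weight_total > 0 else x

# Toy problem (assumption): minimize E[(x - xi)^2] with xi ~ N(2, 1),
# subject to E[x - 1 + noise] <= 0 and E[-3 - x + noise] <= 0.
obj_grad = lambda x, rng: 2.0 * (x - rng.gauss(2.0, 1.0))
constraints = [
    (lambda x, rng: x - 1.0 + rng.uniform(-0.1, 0.1), lambda x, rng: 1.0),
    (lambda x, rng: -3.0 - x + rng.uniform(-0.1, 0.1), lambda x, rng: -1.0),
]
x_hat = csa_sketch(obj_grad, constraints, x0=0.0, lo=-5.0, hi=5.0, n_iters=5000)
print(x_hat)
```

In this toy problem the unconstrained minimizer is 2, so the first constraint is active and the constrained solution is x = 1; the returned average should land close to that value.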
Global optimization via inverse distance weighting
Global optimization problems whose objective function is expensive to evaluate can be solved effectively by recursively fitting a surrogate function to function samples and minimizing an acquisition function to generate new samples. The acquisition step trades off between seeking a new optimization vector where the surrogate is minimum (exploitation of the surrogate) and looking for regions of the feasible space that have not yet been visited and that may potentially contain better values of the objective function (exploration of the feasible space). This paper proposes a new global optimization algorithm that uses a combination of inverse distance weighting (IDW) and radial basis functions (RBF) to construct the acquisition function. Rather arbitrary constraints that are simple to evaluate can be easily taken into account by the approach. Compared to Bayesian optimization, the proposed algorithm is computationally lighter and, as we show in a set of benchmark global optimization and hyperparameter tuning problems, it has a very similar (and sometimes superior) performance. MATLAB and Python implementations of the proposed approach are available at http://cse.lab.imtlucca.it/~bemporad/idwgopt
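The exploitation and exploration trade-off described in the abstract above can be sketched with plain inverse distance weighting. This is a simplified illustration, not the paper's acquisition function: the surrogate here is the IDW interpolant itself (the paper combines IDW with radial basis functions), and the arctangent exploration term, the weight `delta` and all function names are assumptions.

```python
import math

def idw_weights(x, samples):
    """Inverse squared-distance weights; also reports an exact sample hit."""
    weights = []
    for i, xi in enumerate(samples):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        if d2 == 0.0:
            return None, i  # x coincides with sample i
        weights.append(1.0 / d2)
    return weights, None

def idw_surrogate(x, samples, values):
    """IDW interpolant: weighted average of sampled objective values."""
    weights, hit = idw_weights(x, samples)
    if hit is not None:
        return values[hit]
    return sum(w * f for w, f in zip(weights, values)) / sum(weights)

def idw_exploration(x, samples):
    """Assumed exploration term: 0 at samples, approaching 1 far from them."""
    weights, hit = idw_weights(x, samples)
    if hit is not None:
        return 0.0
    return (2.0 / math.pi) * math.atan(1.0 / sum(weights))

def acquisition(x, samples, values, delta=1.0):
    """Lower is better: low surrogate value (exploit) or unvisited region (explore)."""
    return idw_surrogate(x, samples, values) - delta * idw_exploration(x, samples)

samples = [(0.0,), (1.0,)]
values = [1.0, 3.0]
print(idw_surrogate((0.5,), samples, values))  # midpoint: equal weights
print(acquisition((0.5,), samples, values))
```

At an already sampled point the exploration term vanishes and the acquisition equals the observed value, so minimizing the acquisition is pulled towards points that are either promising under the surrogate or far from existing samples.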
An efficient Lagrangian-based heuristic to solve a multi-objective sustainable supply chain problem
Tautenhain, Camila P. S., Barbosa-Povoa, Ana Paula, Mota, Bruna, Nascimento, Mariá C. V.
Sustainable Supply Chain (SSC) management aims at integrating economic, environmental and social goals to assist in the long-term planning of a company and its supply chains. There is no consensus in the literature as to whether social and environmental responsibilities are profit-compatible. However, the conflicting nature of these goals is explicit when considering specific assessment measures and, in this scenario, multi-objective optimization is a way to represent problems that simultaneously optimize the goals. This paper proposes a Lagrangian matheuristic method, called $AugMathLagr$, to solve a hard and relevant multi-objective problem found in the literature. $AugMathLagr$ was extensively tested using artificial instances defined by a generator presented in this paper. The results show a competitive performance of $AugMathLagr$ when compared with an exact multi-objective method limited by time and a matheuristic recently proposed in the literature and adapted here to address the studied problem. In addition, computational results on a case study are presented and analyzed, and demonstrate the outstanding performance of $AugMathLagr$.
Distributed Optimization for Over-Parameterized Learning
Distributed optimization often consists of two updating phases: local optimization and inter-node communication. Conventional approaches require working nodes to communicate with the server every one or few iterations to guarantee convergence. In this paper, we establish a completely different conclusion: each node can perform an arbitrary number of local optimization steps before communication. Moreover, we show that more local updating can reduce the overall communication, even for an infinite number of steps, where each node is free to update its local model to near-optimality before exchanging information. The extra assumption we make is that the optimal sets of local loss functions have a non-empty intersection, which is inspired by the over-parameterization phenomenon in large-scale optimization and deep learning. Our theoretical findings are confirmed by both distributed convex optimization and deep learning experiments.
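A toy example can make the non-empty-intersection assumption above concrete. In the sketch below, each node holds one linear equation in three unknowns, so every local loss has a whole plane of minimizers and the planes intersect. Nodes run several local gradient steps and a server averages the results. This is an illustrative setup, not the paper's algorithm or experiments; the function names, the data and the step size are assumptions.

```python
def local_gd(w, a, b, steps, lr):
    """Gradient steps on the local loss 0.5 * (a . w - b)^2."""
    for _ in range(steps):
        residual = sum(ai * wi for ai, wi in zip(a, w)) - b
        w = [wi - lr * residual * ai for wi, ai in zip(w, a)]
    return w

def local_update_averaging(nodes, dim, rounds, local_steps, lr):
    """Each round: every node optimizes locally, then the models are averaged."""
    w = [0.0] * dim
    for _ in range(rounds):
        local_models = [local_gd(list(w), a, b, local_steps, lr) for a, b in nodes]
        w = [sum(m[j] for m in local_models) / len(local_models) for j in range(dim)]
    return w

def total_loss(w, nodes):
    """Sum of the local least-squares losses."""
    return sum(0.5 * (sum(ai * wi for ai, wi in zip(a, w)) - b) ** 2 for a, b in nodes)

# Two nodes, three unknowns: over-parameterized, so a common minimizer exists.
nodes = [((1.0, 0.0, 1.0), 2.0), ((0.0, 1.0, 1.0), 3.0)]

w_one = local_update_averaging(nodes, dim=3, rounds=20, local_steps=1, lr=0.25)
w_many = local_update_averaging(nodes, dim=3, rounds=20, local_steps=50, lr=0.25)
print(total_loss(w_one, nodes), total_loss(w_many, nodes))
```

With the same number of communication rounds, the run with 50 local steps per round reaches a lower total loss than the run with one local step, which matches the qualitative claim of the abstract on this toy problem only.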
A Unified Framework of Robust Submodular Optimization
In this paper, we study a unified framework of robust submodular optimization. We study this problem from both a minimization and a maximization perspective (previous work has only focused on variants of robust submodular maximization). We do this under a broad range of combinatorial constraints, including cardinality, knapsack, and matroid constraints, as well as graph-based constraints such as cuts, paths, matchings and trees. Furthermore, we also study robust submodular minimization and maximization under multiple submodular upper and lower bound constraints. We show that all these problems are motivated by important machine learning applications including robust data subset selection, robust co-operative cuts and robust co-operative matchings. In each case, we provide scalable approximation algorithms and also study hardness bounds. Finally, we empirically demonstrate the utility of our algorithms on real-world applications.