Goto

Collaborating Authors

 Optimization


r/MachineLearning - [R] Discounted Reinforcement Learning Is Not an Optimization Problem

#artificialintelligence

If one policy has greater or equal value than the other, in all states, we might say the policy is better. The policy gradient paper guarantees that locally optimal policies can be found with function approximation. This functional returns either the long-term avg rewards or discounted cumulative rewards from a designated start state. In practice, one would obtain an unbiased estimator for the gradient of this functional w.r.t.


SIGMA : Strengthening IDS with GAN and Metaheuristics Attacks

arXiv.org Machine Learning

An Intrusion Detection System (IDS) is a key cybersecurity tool for network administrators as it identifies malicious traffic and cyberattacks. With the recent successes of machine learning techniques such as deep learning, more and more IDS are now using machine learning algorithms to detect attacks faster. However, these systems lack robustness when facing previously unseen types of attacks. With the increasing number of new attacks, especially against Internet of Things devices, having a robust IDS able to spot unusual and new attacks becomes necessary. This work explores the possibility of leveraging generative adversarial models to improve the robustness of machine learning based IDS. More specifically, we propose a new method named SIGMA, that leverages adversarial examples to strengthen IDS against new types of attacks. Using Generative Adversarial Networks (GAN) and metaheuristics, SIGMA %Our method consists in generates adversarial examples, iteratively, and uses it to retrain a machine learning-based IDS, until a convergence of the detection rate (i.e. until the detection system is not improving anymore). A round of improvement consists of a generative phase, in which we use GANs and metaheuristics to generate instances ; an evaluation phase in which we calculate the detection rate of those newly generated attacks ; and a training phase, in which we train the IDS with those attacks. We have evaluated the SIGMA method for four standard machine learning classification algorithms acting as IDS, with a combination of GAN and a hybrid local-search and genetic algorithm, to generate new datasets of attacks. Our results show that SIGMA can successfully generate adversarial attacks against different machine learning based IDS. Also, using SIGMA, we can improve the performance of an IDS to up to 100\% after as little as two rounds of improvement.


Provable Non-Convex Optimization and Algorithm Validation via Submodularity

arXiv.org Machine Learning

Submodularity is one of the most well-studied properties of problem classes in combinatorial optimization and many applications of machine learning and data mining, with strong implications for guaranteed optimization. In this thesis, we investigate the role of submodularity in provable non-convex optimization and validation of algorithms. A profound understanding which classes of functions can be tractably optimized remains a central challenge for non-convex optimization. By advancing the notion of submodularity to continuous domains (termed "continuous submodularity"), we characterize a class of generally non-convex and non-concave functions -- continuous submodular functions, and derive algorithms for approximately maximizing them with strong approximation guarantees. Meanwhile, continuous submodularity captures a wide spectrum of applications, ranging from revenue maximization with general marketing strategies, MAP inference for DPPs to mean field inference for probabilistic log-submodular models, which renders it as a valuable domain knowledge in optimizing this class of objectives. Validation of algorithms is an information-theoretic framework to investigate the robustness of algorithms to fluctuations in the input/observations and their generalization ability. We investigate various algorithms for one of the paradigmatic unconstrained submodular maximization problem: MaxCut. Due to submodularity of the MaxCut objective, we are able to present efficient approaches to calculate the algorithmic information content of MaxCut algorithms. The results provide insights into the robustness of different algorithmic techniques for MaxCut.


A Machine Learning Framework for Solving High-Dimensional Mean Field Game and Mean Field Control Problems

arXiv.org Machine Learning

Mean field games (MFG) and mean field control (MFC) are critical classes of multi-agent models for efficient analysis of massive populations of interacting agents. Their areas of application span topics in economics, finance, game theory, industrial engineering, crowd motion, and more. In this paper, we provide a flexible machine learning framework for the numerical solution of potential MFG and MFC models. State-of-the-art numerical methods for solving such problems utilize spatial discretization that leads to a curse-of-dimensionality. We approximately solve high-dimensional problems by combining Lagrangian and Eulerian viewpoints and leveraging recent advances from machine learning. More precisely, we work with a Lagrangian formulation of the problem and enforce the underlying Hamilton-Jacobi-Bellman (HJB) equation that is derived from the Eulerian formulation. Finally, a tailored neural network parameterization of the MFG/MFC solution helps us avoid any spatial discretization. Our numerical results include the approximate solution of 100-dimensional instances of optimal transport and crowd motion problems on a standard work station. These results open the door to much-anticipated applications of MFG and MFC models that were beyond reach with existing numerical methods.


Estudo comparativo de meta-heur\'isticas para problemas de colora\c{c}\~oes de grafos

arXiv.org Artificial Intelligence

A classic graph coloring problem is to assign colors to vertices of any graph so that distinct colors are assigned to adjacent vertices. Optimal graph coloring colors a graph with a minimum number of colors, which is its chromatic number . Finding out the chromatic number is a combinatorial optimization problem proven to be computationally intractable, which implies that no algorithm that computes large instances of the problem in a reasonable time is known. F or this reason, approximate methods and metaheuristics form a set of techniques that do not guarantee optimality, but obtain good solutions in a reasonable time. This paper reports a comparative study of the Hill-Climbing, Simulated Annealing, T abu Search, and Iterated Local Search metaheuristics for the classic graph coloring problem considering its time efficiency for processing the DSJC125 and DSJC250 instances of the DIMACS benchmark.


From Reinforcement Learning to Optimal Control: A unified framework for sequential decisions

arXiv.org Artificial Intelligence

There are over 15 distinct communities that work in the general area of sequential decisions and information, often referred to as decisions under uncertainty or stochastic optimization. We focus on two of the most important fields: stochastic optimal control, with its roots in deterministic optimal control, and reinforcement learning, with its roots in Markov decision processes. Building on prior work, we describe a unified framework that covers all 15 different communities, and note the strong parallels with the modeling framework of stochastic optimal control. By contrast, we make the case that the modeling framework of reinforcement learning, inherited from discrete Markov decision processes, is quite limited. Our framework (and that of stochastic control) is based on the core problem of optimizing over policies. We describe four classes of policies that we claim are universal, and show that each of these two fields have, in their own way, evolved to include examples of each of these four classes.


Finite-Time Convergence of Continuous-Time Optimization Algorithms via Differential Inclusions

arXiv.org Machine Learning

In this paper, we propose two discontinuous dynamical systems in continuous time with guaranteed prescribed finite-time local convergence to strict local minima of a given cost function. Our approach consists of exploiting a Lyapunov-based differential inequality for differential inclusions, which leads to finite-time stability and thus finite-time convergence with a provable bound on the settling time. In particular, for exact solutions to the aforementioned differential inequality, the settling-time bound is also exact, thus achieving prescribed finite-time convergence. We thus construct a class of discontinuous dynamical systems, of second order with respect to the cost function, that serve as continuous-time optimization algorithms with finite-time convergence and prescribed convergence time. Finally, we illustrate our results on the Rosenbrock function.


Learning Arbitrary Quantities of Interest from Expensive Black-Box Functions through Bayesian Sequential Optimal Design

arXiv.org Machine Learning

Estimating arbitrary quantities of interest (QoIs) that are non-linear operators of complex, expensive-to-evaluate, black-box functions is a challenging problem due to missing domain knowledge and finite budgets. Bayesian optimal design of experiments (BODE) is a family of methods that identify an optimal design of experiments (DOE) under different contexts, using only in a limited number of function evaluations. Under BODE methods, sequential design of experiments (SDOE) accomplishes this task by selecting an optimal sequence of experiments while using data-driven probabilistic surrogate models instead of the expensive black-box function. Probabilistic predictions from the surrogate model are used to define an information acquisition function (IAF) which quantifies the marginal value contributed or the expected information gained by a hypothetical experiment. The next experiment is selected by maximizing the IAF. A generally applicable IAF is the expected information gain (EIG) about a QoI as captured by the expectation of the Kullback-Leibler divergence between the predictive distribution of the QoI after doing a hypothetical experiment and the current predictive distribution about the same QoI. We model the underlying information source as a fully-Bayesian, non-stationary Gaussian process (FBNSGP), and derive an approximation of the information gain of a hypothetical experiment about an arbitrary QoI conditional on the hyper-parameters The EIG about the same QoI is estimated by sample averages to integrate over the posterior of the hyper-parameters and the potential experimental outcomes. We demonstrate the performance of our method in four numerical examples and a practical engineering problem of steel wire manufacturing. The method is compared to two classic SDOE methods: random sampling and uncertainty sampling.


VLSI Mask Optimization: From Shallow To Deep Learning

arXiv.org Machine Learning

Abstract-- VLSI mask optimization is one of the most critical stages in manufacturability aware design, which is costly due to the complicated mask optimization and lithography simulation. Recent researches have shown prominent advantages of machine learning techniques dealing with complicated and big data problems, which bring potential of dedicated machine learning solution for DFM problems and facilitate the VLSI design cycle. In this paper, we focus on a heterogeneous OPC framework that assists mask layout optimization. Preliminary results show the efficiency and effectiveness of proposed frameworks that have the potential to be alternatives to existing EDA solutions. I Introduction VLSI mask optimization is one of the most critical stages in manufacturability aware design, which is costly due to the complicated mask optimization and lithography simulation. Recent studies have shown prominent advantages of machine learning techniques dealing with complicated and big data problems, which bring the potential of dedicated machine learning solution for DFM problems and facilitate the VLSI design cycle [1, 2].


Active strict saddles in nonsmooth optimization

arXiv.org Machine Learning

Nonconvex optimization techniques are increasingly playing a major role in modern signal processing, high dimensional statistics, and machine learning. A driving theme, fully supported by empirical evidence, is that simple algorithms often work well in highly non-convex and even nonsmooth settings. Gradient descent, for example, often finds points with small objective value, despite existence of many highly suboptimal critical points. A growing body of literature provides one compelling explanation for this phenomenon. Namely, typical smooth objective functions provably satisfy the strict saddle property, meaning each critical point is either a local minimizer or has a direction of strictly negative curvature (e.g., [6, 29, 30, 62, 63]).