Optimization
Optimization techniques: Finding maxima and minima
I assume the small difference is due to numerical approximation in R. However, the results are very close. One important point to remember about gradient descent is how we choose the learning rate. If it is too small, convergence will take a very long time, but if it is too large, the iterates can overshoot and we will never find a minimum. The plot above makes it easy to see that in the initial stages we were able to take big steps. As we get closer to the minimum, the descent starts slowing down.
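The effect of the learning rate can be reproduced with a few lines of code. The following is only a minimal sketch in Python rather than R: the objective f(x) = x**2, the starting point, and the three learning rates are assumptions chosen for illustration and are not taken from the original post.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Return the iterates x_0, x_1, ..., x_steps of plain gradient descent."""
    xs = [x0]
    for _ in range(steps):
        # Move against the gradient, scaled by the learning rate.
        xs.append(xs[-1] - learning_rate * grad(xs[-1]))
    return xs


def grad_f(x):
    # Gradient of the illustrative objective f(x) = x**2 (minimum at x = 0).
    return 2.0 * x


# Hypothetical settings: same start, same number of steps, three learning rates.
too_small = gradient_descent(grad_f, x0=10.0, learning_rate=0.001, steps=50)
reasonable = gradient_descent(grad_f, x0=10.0, learning_rate=0.1, steps=50)
too_large = gradient_descent(grad_f, x0=10.0, learning_rate=1.1, steps=50)

for name, xs in [("too small", too_small),
                 ("reasonable", reasonable),
                 ("too large", too_large)]:
    print(f"{name:>10}: final x = {xs[-1]:.6g}")
```

With the reasonable rate, the step length shrinks as the gradient shrinks, which is the slowing down near the minimum described above. With the overly large rate, each step overshoots the minimum by more than the previous one, so the iterates move away from it.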
A Planner for Both Satisfaction and Optimization Problems
The way the workload is shared between these stages depends on the particular approach in use. The reader unfamiliar with Petri nets needs to know only the following: they are made of places, transitions, and tokens. A place can be seen as a token holder. When it contains one or more tokens, it is said to be marked. Transitions allow tokens to circulate from place to place.
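The Petri net vocabulary can be made concrete with a small data structure. This is a minimal sketch, not the planner's representation: the class name `PetriNet` and its methods are hypothetical, and it assumes the standard firing rule with arc weight 1 (a transition is enabled when every input place holds a token; firing removes one token from each input place and adds one to each output place).

```python
class PetriNet:
    """Minimal place/transition net with unit arc weights (illustrative only)."""

    def __init__(self, marking):
        # marking: place name -> number of tokens currently held
        self.marking = dict(marking)
        # transition name -> (input places, output places)
        self.transitions = {}

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def is_marked(self, place):
        # A place is marked when it contains one or more tokens.
        return self.marking.get(place, 0) > 0

    def is_enabled(self, name):
        inputs, _ = self.transitions[name]
        # Each input place must hold as many tokens as arcs lead out of it.
        return all(self.marking.get(p, 0) >= inputs.count(p) for p in inputs)

    def fire(self, name):
        if not self.is_enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1


# Usage: one token circulates from place "p1" to place "p2".
net = PetriNet({"p1": 1, "p2": 0})
net.add_transition("t", inputs=["p1"], outputs=["p2"])
net.fire("t")
print(net.marking)
```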
Multiobjective Optimization
Using some real-world examples I illustrate the important role of multiobjective optimization in decision making and its interface with preference handling. I explain what optimization in the presence of multiple objectives means and discuss some of the most common methods of solving multiobjective optimization problems using transformations to single-objective optimization problems. Finally, I address linear and combinatorial optimization problems with multiple objectives and summarize techniques for solving them. Throughout the article I refer to the real-world examples introduced at the beginning. There are infinitely many ways to invest money and infinitely many possible radiotherapy treatments, but the number of feasible crew schedules is finite, albeit astronomical in practice.
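One common transformation to a single-objective problem is the weighted sum, in which the objectives are combined into one score using nonnegative weights. The sketch below assumes a small finite set of alternatives with two objectives that are both minimized. The plan names, the objective values, and the function `weighted_sum_choice` are hypothetical and are not drawn from the article.

```python
def weighted_sum_choice(options, weight):
    """Pick the option minimizing weight * f1 + (1 - weight) * f2.

    options: name -> (f1, f2), both objectives to be minimized.
    weight:  importance of f1, between 0 and 1.
    """
    def score(name):
        f1, f2 = options[name]
        return weight * f1 + (1.0 - weight) * f2

    return min(options, key=score)


# Hypothetical alternatives: (cost, risk). Plan D is worse than plan B in both.
plans = {
    "A": (1.0, 9.0),
    "B": (4.0, 4.0),
    "C": (9.0, 1.0),
    "D": (8.0, 8.0),
}

# Different weights express different preferences and select different plans.
for w in (0.9, 0.5, 0.1):
    print(f"weight on cost = {w}: choose plan {weighted_sum_choice(plans, w)}")
```

Varying the weight traces out different tradeoffs between the two objectives, while a dominated alternative such as plan D is never selected for any weight strictly between 0 and 1.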
Heuristic Search and Information Visualization Methods for School Redistricting
We describe an application of AI search and information visualization techniques to the problem of school redistricting, in which students are assigned to home schools within a county or school district. This is a multicriteria optimization problem in which competing objectives, such as school capacity, busing costs, and socioeconomic distribution, must be considered. Because of the complexity of the decision-making problem, tools are needed to help end users generate, evaluate, and compare alternative school assignment plans. A key goal of our research is to aid users in finding multiple qualitatively different redistricting plans that represent different tradeoffs in the decision space. We present heuristic search methods that can be used to find a set of qualitatively different plans, and give empirical results of these search methods on population data from the school district of Howard County, Maryland.
Energy and Uncertainty: Models and Algorithms for Complex Energy Systems
I highlight several of these applications, using a simple energy storage problem as a case application. Using this setting, I describe a modeling framework that is based on five fundamental dimensions and that is more natural than the standard canonical form widely used in the reinforcement learning community. The framework focuses on finding the best policy, where I identify four fundamental classes of policies consisting of policy function approximations (PFAs), cost function approximations (CFAs), policies based on value function approximations (VFAs), and look-ahead policies. There is the familiar array of decisions: discrete actions, continuous controls, and vector-valued (and possibly integer) decisions. The tools for these problems are drawn from computer science, engineering, applied math, and operations research.
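To illustrate the first policy class, a policy function approximation maps the current state directly to a decision through a simple parametric rule. The sketch below assumes a buy-low, sell-high rule for a storage device. The function names, the threshold parameters `theta_low` and `theta_high`, the unit charge size, and the price series are all hypothetical and serve only as an example.

```python
def buy_low_sell_high(price, energy_stored, capacity, theta_low, theta_high):
    """Return the change in stored energy: +1 charge, -1 discharge, 0 hold."""
    if price <= theta_low and energy_stored < capacity:
        return 1   # price is low and there is room: buy one unit
    if price >= theta_high and energy_stored > 0:
        return -1  # price is high and energy is available: sell one unit
    return 0


def simulate(prices, capacity, theta_low, theta_high):
    """Apply the rule along a price path; return (profit, final energy)."""
    energy, profit = 0, 0.0
    for price in prices:
        action = buy_low_sell_high(price, energy, capacity,
                                   theta_low, theta_high)
        energy += action
        profit -= action * price  # buying costs money, selling earns it
    return profit, energy


# Hypothetical price path and thresholds.
prices = [30.0, 20.0, 55.0, 60.0, 25.0, 70.0]
profit, energy = simulate(prices, capacity=2, theta_low=30.0, theta_high=55.0)
print(profit, energy)
```

Finding the best policy within this class then amounts to searching over the two thresholds, for example by simulating many price paths for each candidate pair.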
Fully Automated Design of Super-High-Rise Building Structures by a Hybrid AI Model on a Massively Parallel Machine
This article presents an innovative research project (sponsored by the National Science Foundation, the American Iron and Steel Institute, and the American Institute of Steel Construction) where computationally elegant algorithms based on the integration of a novel connectionist computing model, mathematical optimization, and a massively parallel computer architecture are used to automate the complex process of engineering design. Adeli and his associates have been working on creating novel design theories and computational models with two broad objectives: (1) automation and (2) optimization (Adeli and Hung 1995; Adeli and Kamal 1993; Adeli and Zhang 1993; Adeli and Yeh 1989; Adeli and Balasubramanyam 1988a, 1988b; Paek and Adeli 1988; Adeli and Alrijleh 1987). Civil-engineering structures are typically one of a kind, as opposed to manufacturing designs, which are often mass produced. To create computational models for structural design automation, we have been exploring new computing paradigms. Two such paradigms are neurocomputing and parallel processing.
PHOENICS: A universal deep Bayesian optimizer
Häse, Florian, Roch, Loïc M., Kreisbeck, Christoph, Aspuru-Guzik, Alán
In this work we introduce PHOENICS, a probabilistic global optimization algorithm combining ideas from Bayesian optimization with concepts from Bayesian kernel density estimation. We propose an inexpensive acquisition function balancing the explorative and exploitative behavior of the algorithm. This acquisition function enables intuitive sampling strategies for an efficient parallel search of global minima. The performance of PHOENICS is assessed via an exhaustive benchmark study on a set of 15 discrete, quasi-discrete and continuous multidimensional functions. We show that, unlike optimization methods based on Gaussian processes (GP) and random forests (RF), PHOENICS is less sensitive to the nature of the co-domain, and that it outperforms GP and RF optimizations. We illustrate the performance of PHOENICS on the Oregonator, a difficult case study describing a complex chemical reaction network. We demonstrate that only PHOENICS was able to reproduce qualitatively and quantitatively the target dynamic behavior of these nonlinear reaction dynamics. We recommend PHOENICS for rapid optimization of scalar, possibly non-convex, unknown black-box objective functions.
A Constraint-Based Dental School Timetabling System
This system has been deployed since 2010. Dental school timetabling differs from other university course scheduling in that certain clinic sessions can be used by multiple courses at the same time, provided a limit on room capacity is satisfied. Starting from a constraint-programming solution using a web interface, we have moved to a mixed integer programming-based solver to deal with multiple objective functions, along with a dedicated Java application, which provides a rich user interface. Solutions for the years 2010, 2011, and 2012 have been used in the dental school, replacing a manual timetabling process, which could no longer cope with increasing student numbers and resulting resource bottlenecks. The use of the automated system allowed the dental school to increase the number of students enrolled to the maximum possible given the available resources.
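The distinguishing constraint, that several courses may share one clinic session as long as the room capacity is respected, can be written as a simple feasibility check. This is only a sketch of that single constraint, not the deployed solver: the function name, the course names, and the numbers are hypothetical.

```python
def session_is_feasible(course_sizes, courses_in_session, room_capacity):
    """True if the courses sharing one clinic session fit in the room.

    course_sizes:       course name -> number of students attending
    courses_in_session: courses scheduled into the same clinic session
    room_capacity:      maximum number of students the clinic can hold
    """
    total = sum(course_sizes[c] for c in courses_in_session)
    return total <= room_capacity


# Hypothetical data: three courses competing for a 40-seat clinic.
sizes = {"restorative": 18, "periodontics": 15, "orthodontics": 12}

print(session_is_feasible(sizes, ["restorative", "periodontics"], 40))
print(session_is_feasible(sizes, ["restorative", "periodontics",
                                  "orthodontics"], 40))
```

In a mixed integer programming formulation, the same condition would typically appear as one linear inequality per clinic session over binary assignment variables.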
Mathematical Optimization: 'simplicity is all you need' • r/MachineLearning
What does it mean for you to do "y * dy/dx" and "y (dy/dx)^2"? I mean, are you doing elementwise operations? Also, could you report the magnitudes of both "y" and "dy/dx" on the plot? My guess is that the norm of the gradient might be very small, so y (dy/dx)^2 would be negligible and you would basically be doing SGD. Also, I don't think you fine-tuned SGD, RMSprop, and so on well, as they should also give excellent results on MNIST.