Optimization
DOE Announces $15 Million for Development of AI and Machine Learning Tools - DATAVERSITY
According to a recent press release, "Today, the U.S. Department of Energy's (DOE's) Advanced Research Projects Agency-Energy (ARPA-E) announced $15 million in funding for 23 projects to accelerate the incorporation of machine learning and artificial intelligence into the energy technology and product design processes as part of the Design Intelligence Fostering Formidable Energy Reduction (and) Enabling Novel Totally Impactful Advanced Technology Enhancements (DIFFERENTIATE) program. Launched in April of this year, the DIFFERENTIATE program aims to develop streamlined solutions to next-generation energy challenges. The program identified three general mathematical optimization problems that are common to many design processes. The selected projects then conceptualized machine learning and artificial intelligence-based solutions to help engineers execute and solve these problems in a manner that dramatically accelerates the pace of energy innovation." The release goes on, "Following the initial round of Phase I funding for the DIFFERENTIATE program, additional funding will be available to qualifying awardees at a future date… DIFFERENTIATE projects include: Iowa State University – Ames, Iowa. Iowa State University will develop machine learning tools to accelerate the inverse design of new microstructures in photovoltaics. The team will create a new deep generative model to combat challenges in real-world inverse design problems. The proposed inverse design tools, if successful, will produce novel, manufacturable material microstructures with improved electromagnetic properties relative to existing technology."
Clustering via Ant Colonies: Parameter Analysis and Improvement of the Algorithm
Chavarria-Molina, Jeffry, Fallas-Monge, Juan Jose, Trejos-Zelaya, Javier
An ant colony optimization approach for partitioning a set of objects is proposed. To minimize the intra-class variance, or within-class sum of squares, of the partition, we build ant-like solutions constructively: each object is assigned to a class with a probability that depends on the distance between the object and the centroid of the class (visibility) and on the pheromone trail; the latter depends on the class memberships defined over the iterations. The procedure is improved by applying the K-means algorithm in some iterations of the ant colony method. We performed a simulation study to evaluate the method with a Monte Carlo experiment that controls some sensitive parameters of the clustering problem. After some tuning of the parameters, the method was also applied to some benchmark real-data sets. Encouraging results were obtained in nearly all cases.
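As a rough illustration of the assignment rule described in this abstract, the sketch below combines visibility and pheromone in the usual ant colony form, weight = pheromone^alpha * visibility^beta. The exponents `alpha` and `beta`, the inverse-distance visibility, and all function names are assumptions made for illustration; the paper's exact rule and pheromone update may differ.

```python
import math
import random


def assignment_probabilities(point, centroids, pheromone_row,
                             alpha=1.0, beta=2.0, eps=1e-9):
    """Probability of assigning `point` to each class.

    Assumed form: weight_k = pheromone_k**alpha * visibility_k**beta,
    with visibility_k = 1 / distance(point, centroid_k).
    """
    weights = []
    for centroid, tau in zip(centroids, pheromone_row):
        visibility = 1.0 / (math.dist(point, centroid) + eps)
        weights.append((tau ** alpha) * (visibility ** beta))
    total = sum(weights)
    return [w / total for w in weights]


def sample_class(probabilities, rng):
    """Draw one class index according to the given probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


def within_sum_of_squares(points, labels, n_classes):
    """Objective to minimize: squared distances of objects to their class centroid."""
    total = 0.0
    for k in range(n_classes):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            continue  # an empty class contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total


# One ant assigns one object; equal pheromone means visibility decides.
rng = random.Random(0)
probs = assignment_probabilities((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)], [1.0, 1.0])
chosen = sample_class(probs, rng)
```

With equal pheromone on both classes, the nearer centroid receives the larger probability; raising the pheromone on a class shifts probability toward it regardless of distance.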
Diagnostic Curves for Black Box Models
Inouye, David I., Leqi, Liu, Kim, Joon Sik, Aragam, Bryon, Ravikumar, Pradeep
In safety-critical applications of machine learning, it is often necessary to look beyond standard metrics such as test accuracy in order to validate various qualitative properties, such as monotonicity with respect to a feature or combination of features, the absence of undesirable changes or oscillations in the response, and differences in outcomes (e.g. discrimination) for a protected class. To help answer this need, we propose a framework for approximately validating (or invalidating) various properties of a black box model by finding a univariate diagnostic curve in the input space whose output maximally violates a given property. These diagnostic curves show the exact value of the model along the curve and can be displayed with a simple and intuitive line graph. We demonstrate the usefulness of these diagnostic curves across multiple use-cases and datasets, including selecting between two models and understanding out-of-sample behavior.
Efficient Relaxed Gradient Support Pursuit for Sparsity Constrained Non-convex Optimization
Shang, Fanhua, Wei, Bingkun, Liu, Hongying, Liu, Yuanyuan, Zhuo, Jiacheng
Large-scale non-convex sparsity-constrained problems have recently gained extensive attention. Most existing deterministic optimization methods (e.g., GraSP) are not suitable for large-scale and high-dimensional problems, and thus stochastic optimization methods with hard thresholding (e.g., SVRGHT) become more attractive. Inspired by GraSP, this paper proposes a new general relaxed gradient support pursuit (RGraSP) framework, in which the sub-algorithm is only required to satisfy a slack descent condition. We also design two specific semi-stochastic gradient hard thresholding algorithms. In particular, our algorithms perform far fewer hard thresholding operations than SVRGHT, and their average per-iteration cost is much lower (i.e., O(d) vs. O(d log(d)) for SVRGHT), which leads to faster convergence. Our experimental results on both synthetic and real-world datasets show that our algorithms are superior to the state-of-the-art gradient hard thresholding methods.
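For readers unfamiliar with the hard thresholding operation this abstract refers to, here is a minimal sketch. It shows plain iterative hard thresholding on a least-squares objective, which is a simpler, generic technique and not the RGraSP or SVRGHT algorithms themselves; the function names and the toy problem are assumptions for illustration.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    if k <= 0:
        return [0.0] * len(x)
    # Sorting all d entries costs O(d log d).
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def least_squares_gradient(A, b, x):
    """Gradient of 0.5 * ||A x - b||^2, i.e. A^T (A x - b)."""
    residual = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
                for row, b_i in zip(A, b)]
    n_cols = len(A[0])
    return [sum(A[i][j] * residual[i] for i in range(len(A))) for j in range(n_cols)]


def iterative_hard_thresholding(A, b, k, step=1.0, n_iters=10):
    """Gradient step followed by projection onto k-sparse vectors."""
    x = [0.0] * len(A[0])
    for _ in range(n_iters):
        grad = least_squares_gradient(A, b, x)
        x = hard_threshold([x_j - step * g_j for x_j, g_j in zip(x, grad)], k)
    return x


# Toy problem: identity design, so the 2-sparse solution keeps the two largest targets.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.1, -2.0]
x_sparse = iterative_hard_thresholding(A, b, k=2)
```

Every iteration here applies one thresholding step after one full gradient step; the abstract's point is that its algorithms need such thresholding steps much less often.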
ExperienceThinking: Hyperparameter Optimization with Budget Constraints
Wang, Chunnan, Wang, Hongzhi, Zhou, Chang, Chen, Hanxiao, Li, Jianzhong, Gao, Hong
The problem of hyperparameter optimization arises widely in practice, and many common tasks can be transformed into it, such as neural architecture search and feature subset selection. When constraints are ignored, existing hyperparameter tuning techniques can solve these problems effectively by traversing as many hyperparameter configurations as possible. However, because of limited resources and budget, it is not feasible to evaluate so many configurations, which requires us to design effective algorithms that find the best possible hyperparameter configuration with a finite number of configuration evaluations. In this paper, we simulate human thinking processes and combine the merits of the existing techniques, and thus propose a new algorithm called ExperienceThinking to solve this constrained hyperparameter optimization problem. In addition, we analyze the performance of 3 classical hyperparameter optimization algorithms with a finite number of configuration evaluations, and compare it with that of ExperienceThinking. The experimental results show that our proposed algorithm performs better.
Adaptive Divergence for Rapid Adversarial Optimization
Borisyak, Maxim, Gaintseva, Tatiana, Ustyuzhanin, Andrey
Adversarial Optimization (AO) provides a reliable, practical way to match two implicitly defined distributions, one of which is usually represented by a sample of real data, and the other is defined by a generator. Typically, AO involves training a high-capacity model at each step of the optimization. In this work, we consider computationally heavy generators, for which training high-capacity models is associated with substantial computational costs. To address this problem, we introduce a novel family of divergences, which varies the capacity of the underlying model and allows for a significant acceleration with respect to the number of samples drawn from the generator. We demonstrate the performance of the proposed divergences on several tasks, including tuning the parameters of a physics simulator, namely the Pythia event generator.
Bayesian Optimization Approach for Analog Circuit Synthesis Using Neural Network
Zhang, Shuhan, Lyu, Wenlong, Yang, Fan, Yan, Changhao, Zhou, Dian, Zeng, Xuan
Bayesian optimization with a Gaussian process as the surrogate model has been successfully applied to analog circuit synthesis. In the traditional Gaussian process regression model, the kernel functions are defined explicitly. The computational complexity of training is O(N^3), and the computational complexity of prediction is O(N^2), where N is the number of training data points. A Gaussian process model can also be derived from a weight-space view, where the original data are mapped to a feature space, and the kernel function is defined as the inner product of nonlinear features. In this paper, we propose a Bayesian optimization approach for analog circuit synthesis using a neural network. We use a deep neural network to extract good feature representations, and then define a Gaussian process using the extracted features. A model averaging method is applied to improve the quality of uncertainty prediction. Compared to a Gaussian process model with explicitly defined kernel functions, the neural-network-based Gaussian process model can automatically learn a kernel function from data, which makes it possible to provide more accurate predictions and thus accelerate the follow-up optimization procedure. Also, the neural-network-based model has O(N) training time and constant prediction time. The efficiency of the proposed method has been verified on two real-world analog circuits.
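The weight-space view mentioned in this abstract can be written out in its standard textbook form, which may help explain the stated complexities. The symbols below (the feature map \phi with M outputs, the prior scale \sigma_p, the noise scale \sigma_n, the N x M feature matrix \Phi) are notational assumptions, and the isotropic Gaussian prior on the weights is one common choice rather than necessarily the paper's.

```latex
\begin{align}
  f(x) &= \phi(x)^\top w, \qquad w \sim \mathcal{N}(0, \sigma_p^2 I_M), \qquad
  y_i = f(x_i) + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, \sigma_n^2) \\
  k(x, x') &= \sigma_p^2 \, \phi(x)^\top \phi(x') \\
  A &= \sigma_n^{-2} \, \Phi^\top \Phi + \sigma_p^{-2} I_M, \qquad \Phi \in \mathbb{R}^{N \times M} \\
  \mu(x_*) &= \sigma_n^{-2} \, \phi(x_*)^\top A^{-1} \Phi^\top y, \qquad
  \sigma^2(x_*) = \phi(x_*)^\top A^{-1} \phi(x_*)
\end{align}
```

Under these assumptions, forming \Phi^\top \Phi costs O(N M^2) and factorizing the M x M matrix A costs O(M^3), so for a fixed feature dimension M the fit is linear in N and each prediction costs O(M^2) regardless of N. This matches the O(N) training and constant prediction time quoted in the abstract.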
Stochastic learning control of inhomogeneous quantum ensembles
Turinici, Gabriel
IUF - Institut Universitaire de France; CEREMADE, Université Paris Dauphine - PSL Research University. Oct 2019. arXiv:1906.02991v3
In quantum control, robustness with respect to uncertainties in the system's parameters or driving field characteristics is of paramount importance and has been studied theoretically, numerically and experimentally. In this paper we test stochastic search procedures (stochastic gradient descent and the Adam algorithm) that sample, at each iteration, from the distribution of the parameter uncertainty, as opposed to previous approaches that use a fixed grid. We show that both algorithms behave well with respect to benchmarks and discuss their relative merits. In addition, the methodology makes it possible to address high-dimensional parameter uncertainty; we implement numerically, with good results, a 3D and a 6D case.
1 Introduction
Quantum control is a promising technology with many applications ranging from NMR [12] to quantum computing [15] and laser control of quantum dynamics [7]. The controlling field encounters many molecules which, although identical in nature, may interact differently with the incoming field because of, e.g., different Larmor frequencies or rf attenuation factors (in NMR spin control or quantum computing, see [19, 29, 35, 22, 13, 17]), different spatial profiles (see [24]) or other parameters (see [36, 8, 10]). For obvious practical reasons, it is of paramount importance to ensure that the control quality is …
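The idea of sampling the uncertain parameter afresh at every iteration, instead of averaging over a fixed grid, can be sketched on a toy problem. Everything below is a hypothetical stand-in: the scalar cost (theta * u - 1)^2, the uniform distribution of theta, and the function names are assumptions chosen for illustration and have nothing quantum about them.

```python
import random


def toy_gradient(u, theta):
    """Gradient in u of the toy cost (theta * u - 1)**2 for one sampled theta."""
    return 2.0 * theta * (theta * u - 1.0)


def stochastic_ensemble_sgd(sample_theta, u0=0.0, lr=0.01, n_steps=5000, seed=0):
    """Plain SGD that draws one new theta per iteration (no fixed grid).

    Returns the average of the second half of the iterates to damp sampling noise.
    """
    rng = random.Random(seed)
    u = u0
    tail = []
    for step in range(n_steps):
        theta = sample_theta(rng)          # fresh draw from the uncertainty distribution
        u -= lr * toy_gradient(u, theta)   # single-sample gradient step
        if step >= n_steps // 2:
            tail.append(u)
    return sum(tail) / len(tail)


# Uncertain parameter theta ~ Uniform(0.8, 1.2); one control u must serve all thetas.
u_robust = stochastic_ensemble_sgd(lambda rng: rng.uniform(0.8, 1.2))
```

For this toy cost the minimizer of the expected cost is E[theta] / E[theta^2], so the returned value can be checked against that ratio. The same loop works unchanged when theta is a vector, which is the property the abstract highlights for higher-dimensional uncertainty.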
Link Prediction in the Stochastic Block Model with Outliers
Gaucher, Solenne, Klopp, Olga, Robin, Geneviève
Networks are a powerful tool used to analyze complex systems: agents are represented as nodes, and pairwise interactions between agents are recorded as edges between these nodes. Fields of application include biology, where networks may be used to describe protein-protein interactions; ecology, where they may represent food webs [13] or spatial distributions in crop diversity networks [46]; ethnology, where networks summarize relationships or trades between individuals or communities [40, 36]; and sociology, where the recent development of online social networks offers unprecedented possibilities while raising new challenges [47]. Real-life networks are often modeled as realizations of random graphs or, equivalently, as noisy versions of more structured networks. In this setting, recovering the "noiseless" version of the graph, i.e. estimating the underlying probabilities of interactions between agents, is a key problem that has recently gained considerable attention (see, e.g., [30, 15, 14, 17, 50]). Most methods for recovering structural properties of a network rely on assumptions about the distribution of the underlying random graph. However, in numerous examples, these assumptions are violated by the behaviour of a small number of individuals, which departs strongly from the behaviour of the majority of agents and introduces outlier profiles. For example, in graphs obtained from survey data, some individuals may be reluctant to participate and for this reason provide false answers; other individuals may even be paid to provide erroneous answers in order to distort public opinion on a subject [3].
Square Attack: a query-efficient black-box adversarial attack via random search
Andriushchenko, Maksym, Croce, Francesco, Flammarion, Nicolas, Hein, Matthias
We propose the Square Attack, a new score-based black-box $l_2$ and $l_\infty$ adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. The Square Attack is based on a randomized search scheme where we select localized square-shaped updates at random positions so that the $l_\infty$- or $l_2$-norm of the perturbation is approximately equal to the maximal budget at each step. Our method is algorithmically transparent, robust to the choice of hyperparameters, and significantly more query efficient than the more complex state-of-the-art methods. In particular, on ImageNet we improve the average query efficiency for various deep networks by a factor of at least $2$ and up to $7$ compared to the recent state-of-the-art $l_\infty$-attack of Meunier et al. while having a higher success rate. The Square Attack can even be competitive with gradient-based white-box attacks in terms of success rate. Moreover, we show its utility by breaking a recently proposed defense based on randomization. The code of our attack is available at https://github.com/max-andr/square-attack
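The random search scheme described in this abstract can be sketched for the $l_\infty$ case as follows. This is a heavily simplified illustration and not the authors' implementation (see the repository linked above for that): it assumes a single-channel image stored as a flat list, a fixed square side instead of a size schedule, and a hypothetical `loss_fn` that returns a score the attacker wants to decrease using only model queries.

```python
import random


def square_attack_linf(loss_fn, x, height, width, eps, side, n_queries, seed=0):
    """Score-based random search: propose square-shaped +/-eps updates, keep improvements."""
    rng = random.Random(seed)

    def clip(value):
        return min(1.0, max(0.0, value))

    # Start on the boundary of the l_inf ball: every pixel moved by +eps or -eps.
    best = [clip(x_i + rng.choice((-eps, eps))) for x_i in x]
    best_loss = loss_fn(best)
    for _ in range(n_queries):
        row = rng.randrange(height - side + 1)
        col = rng.randrange(width - side + 1)
        shift = rng.choice((-eps, eps))    # one sign for the whole square
        candidate = list(best)
        for i in range(row, row + side):
            for j in range(col, col + side):
                idx = i * width + j
                candidate[idx] = clip(x[idx] + shift)
        candidate_loss = loss_fn(candidate)  # one query, no gradients needed
        if candidate_loss < best_loss:       # greedy acceptance
            best, best_loss = candidate, candidate_loss
    return best, best_loss


# Toy "model": the loss is the pixel sum, so darkening every pixel is the best attack.
clean = [0.5] * 16
adversarial, adv_loss = square_attack_linf(sum, clean, height=4, width=4,
                                           eps=0.1, side=2, n_queries=200)
```

Because every accepted candidate sets pixels to exactly `x + eps` or `x - eps` (before clipping to the valid range), the perturbation stays on or inside the $l_\infty$ budget at every step, which mirrors the property stated in the abstract.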