AITopics

2004.00663

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Garrido-Merchán, Eduardo C., Hernández-Lobato, Daniel

Parallel Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints

arXiv.org Machine Learning, Apr-1-2020

Real-world problems often involve the optimization of several objectives under multiple constraints. Furthermore, we may not have an expression for each objective or constraint; they may be expensive to evaluate; and the evaluations can be noisy. These functions are referred to as black-boxes. Bayesian optimization (BO) can efficiently solve the problems described. For this, BO iteratively fits a model to the observations of each black-box. The models are then used to choose where to evaluate the black-boxes next, with the goal of solving the optimization problem in a few iterations. In particular, they guide the search for the problem solution, and avoid evaluations in regions of little expected utility. A limitation, however, is that current BO methods for these problems choose a point at a time at which to evaluate the black-boxes. If the expensive evaluations can be carried out in parallel (as when a cluster of computers is available), this results in a waste of resources. Here, we introduce PPESMOC, Parallel Predictive Entropy Search for Multi-objective Optimization with Constraints, a BO strategy for solving the problems described. PPESMOC selects, at each iteration, a batch of input locations at which to evaluate the black-boxes, in parallel, to maximally reduce the entropy of the problem solution. To our knowledge, this is the first batch method for constrained multi-objective BO. We present empirical evidence in the form of synthetic, benchmark and real-world experiments that illustrate the effectiveness of PPESMOC.
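
To make "the problem solution" concrete: in constrained multi-objective optimization, the solution is the set of feasible points that no other feasible point dominates. The sketch below is not PPESMOC and involves no entropy search. It only filters a finite list of already-evaluated points, assuming every objective is minimized and a constraint counts as satisfied when its value is non-negative. Both conventions, and all function names, are assumptions made for illustration.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b in every objective
    and strictly better in at least one (minimization assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def feasible_pareto_set(evaluations):
    """evaluations: list of (objectives, constraints) tuples.
    Assumption: a point is feasible when every constraint value is >= 0."""
    feasible = [obj for obj, con in evaluations if all(c >= 0 for c in con)]
    return [p for p in feasible if not any(dominates(q, p) for q in feasible)]


# Hypothetical evaluations of two objectives and one constraint.
evaluations = [
    ((1.0, 4.0), (0.5,)),   # feasible, non-dominated
    ((2.0, 2.0), (0.1,)),   # feasible, non-dominated
    ((3.0, 3.0), (0.2,)),   # feasible, but dominated by (2.0, 2.0)
    ((0.5, 0.5), (-1.0,)),  # infeasible, excluded despite good objectives
]
front = feasible_pareto_set(evaluations)
```

A batch method such as the one described would choose several new input locations per iteration; this filter only shows what set those evaluations are meant to pin down.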

constraint, gaussian distribution, matrix, (17 more...)

2004.00601

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Denmark (0.04)

Genre: Research Report (0.81)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning, Apr-1-2020

Learning to Select Base Classes for Few-shot Classification

Zhou, Linjun, Cui, Peng, Jia, Xu, Yang, Shiqiang, Tian, Qi

Few-shot learning has attracted intensive research attention in recent years. Many methods have been proposed to generalize a model learned from provided base classes to novel classes, but no previous work studies how to select base classes, or even whether different base classes will result in different generalization performance of the learned model. In this paper, we utilize a simple yet effective measure, the Similarity Ratio, as an indicator for the generalization performance of a few-shot model. We then formulate the base class selection problem as a submodular optimization problem over Similarity Ratio. We further provide theoretical analysis on the optimization lower bound of different optimization methods, which could be used to identify the most appropriate algorithm for different experimental settings. The extensive experiments on ImageNet, Caltech256 and CUB-200-2011 demonstrate that our proposed method is effective in selecting a better base dataset.
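
For readers unfamiliar with submodular selection, a greedy routine is the usual starting point: repeatedly add the candidate with the largest marginal gain. The sketch below uses a toy coverage function as a stand-in objective. It is not the Similarity Ratio from the abstract, and the class names and trait sets are hypothetical.

```python
def greedy_select(candidates, value, budget):
    """Greedily pick up to `budget` items, each time adding the item with
    the largest marginal gain in value(selected)."""
    selected = []
    remaining = list(candidates)
    for _ in range(min(budget, len(remaining))):
        base = value(selected)
        best = max(remaining, key=lambda c: value(selected + [c]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: each base class "covers" a set of traits.
covers = {
    "dog": {"fur", "tail", "snout"},
    "cat": {"fur", "tail", "whiskers"},
    "bird": {"wings", "beak"},
    "fish": {"fins"},
}


def coverage(selected):
    """Number of distinct traits covered: a monotone submodular function."""
    return len(set().union(*(covers[c] for c in selected))) if selected else 0


chosen = greedy_select(covers, coverage, budget=2)
```

After picking "dog", the greedy step prefers "bird" over "cat" because "cat" adds only one new trait. This diminishing-returns behavior is the property that defines submodularity.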

algorithm, base class, novel class, (13 more...)

2004.00315

Country:

North America > United States > California (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Vlaski, Stefan, Sayed, Ali H.

Second-Order Guarantees in Centralized, Federated and Decentralized Nonconvex Optimization

arXiv.org Machine Learning, Mar-31-2020

Rapid advances in data collection and processing capabilities have allowed for the use of increasingly complex models that give rise to nonconvex optimization problems. These formulations, however, can be arbitrarily difficult to solve in general, in the sense that even simply verifying that a given point is a local minimum can be NP-hard [1]. Still, some relatively simple algorithms have been shown to lead to surprisingly good empirical results in many contexts of interest. Perhaps the most prominent example is the success of the backpropagation algorithm for training neural networks. Several recent works have pursued rigorous analytical justification for this phenomenon by studying the structure of the nonconvex optimization problems and establishing that simple algorithms, such as gradient descent and its variations, perform well in converging towards local minima and avoiding saddle-points. A key insight in these analyses is that gradient perturbations play a critical role in allowing local descent algorithms to efficiently distinguish desirable from undesirable stationary points and escape from the latter. In this article, we cover recent results on second-order guarantees for stochastic first-order optimization algorithms in centralized, federated, and decentralized architectures. A key desirable feature of automated learning algorithms is the ability to learn models directly from data with minimal need for direct intervention by the designer. The authors are with the Institute of Electrical Engineering, École Polytechnique Fédérale de Lausanne.
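
The role of gradient perturbations can be seen on a two-dimensional toy problem. The sketch below is an illustration only, not an algorithm from the article: the test function, step size, noise level, and function names are all assumptions. Exact gradient descent started on the line y = 0 slides into the saddle point at the origin, while the same iteration with small Gaussian gradient noise drifts off that line and settles near one of the two local minima.

```python
import random


def grad(x, y):
    """Gradient of f(x, y) = x**2 + y**4 / 4 - y**2 / 2.
    f has a saddle point at (0, 0) and local minima at (0, 1) and (0, -1)."""
    return 2.0 * x, y ** 3 - y


def descend(x, y, noise_std, steps=2000, lr=0.1, seed=0):
    """Gradient descent with optional Gaussian gradient perturbations."""
    rng = random.Random(seed)
    for _ in range(steps):
        gx, gy = grad(x, y)
        gx += rng.gauss(0.0, noise_std)
        gy += rng.gauss(0.0, noise_std)
        x, y = x - lr * gx, y - lr * gy
    return x, y


# Starting on the line y = 0, exact gradients never leave it.
plain = descend(1.0, 0.0, noise_std=0.0)
# With perturbations, the unstable y-direction at the saddle is explored.
noisy = descend(1.0, 0.0, noise_std=0.01)
```

Here `plain` ends at the saddle, whereas `noisy` ends close to (0, 1) or (0, -1), depending on the random draws.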

algorithm, second-order guarantee, stationary point, (12 more...)

2003.14366

Country:

Europe > Switzerland > Vaud > Lausanne (0.24)
Asia > Middle East > Jordan (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)

Coley, Connor W., Eyke, Natalie S., Jensen, Klavs F.

Autonomous discovery in the chemical sciences part I: Progress

arXiv.org Artificial Intelligence, Mar-30-2020

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this first part, we describe a classification for discoveries of physical matter (molecules, materials, devices), processes, and models and how they are unified as search problems. We then introduce a set of questions and considerations relevant to assessing the extent of autonomy. Finally, we describe many case studies of discoveries accelerated by or resulting from computer assistance and automation from the domains of synthetic chemistry, drug discovery, inorganic chemistry, and materials science. These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modelling. Part two reflects on these case studies and identifies a set of open challenges for the field.

chem, scientific discovery, upstream oil & gas, (25 more...)

doi: 10.1002/anie.201909987

2003.13754

Country:

Europe > Germany (0.27)
Asia > Middle East (0.27)
Africa (0.27)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.45)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Deng, Yuyang, Kamani, Mohammad Mahdi, Mahdavi, Mehrdad

Adaptive Personalized Federated Learning

Investigation of the degree of personalization in federated learning algorithms has shown that only maximizing the performance of the global model will confine the capacity of the local models to personalize. In this paper, we advocate an adaptive personalized federated learning (APFL) algorithm, where each client will train their local models while contributing to the global model. Theoretically, we show that the mixture of local and global models can reduce the generalization error, using the multi-domain learning theory. We also propose a communication-reduced bilevel optimization method, which reduces the communication rounds to $O(\sqrt{T})$ and show that under strong convexity and smoothness assumptions, the proposed algorithm can achieve a convergence rate of $O(1/T)$ with some residual error. The residual error is related to the gradient diversity among local models, and the gap between optimal local and global models.
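
The "mixture of local and global models" can be pictured as a convex combination of parameter vectors. The sketch below assumes exactly that, with a per-client mixing weight called `alpha` and parameters stored as flat lists. The name, the representation, and the example values are assumptions made for illustration, and the training and communication steps of APFL are not shown.

```python
def mix_models(local_params, global_params, alpha):
    """Convex combination alpha * local + (1 - alpha) * global, element-wise."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * l + (1.0 - alpha) * g
            for l, g in zip(local_params, global_params)]


local_params = [2.0, 0.0, -1.0]   # hypothetical client weights
global_params = [0.0, 4.0, 1.0]   # hypothetical shared weights

personalized = mix_models(local_params, global_params, alpha=0.25)
```

Setting `alpha` to 0 recovers the global model and setting it to 1 recovers the purely local one. Intermediate values trade off between the two.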

artificial intelligence, global model, machine learning, (17 more...)

2003.13461

Country:

North America > United States > Virginia (0.04)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Revisiting "Over-smoothing" in Deep GCNs

Yang, Chaoqi, Wang, Ruijie, Yao, Shuochao, Liu, Shengzhong, Abdelzaher, Tarek

Oversmoothing has been assumed to be the major cause of performance drop in deep graph convolutional networks (GCNs). The evidence is usually derived from Simple Graph Convolution (SGC), a linear variant of GCNs. In this paper, we revisit graph node classification from an optimization perspective and argue that GCNs can actually learn anti-oversmoothing, whereas overfitting is the real obstacle in deep GCNs. This work interprets GCNs and SGCs as two-step optimization problems and provides the reason why deep SGC suffers from oversmoothing but deep GCNs do not. Our conclusion is compatible with the previous understanding of SGC, but we clarify why the same reasoning does not apply to GCNs. Based on our formulation, we provide more insights into the convolution operator and further propose a mean-subtraction trick to accelerate the training of deep GCNs. We verify our theory and propositions on three graph benchmarks. The experiments show that (i) in GCN, overfitting leads to the performance drop and oversmoothing does not exist even when the model becomes very deep (100 layers); (ii) mean-subtraction speeds up model convergence while retaining the same expressive power; (iii) the weight of neighbor averaging (1 is the common setting) does not significantly affect the model performance once it is above the threshold (0.5).
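
One plausible reading of a mean-subtraction step is to center each feature column across all nodes before the next layer. The sketch below assumes that reading. It is not taken from the paper, and the function name and example values are hypothetical.

```python
def subtract_mean(features):
    """Center each feature column by removing its average over all nodes.
    features: list of per-node feature lists, all of the same length."""
    n = len(features)
    means = [sum(col) / n for col in zip(*features)]
    return [[x - m for x, m in zip(row, means)] for row in features]


# Hypothetical hidden features for three nodes, two features each.
hidden = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
centered = subtract_mean(hidden)
```

After centering, every column sums to zero, so the component shared by all nodes is removed and only the differences between nodes remain.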

arxiv preprint arxiv, deep gcn, gcn, (12 more...)

2003.13663

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Andersson, Leif Erik, Doekemeijer, Bart, van der Hoek, Daan, van Wingerden, Jan-Willem, Imsland, Lars

Adaptation of Engineering Wake Models using Gaussian Process Regression and High-Fidelity Simulation Data

This article investigates the optimization of yaw control inputs of a nine-turbine wind farm. The wind farm is simulated using the high-fidelity simulator SOWFA. The optimization is performed with a modifier adaptation scheme based on Gaussian processes. Modifier adaptation corrects for the mismatch between plant and model and helps to converge to the actual plant optimum. In the case study, the modifier adaptation approach is compared with the Bayesian optimization approach. Moreover, the use of two different covariance functions in the Gaussian process regression is discussed. Practical recommendations concerning the data preparation and application of the approach are given. It is shown that both the modifier adaptation and the Bayesian optimization approach can improve the power production with overall smaller yaw misalignments in comparison to the Gaussian wake model.

ma-gp approach, power production, turbine, (13 more...)

2003.13323

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Andonie, Razvan, Florea, Adrian-Catalin

Weighted Random Search for CNN Hyperparameter Optimization

Nearly all machine learning algorithms use two different sets of parameters: the training parameters and the meta-parameters (hyperparameters). While the training parameters are learned during the training phase, the values of the hyperparameters have to be specified before learning starts. For a given dataset, we would like to find the optimal combination of hyperparameter values in a reasonable amount of time. This is a challenging task because of its computational complexity. In previous work [11], we introduced the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic. In the current paper, we compare the WRS method with several state-of-the-art hyperparameter optimization methods with respect to Convolutional Neural Network (CNN) hyperparameter optimization. The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values. According to our experiments, the WRS algorithm outperforms the other methods.
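
For context, the plain Random Search (RS) baseline that WRS builds on can be sketched in a few lines. This is not WRS, because the abstract does not spell out its weighting or greedy step. The search space, the scoring function, and all names below are hypothetical stand-ins; a real run would train a CNN and return its validation accuracy.

```python
import random


def random_search(space, score, trials, seed=0):
    """Sample `trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        s = score(config)
        if s > best_score:
            best_config, best_score = config, s
    return best_config, best_score


# Hypothetical search space with three hyperparameters.
space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "layers": [2, 3, 4],
}


def toy_score(config):
    """Stand-in for validation accuracy, peaking at (0.01, 32, 3)."""
    return (-abs(config["learning_rate"] - 0.01)
            - abs(config["batch_size"] - 32) / 100
            - abs(config["layers"] - 3) / 10)


best_config, best_score = random_search(space, toy_score, trials=50)
```

The comparison criterion in the abstract, accuracy within a fixed number of tested combinations, corresponds to fixing `trials` across methods.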

hyperparameter, hyperparameter optimization, optimization, (16 more...)

doi: 10.15837/ijccc.2020.2.3868

2003.133

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.91)

Martínez-Álvarez, F., Asencio-Cortés, G., Torres, J. F., Gutiérrez-Avilés, D., Melgar-García, L., Pérez-Chacón, R., Rubio-Escudero, C., Riquelme, J. C., Troncoso, A.

Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model

arXiv.org Artificial Intelligence, Mar-30-2020

A novel bioinspired metaheuristic is proposed in this work, simulating how the coronavirus spreads and infects healthy people. From an initial individual (the patient zero), the coronavirus infects new patients at known rates, creating new populations of infected people. Every individual can either die or infect and, afterwards, be sent to the recovered population. Relevant terms such as re-infection probability, super-spreading rate or traveling rate are introduced in the model in order to simulate the coronavirus activity as accurately as possible. The Coronavirus Optimization Algorithm has two major advantages compared to other similar strategies. First, the input parameters are already set according to the disease statistics, preventing researchers from initializing them with arbitrary values. Second, the approach is able to end after several iterations, without this value having to be set either. The infected population initially grows at an exponential rate but, after some iterations, the high number of recovered and dead people starts decreasing the number of infected people in new iterations. As an application case, it has been used to train a deep learning model for electricity load forecasting, showing quite remarkable results after a few iterations.

codification, infected individual, iteration, (13 more...)

2003.13633

Country:

Europe > Spain > Andalusia > Seville Province > Seville (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Italy (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)