AITopics

doi: 10.1002/anie.201909987

2003.13754

Country:

Europe > Germany (0.27)
Asia > Middle East (0.27)
Africa (0.27)
(4 more...)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.45)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

Andonie, Razvan, Florea, Adrian-Catalin

Weighted Random Search for CNN Hyperparameter Optimization

arXiv.org Machine Learning, Mar-30-2020

Nearly all model algorithms used in machine learning use two different sets of parameters: the training parameters and the meta-parameters (hyperparameters). While the training parameters are learned during the training phase, the values of the hyperparameters have to be specified before learning starts. For a given dataset, we would like to find the optimal combination of hyperparameter values in a reasonable amount of time. This is a challenging task because of its computational complexity. In previous work [11], we introduced the Weighted Random Search (WRS) method, a combination of Random Search (RS) and a probabilistic greedy heuristic. In the current paper, we compare the WRS method with several state-of-the-art hyperparameter optimization methods with respect to Convolutional Neural Network (CNN) hyperparameter optimization. The criterion is the classification accuracy achieved within the same number of tested combinations of hyperparameter values. According to our experiments, the WRS algorithm outperforms the other methods.
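As a rough illustration of the baseline named in this abstract, the following is a minimal sketch of plain Random Search under a fixed budget of tested combinations. It is not the WRS method itself, whose weighting and greedy heuristic are not described here. The search space, the `toy_objective` function (a stand-in for validation accuracy), and all names are assumptions made for illustration.

```python
import random


def random_search(space, objective, n_trials, seed=0):
    """Plain random search: sample n_trials configurations, keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw every hyperparameter independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    # Hypothetical stand-in for "train a CNN and return its accuracy".
    return -abs(config["lr"] - 0.01) - 0.001 * abs(config["filters"] - 32)


# Hypothetical discrete search space.
space = {"lr": [0.1, 0.01, 0.001], "filters": [16, 32, 64]}
best_config, best_score = random_search(space, toy_objective, n_trials=200)
```

Comparing methods "within the same number of tested combinations" then amounts to giving each method the same `n_trials` budget.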

hyperparameter, hyperparameter optimization, optimization, (16 more...)

doi: 10.15837/ijccc.2020.2.3868

2003.133

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.91)

arXiv.org Machine Learning, Mar-29-2020

A General Large Neighborhood Search Framework for Solving Integer Programs

Song, Jialin, Lanka, Ravi, Yue, Yisong, Dilkina, Bistra

This paper studies how to design abstractions of large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general-purpose ways, and that are amenable to data-driven design. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic approaches and their software implementations. We also show that one can learn a good neighborhood selector from training data. Through an extensive empirical validation, we demonstrate that our LNS framework can significantly outperform state-of-the-art commercial solvers such as Gurobi in wall-clock time.
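The LNS loop described in this abstract can be sketched on a toy binary knapsack problem. This is only an illustration under stated assumptions: the neighborhood is chosen uniformly at random rather than by a learned selector, and brute-force enumeration of the free variables stands in for a real solver subroutine. The function name `lns` and its parameters are hypothetical.

```python
import itertools
import random


def lns(values, weights, capacity, subset_size=3, iters=200, seed=0):
    """Maximize sum(values[i] * x[i]) subject to
    sum(weights[i] * x[i]) <= capacity, with x[i] in {0, 1}."""
    rng = random.Random(seed)
    n = len(values)

    def value(sol):
        return sum(v * s for v, s in zip(values, sol))

    def weight(sol):
        return sum(w * s for w, s in zip(weights, sol))

    x = [0] * n  # the empty selection is feasible for non-negative weights
    for _ in range(iters):
        # Choose the subset of variables to re-optimize; the rest stay fixed.
        free = rng.sample(range(n), subset_size)
        best = x
        # "Sub-solver": enumerate all assignments of the free variables.
        for assignment in itertools.product([0, 1], repeat=subset_size):
            cand = list(x)
            for i, a in zip(free, assignment):
                cand[i] = a
            if weight(cand) <= capacity and value(cand) > value(best):
                best = cand
        x = best
    return x, value(x)


solution, total = lns([10, 7, 4, 3, 8, 6], [5, 4, 3, 2, 6, 3], capacity=10)
```

Because only strictly improving moves are accepted, the loop can stop at a local optimum; it is a heuristic, not an exact method.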

decomposition, optimization problem, solver, (15 more...)

2004.00422

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)

arXiv.org Machine Learning, Mar-27-2020

Incorporating Expert Prior in Bayesian Optimisation via Space Warping

Ramachandran, Anil, Gupta, Sunil, Rana, Santu, Li, Cheng, Venkatesh, Svetha

Bayesian optimisation is a well-known sample-efficient method for the optimisation of expensive black-box functions. However, when dealing with large search spaces, the algorithm goes through several low-function-value regions before reaching the optimum of the function. Since the function evaluations are expensive in terms of both money and time, it may be desirable to alleviate this problem. One approach to shortening this cold-start phase is to use prior knowledge that can accelerate the optimisation. In its standard form, Bayesian optimisation assumes that every point in the search space is equally likely to be the optimum. Therefore, any prior knowledge that provides information about the optimum of the function would improve the optimisation performance. In this paper, we represent the prior knowledge about the function optimum through a prior distribution. The prior distribution is then used to warp the search space in such a way that the space expands around the high-probability region of the function optimum and shrinks around the low-probability region. We incorporate this prior directly into the function model (a Gaussian process) by redefining the kernel matrix, which allows the method to work with any acquisition function, i.e., the approach is acquisition-agnostic. We show the superiority of our method over standard Bayesian optimisation through the optimisation of several benchmark functions and the hyperparameter tuning of two algorithms: Support Vector Machine (SVM) and Random Forest.

acquisition function, bayesian optimisation, optimisation, (14 more...)

doi: 10.1016/j.knosys.2020.105663

2003.1225

Country:

Oceania > Australia (0.14)
North America > United States > North Carolina (0.04)
Asia (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

arXiv.org Artificial Intelligence, Mar-24-2020

Planning with Brain-inspired AI

Arakawa, Naoya

This article surveys engineering and neuroscientific models of planning as a cognitive function, which is regarded as a typical function of fluid intelligence in the discussion of general intelligence. It aims to present existing planning models as references for realizing the planning function in brain-inspired AI or artificial general intelligence (AGI). It also proposes themes for the research and development of brain-inspired AI from the viewpoint of tasks and architecture.

information, pfc, representation, (15 more...)

2003.12353

Country:

Asia > Vietnam > Hanoi > Hanoi (0.06)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Colorado (0.04)

Genre:

Overview (0.48)
Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Painter, Michael, Lacerda, Bruno, Hawes, Nick

Convex Hull Monte-Carlo Tree Search

arXiv.org Artificial Intelligence, Mar-23-2020

This work investigates Monte-Carlo planning for agents in stochastic environments with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we consider how to pose the problem of approximating multi-objective planning solutions as a contextual multi-armed bandit problem, giving a principled motivation for how to select actions from the view of contextual regret. This leads us to the use of Contextual Zooming for action selection, yielding Zooming CHMCTS. We evaluate our algorithm using the Generalised Deep Sea Treasure environment, demonstrating that Zooming CHMCTS can achieve sublinear contextual regret and scales better than CHVI on a given computational budget.

algorithm, node, objective, (16 more...)

2003.04445

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Tholeti, Thulasi, Kalyani, Sheetal

Tune smarter not harder: A principled approach to tuning learning rates for shallow nets

arXiv.org Machine Learning, Mar-22-2020

Effective hyper-parameter tuning is essential to guarantee the performance that neural networks have come to be known for. In this work, a principled approach to choosing the learning rate is proposed for shallow feedforward neural networks. We associate the learning rate with the gradient Lipschitz constant of the objective to be minimized during training. An upper bound on this constant is derived, and a search algorithm that always results in non-divergent traces is proposed to exploit the derived bound. It is shown through simulations that the proposed search method significantly outperforms existing tuning methods such as Tree Parzen Estimators (TPE). The proposed method is applied to two existing applications, namely channel estimation in a wireless communication system and prediction of currency exchange rates, and it is shown to pick better learning rates than the existing methods using the same or less compute power.
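The link between the learning rate and the gradient Lipschitz constant can be illustrated on a much simpler model than the one treated in the paper. The sketch below swaps in linear least squares, f(w) = ||Xw - y||^2 / (2n), for the shallow network. It uses the standard fact that the gradient Lipschitz constant lambda_max(X^T X) / n is at most the sum of squared entries of X divided by n, and that gradient descent with step size 1 / L for any such upper bound L never increases the loss. This is not the bound or the search algorithm derived by the authors, and all function names are hypothetical.

```python
def loss(X, y, w):
    """Mean squared error with a 1/2 factor: ||Xw - y||^2 / (2n)."""
    n = len(X)
    residuals = (sum(xi * wi for xi, wi in zip(row, w)) - t for row, t in zip(X, y))
    return sum(r * r for r in residuals) / (2 * n)


def gradient(X, y, w):
    """Gradient of the loss above: X^T (Xw - y) / n."""
    n = len(X)
    g = [0.0] * len(w)
    for row, t in zip(X, y):
        r = sum(xi * wi for xi, wi in zip(row, w)) - t
        for j, xj in enumerate(row):
            g[j] += r * xj / n
    return g


def lipschitz_upper_bound(X):
    # lambda_max(X^T X) <= trace(X^T X) = sum of squared entries of X.
    return sum(x * x for row in X for x in row) / len(X)


def train(X, y, steps=100):
    lr = 1.0 / lipschitz_upper_bound(X)  # step size taken from the bound
    w = [0.0] * len(X[0])
    history = [loss(X, y, w)]
    for _ in range(steps):
        g = gradient(X, y, w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
        history.append(loss(X, y, w))
    return w, lr, history


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 2.0, 3.0]
w, lr, history = train(X, y)
```

A looser bound gives a smaller, safer step at the cost of slower progress, which is why a tight upper bound matters.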

algorithm, gradient lipschitz constant, neural network, (14 more...)

2003.09844

Country:

North America > United States > New York (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, Mar-22-2020

BS-NAS: Broadening-and-Shrinking One-Shot NAS with Searchable Numbers of Channels

Shen, Zan, Qian, Jiang, Zhuang, Bojin, Wang, Shaojun, Xiao, Jing

One-Shot methods have become some of the most popular methods in Neural Architecture Search (NAS) due to weight sharing and single training of a supernet. However, existing methods generally suffer from two issues: a predetermined number of channels in each layer, which is suboptimal; and model averaging effects and poor ranking correlation caused by weight coupling and a continuously expanding search space. To explicitly address these issues, this paper proposes a Broadening-and-Shrinking One-Shot NAS (BS-NAS) framework, in which 'broadening' refers to broadening the search space with a spring block that enables a search over the number of channels during training of the supernet, while 'shrinking' refers to a novel strategy that gradually turns off underperforming operations. These innovations broaden the search space for wider representation and then shrink it by gradually removing underperforming operations, followed by an evolutionary algorithm that efficiently searches for the optimal architecture. Extensive experiments on ImageNet illustrate the effectiveness of the proposed BS-NAS as well as its state-of-the-art performance.

operation, search space, supernet, (14 more...)

2003.09821

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Mar-22-2020

Generalized Nested Rollout Policy Adaptation

Cazenave, Tristan

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single-player games. In this paper, we propose to generalize NRPA with a temperature and a bias, and we analyze the resulting algorithms theoretically. The generalized algorithm is named GNRPA. Experiments show that it improves on NRPA in different application domains: SameGame and the Traveling Salesman Problem with Time Windows.

algorithm, possible move, sequence, (15 more...)

2003.10024

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.35)

Tu, Zhuozhuo, Zhang, Jingwei, Tao, Dacheng

Theoretical Analysis of Adversarial Learning: A Minimax Approach

Neural Information Processing Systems, Mar-20-2020, 13:31:18 GMT

In this paper, we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be transformed into a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of the Lipschitz condition. Our method can be applied to multi-class classification and to popular loss functions, including the hinge loss and the ramp loss.

adversarial learning, learning problem, theoretical analysis, (2 more...)

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)