Artificial Intelligence 2018: Build the Most Powerful AI - Learn, build and implement the most powerful AI model at home. Created by Hadelin de Ponteves, Kirill Eremenko, Ligency Team English [Auto], Indonesian [Auto] Preview this Course GET COUPON CODE Understand the theory behind augmented random search algorithm Learn how to build most powerful AI algorithm Train and implement ARS algorithm Train AI to solve same challenges as Google Deep Mind Two months ago we discovered that a very new kind of AI was invented. The kind of AI which is based on a genius idea and that you can build from scratch and without the need for any framework. We checked that out, we built it, and... the results are absolutely insane! This game-changing AI called Augmented Random Search, ARS for short.
Digitization is penetrating more and more areas of life. Tasks are increasingly being completed digitally, and are therefore not only fulfilled faster, more efficiently but also more purposefully and successfully. The rapid developments in the field of artificial intelligence in recent years have played a major role in this, as they brought up many helpful approaches to build on. At the same time, the eyes, their movements, and the meaning of these movements are being progressively researched. The combination of these developments has led to exciting approaches. In this dissertation, I present some of these approaches which I worked on during my Ph.D. First, I provide insight into the development of models that use artificial intelligence to connect eye movements with visual expertise. This is demonstrated for two domains or rather groups of people: athletes in decision-making actions and surgeons in arthroscopic procedures. The resulting models can be considered as digital diagnostic models for automatic expertise recognition. Furthermore, I show approaches that investigate the transferability of eye movement patterns to different expertise domains and subsequently, important aspects of techniques for generalization. Finally, I address the temporal detection of confusion based on eye movement data. The results suggest the use of the resulting model as a clock signal for possible digital assistance options in the training of young professionals. An interesting aspect of my research is that I was able to draw on very valuable data from DFB youth elite athletes as well as on long-standing experts in arthroscopy. In particular, the work with the DFB data attracted the interest of radio and print media, namely DeutschlandFunk Nova and SWR DasDing. All resulting articles presented here have been published in internationally renowned journals or at conferences.
Spatial-query-by-sketch is an intuitive tool to explore human spatial knowledge about geographic environments and to support communication with scene database queries. However, traditional sketch-based spatial search methods perform insufficiently due to their inability to find hidden multi-scale map features from mental sketches. In this research, we propose a deep convolutional neural network, namely Deep Spatial Scene Network (DeepSSN), to better assess the spatial scene similarity. In DeepSSN, a triplet loss function is designed as a comprehensive distance metric to support the similarity assessment. A positive and negative example mining strategy using qualitative constraint networks in spatial reasoning is designed to ensure a consistently increasing distinction of triplets during the training process. Moreover, we develop a prototype spatial scene search system using the proposed DeepSSN, in which the users input spatial query via sketch maps and the system can automatically augment the sketch training data. The proposed model is validated using multi-source conflated map data including 131,300 labeled scene samples after data augmentation. The empirical results demonstrate that the DeepSSN outperforms baseline methods including k-nearest-neighbors, multilayer perceptron, AlexNet, DenseNet, and ResNet using mean reciprocal rank and precision metrics. This research advances geographic information retrieval studies by introducing a novel deep learning method tailored to spatial scene queries.
We show how graph neural networks can be used to solve the canonical graph coloring problem. We frame graph coloring as a multi-class node classification problem and utilize an unsupervised training strategy based on the statistical physics Potts model. Generalizations to other multi-class problems such as community detection, data clustering, and the minimum clique cover problem are straightforward. We provide numerical benchmark results and illustrate our approach with an end-to-end application for a real-world scheduling use case within a comprehensive encode-process-decode framework. Our optimization approach performs on par or outperforms existing solvers, with the ability to scale to problems with millions of variables.
Continual learning (CL) (Ring, 1995; Thrun, 1995) is a branch of machine learning where the model is exposed to a sequence of tasks with the hope of exploiting existing knowledge to adapt quickly to new tasks. The research in continual learning has seen a surge in the past few years with the explicit focus of developing algorithms that can alleviate catastrophic forgetting (McCloskey and Cohen, 1989)--whereby the model abruptly forgets the information of the past when trained on new tasks. While most of the research in continual learning is focused on developing learning algorithms, that can perform better than naive fine-tuning on a stream of data, the role of model architecture, to the best of our knowledge, is not explicitly studied in any of the existing works. Even the class of parameter isolation or expansion-based methods, for example (Rusu et al., 2016; Yoon et al., 2018), have a cursory focus on the model architecture insofar that they assume a specific architecture and try to find an algorithm operating on the architecture. Orthogonal to this direction for designing algorithms, our motivation is that the inductive biases induced by different architectural components are important for continual learning. We seek to characterize the implication of different architectural choices. To motivate, consider a ResNet-18 model (He et al., 2016) on Split CIFAR-100, where CIFAR-100 dataset (Krizhevsky et al., 2009) is split into 20 disjoint sets--a prevalent architecture and benchmark in the existing continual learning works. Figure 1a shows that explicitly designed CL algorithms, EWC (Kirkpatrick et al., 2017) (a parameter regularization-based method) and experience replay (Riemer et al., Work done during an internship at DeepMind.
Bliek, Laurens, da Costa, Paulo, Afshar, Reza Refaei, Zhang, Yingqian, Catshoek, Tom, Vos, Daniël, Verwer, Sicco, Schmitt-Ulms, Fynn, Hottung, André, Shah, Tapan, Sellmann, Meinolf, Tierney, Kevin, Perreault-Lafleur, Carl, Leboeuf, Caroline, Bobbio, Federico, Pepin, Justine, Silva, Warley Almeida, Gama, Ricardo, Fernandes, Hugo L., Zaefferer, Martin, López-Ibáñez, Manuel, Irurozki, Ekhine
The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the participants to develop algorithms to solve a time-dependent orienteering problem with stochastic weights and time windows (TD-OPSWTW). It focused on two types of learning approaches: surrogate-based optimization and deep reinforcement learning. In this paper, we describe the problem, the setup of the competition, the winning methods, and give an overview of the results. The winning methods described in this work have advanced the state-of-the-art in using AI for stochastic routing problems. Overall, by organizing this competition we have introduced routing problems as an interesting problem setting for AI researchers. The simulator of the problem has been made open-source and can be used by other researchers as a benchmark for new AI methods.
Liu, Zhengying, Pavao, Adrien, Xu, Zhen, Escalera, Sergio, Ferreira, Fabio, Guyon, Isabelle, Hong, Sirui, Hutter, Frank, Ji, Rongrong, Junior, Julio C. S. Jacques, Li, Ge, Lindauer, Marius, Luo, Zhipeng, Madadi, Meysam, Nierhoff, Thomas, Niu, Kangning, Pan, Chunguang, Stoll, Danny, Treguer, Sebastien, Wang, Jin, Wang, Peng, Wu, Chenglin, Xiong, Youcheng, Zela, Arbe r, Zhang, Yang
This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".
Increasing complexity comes from some factors including uncertainty, ambiguity, inconsistency, multiple dimensionalities, increasing the number of effective factors and relation between them. Some of these features are common among most real-world problems which are considered complex and dynamic problems. In other words, since the data and relations in real world applications are usually highly complex and inaccurate, modeling real complex systems based on observed data is a challenging task especially for large scale, inaccurate and non stationary datasets. Therefore, to cover and address these difficulties, the existence of a computational system with the capability of extracting knowledge from the complex system with the ability to simulate its behavior is essential. In other words, it is needed to find a robust approach and solution to handle real complex problems in an easy and meaningful way . Hard computing methods depend on quantitative values with expensive solutions and lack of ability to represent the problem in real life due to some uncertainties. In contrast, soft computing approaches act as alternative tools to deal with the reasoning of complex problems . Using soft computing methods such as fuzzy logic, neural network, genetic algorithms or a combination of these allows achieving robustness, tractable and more practical solutions. Generally, two types of methods are used for analyzing and modeling dynamic systems including quantitative and qualitative approaches.
Artificial intelligence (AI) has become a part of everyday conversation and our lives. It is considered as the new electricity that is revolutionizing the world. AI is heavily invested in both industry and academy. However, there is also a lot of hype in the current AI debate. AI based on so-called deep learning has achieved impressive results in many problems, but its limits are already visible. AI has been under research since the 1940s, and the industry has seen many ups and downs due to over-expectations and related disappointments that have followed. The purpose of this book is to give a realistic picture of AI, its history, its potential and limitations. We believe that AI is a helper, not a ruler of humans. We begin by describing what AI is and how it has evolved over the decades. After fundamentals, we explain the importance of massive data for the current mainstream of artificial intelligence. The most common representations for AI, methods, and machine learning are covered. In addition, the main application areas are introduced. Computer vision has been central to the development of AI. The book provides a general introduction to computer vision, and includes an exposure to the results and applications of our own research. Emotions are central to human intelligence, but little use has been made in AI. We present the basics of emotional intelligence and our own research on the topic. We discuss super-intelligence that transcends human understanding, explaining why such achievement seems impossible on the basis of present knowledge,and how AI could be improved. Finally, a summary is made of the current state of AI and what to do in the future. In the appendix, we look at the development of AI education, especially from the perspective of contents at our own university.
Graph machine learning has been extensively studied in both academic and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To tackle the challenge, automated graph machine learning, which aims at discovering the best hyper-parameter and neural architecture configuration for different graph tasks/data without manual design, is gaining an increasing number of attentions from the research community. In this paper, we extensively discuss automated graph machine approaches, covering hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We briefly overview existing libraries designed for either graph machine learning or automated machine learning respectively, and further in depth introduce AutoGL, our dedicated and the world's first open-source library for automated graph machine learning. Last but not least, we share our insights on future research directions for automated graph machine learning. This paper is the first systematic and comprehensive discussion of approaches, libraries as well as directions for automated graph machine learning.