Neural Networks with Complex-Valued Weights Have No Spurious Local Minima
We study the benefits of complex-valued weights for neural networks. We prove that shallow complex neural networks with quadratic activations have no spurious local minima. In contrast, shallow real neural networks with quadratic activations have infinitely many spurious local minima under the same conditions. In addition, we provide specific examples to demonstrate that complex-valued weights turn poor local minima into saddle points. The activation function CReLU is also discussed to illustrate the superiority of analytic activations in complex-valued neural networks.
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Hoefler, Torsten, Alistarh, Dan, Ben-Nun, Tal, Dryden, Nikoli, Peste, Alexandra
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
Using Multiple Generative Adversarial Networks to Build Better-Connected Levels for Mega Man
Capps, Benjamin, Schrum, Jacob
Generative Adversarial Networks (GANs) can generate levels for a variety of games. This paper focuses on combining GAN-generated segments in a snaking pattern to create levels for Mega Man. Adjacent segments in such levels can be orthogonally adjacent in any direction, meaning that an otherwise fine segment might impose a barrier between itself and its neighbor depending on what sorts of segments in the training set are being most closely emulated: horizontal, vertical, or corner segments. To pick appropriate segments, multiple GANs were trained on different types of segments to ensure better flow between segments. Flow was further improved by evolving the latent vectors for the segments being joined in the level to maximize the length of the level's solution path. Using multiple GANs to represent different types of segments results in significantly longer solution paths than using one GAN for all segment types, and a human subject study verifies that these levels are more fun and have more human-like design than levels produced by one GAN.
An evolutionary view on the emergence of Artificial Intelligence
Leusin, Matheus E., Jindra, Bjoern, Hain, Daniel S.
This paper draws upon the evolutionary concepts of technological relatedness and knowledge complexity to enhance our understanding of the long-term evolution of Artificial Intelligence (AI). We reveal corresponding patterns in the emergence of AI, both globally and in the context of specific geographies: the US, Japan, South Korea, and China. We argue that AI emergence is associated with increasing related variety due to knowledge commonalities as well as increasing complexity. We use patent-based indicators for the period 1974-2018 to analyse the evolution of AI's global technological space, to identify its technological core as well as changes to its overall relatedness and knowledge complexity. At the national level, we also measure countries' overall specialisations against AI-specific ones. At the global level, we find increasing overall relatedness and complexity of AI. However, for the technological core of AI, which has been stable over time, we find decreasing related variety and increasing complexity. This evidence indicates that AI innovations related to core technologies are becoming increasingly distinct from each other. At the country level, we find that the US and Japan have been increasing the overall relatedness of their innovations. The opposite is the case for China and South Korea, which we associate with the fact that these countries are overall less technologically developed than the US and Japan. Finally, we observe a steadily increasing overall complexity for all countries apart from China, which we explain by this country's focus on technologies not strongly linked to AI.
A Review on Deep Learning in UAV Remote Sensing
Osco, Lucas Prado, Junior, José Marcato, Ramos, Ana Paula Marques, Jorge, Lúcio André de Castro, Fatholahi, Sarah Narges, Silva, Jonathan de Andrade, Matsubara, Edson Takashi, Pistori, Hemerson, Gonçalves, Wesley Nunes, Li, Jonathan
Deep Neural Networks (DNNs) learn representations from data with impressive capability and have brought important breakthroughs in processing images, time series, natural language, audio, video, and many other data types. In the remote sensing field, surveys and literature reviews specifically covering applications of DNN algorithms have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicle (UAV) based applications have dominated aerial sensing research. However, a literature review that combines both the "deep learning" and "UAV remote sensing" themes has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied to UAV-based imagery. We focused mainly on describing classification and regression techniques used in recent applications with UAV-acquired data. To that end, a total of 232 papers published in international scientific journal databases were examined. We gathered the published material and evaluated its characteristics regarding application, sensor, and technique used. We report how DL presents promising results and has the potential for processing tasks associated with UAV-based image data. Lastly, we project future perspectives, commenting on prominent DL paths to be explored in the UAV remote sensing field. Our review offers a friendly approach to introducing, commenting on, and summarizing the state of the art in UAV-based image applications with DNN algorithms in diverse subfields of remote sensing, grouping them into environmental, urban, and agricultural contexts.
Tree-based Node Aggregation in Sparse Graphical Models
High-dimensional graphical models are often estimated using regularization that is aimed at reducing the number of edges in a network. In this work, we show how even simpler networks can be produced by aggregating the nodes of the graphical model. We develop a new convex regularized method, called the tree-aggregated graphical lasso or tag-lasso, that estimates graphical models that are both edge-sparse and node-aggregated. The aggregation is performed in a data-driven fashion by leveraging side information in the form of a tree that encodes node similarity and facilitates the interpretation of the resulting aggregated nodes. We provide an efficient implementation of the tag-lasso by using the locally adaptive alternating direction method of multipliers and illustrate our proposal's practical advantages in simulation and in applications in finance and biology.
Covariance Prediction via Convex Optimization
We consider the problem of predicting the covariance of a zero mean Gaussian vector, based on another feature vector. We describe a covariance predictor that has the form of a generalized linear model, i.e., an affine function of the features followed by an inverse link function that maps vectors to symmetric positive definite matrices. The log-likelihood is a concave function of the predictor parameters, so fitting the predictor involves convex optimization. Such predictors can be combined with others, or recursively applied to improve performance.
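The predictor structure described above (an affine function of the features followed by an inverse link onto symmetric positive definite matrices) can be illustrated with a minimal sketch. The specific link used here, which fills a lower-triangular factor and exponentiates its diagonal, is an assumption chosen for simplicity; it is not claimed to be the authors' link function or to preserve the concavity property stated in the abstract. The names `vec_to_spd` and `predict_covariance` are hypothetical.

```python
import numpy as np

def vec_to_spd(theta, n):
    """Map a vector of length n*(n+1)/2 to an n x n symmetric positive definite matrix.

    Illustrative inverse link (an assumption, not the paper's): fill a
    lower-triangular factor L, make its diagonal strictly positive, return L @ L.T.
    """
    L = np.zeros((n, n))
    L[np.tril_indices(n)] = theta   # fill the lower triangle row by row
    d = np.diag_indices(n)
    L[d] = np.exp(L[d])             # a positive diagonal makes L invertible
    return L @ L.T                  # symmetric positive definite by construction

def predict_covariance(x, A, b, n):
    """Affine function of the features x, followed by the inverse link."""
    theta = A @ x + b
    return vec_to_spd(theta, n)

# Hypothetical parameters: n = 3 outputs, p = 4 features.
n, p = 3, 4
A = 0.1 * np.ones((n * (n + 1) // 2, p))
b = np.zeros(n * (n + 1) // 2)
Sigma = predict_covariance(np.ones(p), A, b, n)
```

Whatever values the parameters `A` and `b` take, the output is a valid covariance matrix, which is the practical point of composing an affine map with such a link.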
Deep learning via LSTM models for COVID-19 infection forecasting in India
Chandra, Rohitash, Jain, Ayush, Chauhan, Divyanshu Singh
We have entered an era of a pandemic that has shaken the world with major impact on medical systems, economics and agriculture. Prominent computational and mathematical models have been unreliable due to the complexity of the spread of infections. Moreover, lack of data collection and reporting makes any such modelling attempts unreliable. Hence we need to re-look at the situation with the latest data sources and most comprehensive forecasting models. Deep learning models such as recurrent neural networks are well suited for modelling temporal sequences. In this paper, we employ prominent recurrent neural networks, in particular long short-term memory (LSTM) networks, bidirectional LSTM, and encoder-decoder LSTM models, for multi-step (short-term) forecasting of the spread of COVID-19 infections among selected states in India. We select states with COVID-19 hotspots in terms of the rate of infections, compare them with states where infections have been contained or have reached their peak, and provide a two-months-ahead forecast that shows that cases will slowly decline. Our results show that long-term forecasts are promising, which motivates the application of the method in other countries or areas. We note that although we made some progress in forecasting, the challenges in modelling remain due to data and the difficulty in capturing factors such as population density, travel logistics, and social aspects such as culture and lifestyle.
A Survey on Personality-Aware Recommendation Systems
Dhelim, Sahraoui, Aung, Nyothiri, Bouras, Mohammed Amine, Ning, Huansheng, Cambria, Erik
With the emergence of personality computing as a new research field related to artificial intelligence and personality psychology, we have witnessed an unprecedented proliferation of personality-aware recommendation systems. Unlike conventional recommendation systems, these new systems solve traditional problems such as the cold start and data sparsity problems. This survey aims to study and systematically classify personality-aware recommendation systems. To the best of our knowledge, this survey is the first that focuses on personality-aware recommendation systems. We explore the different design choices of personality-aware recommendation systems, by comparing their personality modeling methods, as well as their recommendation techniques. Furthermore, we present the commonly used datasets and point out some of the challenges of personality-aware recommendation systems.
PIG-Net: Inception based Deep Learning Architecture for 3D Point Cloud Segmentation
Hegde, Sindhu, Gangisetty, Shankar
Point clouds, being a simple and compact representation of the surface geometry of 3D objects, have gained increasing popularity with the evolution of deep learning networks for classification and segmentation tasks. Unlike for humans, analyzing the segments of an object is a challenging task for a machine, yet it is essential in various machine vision applications. In this paper, we address the problem of segmentation and labelling of 3D point clouds by proposing an inception-based deep network architecture called PIG-Net, which effectively characterizes the local and global geometric details of the point clouds. In PIG-Net, the local features are extracted from the transformed input points using the proposed inception layers and then aligned by a feature transform. These local features are aggregated using a global average pooling layer to obtain the global features. Finally, the concatenated local and global features are fed to the convolution layers for segmenting the 3D point clouds. We perform an exhaustive experimental analysis of the PIG-Net architecture on two state-of-the-art datasets, namely ShapeNet [1] and PartNet [2]. We evaluate the effectiveness of our network by performing an ablation study.
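The aggregation step in this abstract (global average pooling over per-point local features, then concatenation of local and global features) can be sketched in NumPy as follows. This is only an illustration of that one step under assumed shapes; it is not the PIG-Net implementation, and the function name `fuse_local_global` is hypothetical.

```python
import numpy as np

def fuse_local_global(local_feats):
    """Concatenate per-point local features with a pooled global descriptor.

    local_feats: array of shape (num_points, channels), assumed to come from
    some upstream feature extractor (not shown here).
    Returns an array of shape (num_points, 2 * channels).
    """
    # Global average pooling: average each channel over all points.
    global_feat = local_feats.mean(axis=0)
    # Repeat the global descriptor once per point.
    tiled = np.broadcast_to(global_feat, local_feats.shape)
    # Each point now carries its own features plus the shared global context.
    return np.concatenate([local_feats, tiled], axis=1)

# Toy input: 3 points with 2 feature channels each.
feats = np.arange(6.0).reshape(3, 2)
fused = fuse_local_global(feats)
```

Giving every point a copy of the global descriptor lets a per-point classifier downstream use both fine local geometry and whole-object context.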