Direction Concentration Learning: Enhancing Congruency in Machine Learning
Luo, Yan, Wong, Yongkang, Kankanhalli, Mohan S., Zhao, Qi
One of the well-known challenges in computer vision tasks is the visual diversity of images, which could result in an agreement or disagreement between the learned knowledge and the visual content exhibited by the current observation. In this work, we first define such an agreement in a concept learning process as congruency. Formally, given a particular task and a sufficiently large dataset, the congruency issue occurs in the learning process when the task-specific semantics in the training data are highly varying. We propose a Direction Concentration Learning (DCL) method to improve congruency in the learning process, where enhancing congruency makes the convergence path less circuitous. The experimental results show that the proposed DCL method generalizes to state-of-the-art models and optimizers, and improves performance on saliency prediction, continual learning, and classification tasks. Moreover, it helps mitigate the catastrophic forgetting problem in the continual learning task. The code is publicly available at https://github.com/luoyan407/congruency.
govtech_2019-12-22_23-08-52.xlsx
The graph represents a network of 3,290 Twitter users whose tweets in the requested range contained "govtech", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Monday, 23 December 2019 at 07:09 UTC. The requested start date was Monday, 23 December 2019 at 01:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 5,000. The tweets in the network were tweeted over the 13-day, 9-hour, 49-minute period from Monday, 09 December 2019 at 01:50 UTC to Sunday, 22 December 2019 at 11:39 UTC.
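The collection period quoted above can be checked with a few lines of date arithmetic. This is a minimal sketch using only the two timestamps given in the description; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Timestamps of the first and last tweet, as stated in the description (UTC).
first_tweet = datetime(2019, 12, 9, 1, 50, tzinfo=timezone.utc)
last_tweet = datetime(2019, 12, 22, 11, 39, tzinfo=timezone.utc)

span = last_tweet - first_tweet
hours, remainder = divmod(span.seconds, 3600)
minutes = remainder // 60

print(span.days, hours, minutes)  # prints: 13 9 49
```

The result matches the 13-day, 9-hour, 49-minute period reported for the network.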
Neural Networks with Cheap Differential Operators
Chen, Ricky T. Q., Duvenaud, David K.
Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity. We describe a family of restricted neural network architectures that allow efficient computation of a family of differential operators involving dimension-wise derivatives, used in cases such as computing the divergence. Our proposed architecture has a Jacobian matrix composed of diagonal and hollow (non-diagonal) components. We can then modify the backward computation graph to extract dimension-wise derivatives efficiently with automatic differentiation. We demonstrate these cheap differential operators for solving root-finding subproblems in implicit ODE solvers, exact density evaluation for continuous normalizing flows, and evaluating the Fokker-Planck equation for training stochastic differential equation models.
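To make "dimension-wise derivatives" concrete, the sketch below computes the divergence of a small vector field as the sum of the diagonal entries of its Jacobian, estimated by central finite differences. It is not the architecture from the abstract: it is the generic baseline, which needs one pair of function evaluations per dimension, and it is this per-dimension cost that the restricted architectures aim to avoid. The field `example_field` and the function names are assumptions made for illustration.

```python
def divergence_fd(field, x, eps=1e-5):
    """Estimate div f(x) = sum_i d f_i / d x_i with central differences.

    Only the diagonal of the Jacobian is needed, but each diagonal entry
    still costs two evaluations of the full vector field.
    """
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += eps
        backward[i] -= eps
        total += (field(forward)[i] - field(backward)[i]) / (2.0 * eps)
    return total


def example_field(x):
    # Hypothetical 2-D field: f(x) = (x0^2 + x1, x0 * x1).
    # Its exact divergence is 2*x0 + x0 = 3*x0.
    return [x[0] * x[0] + x[1], x[0] * x[1]]


estimate = divergence_fd(example_field, [1.0, 2.0])
print(estimate)  # close to the exact value 3.0
```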
The Parameterized Complexity of Cascading Portfolio Scheduling
Eiben, Eduard, Ganian, Robert, Kanj, Iyad, Szeider, Stefan
Cascading portfolio scheduling is a static algorithm selection strategy which uses a sample of test instances to compute an optimal ordering (a cascading schedule) of a portfolio of available algorithms. The algorithms are then applied to each future instance according to this cascading schedule, until some algorithm in the schedule succeeds. Cascading algorithm scheduling has proven to be effective in several applications, including QBF solving and the generation of ImageNet classification models. It is known that the computation of an optimal cascading schedule in the offline phase is NP-hard. In this paper we study the parameterized complexity of this problem and establish its fixed-parameter tractability by utilizing structural properties of the success relation between algorithms and test instances. Our findings are significant as they reveal that in spite of the intractability of the problem in its general form, one can indeed exploit sparseness or density of the success relation to obtain non-trivial runtime guarantees for finding an optimal cascading schedule.
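The idea of a cascading schedule can be illustrated with a brute-force search over orderings. The sketch assumes a simple cost model that the abstract does not state: each algorithm has a fixed cost per run, and the cost of an instance is the total cost of the algorithms tried until the first success. The algorithm names, costs, and success sets are made up. Trying every permutation is exponential in the portfolio size and is only a baseline, not the fixed-parameter algorithm from the paper.

```python
from itertools import permutations


def schedule_cost(order, cost, solves, instances):
    """Total cost of running the algorithms in `order` on every instance,
    stopping at the first algorithm that succeeds on that instance."""
    total = 0
    for inst in instances:
        for alg in order:
            total += cost[alg]
            if inst in solves[alg]:
                break
    return total


def best_schedule(cost, solves, instances):
    """Try every ordering of the portfolio and keep the cheapest one."""
    return min(
        permutations(cost),
        key=lambda order: schedule_cost(order, cost, solves, instances),
    )


# Hypothetical portfolio: per-run cost and the test instances each algorithm solves.
cost = {"fast": 1, "medium": 3, "slow": 10}
solves = {"fast": {0, 1}, "medium": {0, 1, 2}, "slow": {0, 1, 2, 3}}
instances = [0, 1, 2, 3]

order = best_schedule(cost, solves, instances)
print(order, schedule_cost(order, cost, solves, instances))
# prints: ('fast', 'medium', 'slow') 20
```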
Latent Ordinary Differential Equations for Irregularly-Sampled Time Series
Rubanova, Yulia, Chen, Ricky T. Q., Duvenaud, David K.
Time series with non-uniform intervals occur in many applications, and are difficult to model using standard recurrent neural networks (RNNs). We generalize RNNs to have continuous-time hidden dynamics defined by ordinary differential equations (ODEs), a model we call ODE-RNNs. Furthermore, we use ODE-RNNs to replace the recognition network of the recently-proposed Latent ODE model. Both ODE-RNNs and Latent ODEs can naturally handle arbitrary time gaps between observations, and can explicitly model the probability of observation times using Poisson processes. We show experimentally that these ODE-based models outperform their RNN-based counterparts on irregularly-sampled data.
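The ODE-RNN idea described above alternates two steps: between observations the hidden state evolves according to an ODE, and at each observation a standard RNN-style update is applied. The following is a minimal scalar sketch under several assumptions: fixed-step Euler integration stands in for a proper ODE solver, and the dynamics and update functions are hand-written placeholders rather than learned networks.

```python
import math


def evolve(h, t0, t1, dynamics, step=0.01):
    """Integrate dh/dt = dynamics(h) from t0 to t1 with fixed-step Euler."""
    n_steps = max(1, math.ceil((t1 - t0) / step))
    dt = (t1 - t0) / n_steps
    for _ in range(n_steps):
        h = h + dt * dynamics(h)
    return h


def ode_rnn(times, values, dynamics, update, h0=0.0):
    """Return the hidden state after each observation.

    The gap between consecutive observation times can be arbitrary: it only
    changes how long the state is evolved before the next update.
    """
    h, t_prev = h0, times[0]
    states = []
    for t, x in zip(times, values):
        h = evolve(h, t_prev, t, dynamics)  # continuous-time evolution
        h = update(h, x)                    # discrete update at the observation
        states.append(h)
        t_prev = t
    return states


# Hypothetical components: exponential decay dynamics and a tanh update.
decay = lambda h: -h
cell = lambda h, x: math.tanh(0.5 * h + x)

# Irregularly spaced observation times.
states = ode_rnn([0.0, 0.5, 2.0], [1.0, -0.5, 0.3], decay, cell)
print(states)
```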
A Dynamic Sampling Adaptive-SGD Method for Machine Learning
Bahamou, Achraf, Goldfarb, Donald
We propose a stochastic optimization method for minimizing loss functions, which can be expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such directions, eliminating the need for the user to tune the learning rate. The proposed method exploits local curvature information and ensures that search directions are descent directions with high probability using an acute-angle test. The method is proved to have, under reasonable assumptions, a global linear rate of convergence on self-concordant functions with high probability. Numerical experiments show that this method is able to choose the best learning rates and compares favorably to fine-tuned SGD for training logistic regression and Deep Neural Networks (DNNs). We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably to fine-tuned ADAM for training DNNs.
IT Industry Outlook 2020
At the beginning of a new decade, these lines from the play Inherit the Wind seem as appropriate for the technology industry as they did for the debate over evolution taking place in the drama. The tech industry is faced with a tricky balancing act: continuing to drive innovative solutions while grappling with the side effects of those solutions in the global economy. The challenge itself is not unique--every industry deals with this tension as it becomes more mature--but the new variables here are the scale that tech is able to achieve and the evolutionary aspect of mixing digital and physical worlds. It's time for the industry to take the next step. There are tremendous benefits available through technology for both business and society, but there are major questions around safety, privacy, sustainability, and trust. The answers to these questions come from combining technical expertise with social awareness. By embracing responsibility for all the changes innovation can bring, the tech industry can be responsible for driving future progress. Learn how people are re-envisioning the functions, processes, and best practices for infrastructure, development, security, and data in their organizations. The IT channel is in flux. This report highlights where today's providers see opportunities and challenges, embrace new technologies, and react to new rivals. This research explores the relevance of technology to SMBs and the factors affecting perceptions, decisions, and investments in established and emerging technologies. Thanks to the vast influx of user-friendly technologies, it has become popular to say that every company is a tech company. But the ubiquity of technology does not necessarily change the underlying business model. While digital transformation is creating new avenues for growth, companies are finding that they cannot simply slap tech labels on their products and practices and automatically reap benefits.
On one end of the spectrum, this takes the shape of larger companies going public and struggling with the realities of their industry.
Neural Architecture Search on Acoustic Scene Classification
Li, Jixiang, Liang, Chuming, Zhang, Bo, Wang, Zhao, Xiang, Fei, Chu, Xiangxiang
Convolutional neural networks are widely adopted in Acoustic Scene Classification (ASC) tasks, but they generally carry a heavy computational burden. In this work, we propose a lightweight yet high-performing baseline network inspired by MobileNetV2, which replaces square convolutional kernels with unidirectional ones to extract features alternately in temporal and frequency dimensions. Furthermore, we explore a dynamic architecture space built on the basis of the proposed baseline with the recent Neural Architecture Search (NAS) paradigm, which first trains a supernet that incorporates all candidate networks and then applies a well-known evolutionary algorithm NSGA-II to discover more efficient networks with higher accuracy and lower computational cost. Experimental results demonstrate that our searched network is competent in ASC tasks, which achieves 90.3% F1-score on the DCASE2018 task 5 evaluation set, marking a new state-of-the-art performance while saving 25% of FLOPs compared to our baseline network.
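The search above trades off accuracy against computational cost, which is the kind of two-objective problem NSGA-II is built around. The sketch below shows only the core notion of Pareto dominance and the resulting non-dominated front; it is not a full NSGA-II implementation (no ranking into successive fronts, crowding distance, or evolutionary operators). The candidate names and numbers are hypothetical and are not results from the paper.

```python
def dominates(a, b):
    """a dominates b if it is no worse on both objectives (higher F1,
    lower FLOPs) and strictly better on at least one of them."""
    no_worse = a["f1"] >= b["f1"] and a["flops"] <= b["flops"]
    strictly_better = a["f1"] > b["f1"] or a["flops"] < b["flops"]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidate networks (illustrative values only).
candidates = [
    {"name": "A", "f1": 0.90, "flops": 100},
    {"name": "B", "f1": 0.88, "flops": 80},
    {"name": "C", "f1": 0.87, "flops": 90},   # dominated by B
    {"name": "D", "f1": 0.91, "flops": 150},
]

front = [c["name"] for c in pareto_front(candidates)]
print(front)  # prints: ['A', 'B', 'D']
```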
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Dowdell, Thomas, Zhang, Hongyu
The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by RNNs could be replaced by active-memory mechanisms. In this work, we evaluate whether various active-memory mechanisms could replace self-attention in a Transformer. Our experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling, but optimal results are mostly achieved by using both active-memory and self-attention mechanisms together. We also note that, for some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two.
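For readers unfamiliar with the mechanism being compared, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is simplified by assumption: queries, keys, and values are the input vectors themselves, whereas a Transformer applies learned projections and multiple heads. It does not cover the active-memory mechanisms studied in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Each output position is a weighted average of all input positions,
    with weights given by softmax of scaled dot products."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


# Hypothetical sequence of three 2-dimensional token vectors.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(sequence)
print(attended)
```

Every position attends to every other position in a single pass, which is what lets the model look at the entire sequence at once.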
On Quantified Modal Theorem Proving for Modeling Ethics
Govindarajulu, Naveen Sundar, Bringsjord, Selmer, Peveler, Matthew
Second International Workshop on Automated Reasoning: Challenges, Applications, Directions, Exemplary Achievements (ARCADE 2019) EPTCS 311, 2019, pp. In the last decade, formal logics have been used to model a wide range of ethical theories and principles with the goal of using these models within autonomous systems. Logics for modeling ethical theories, and their automated reasoners, have requirements that are different from modal logics used for other purposes, e.g. for temporal reasoning. In particular, a quantified modal logic, the deontic cognitive event calculus (DCEC), has been used to model various versions of the doctrine of double effect, akrasia, and virtue ethics. Using a fragment of DCEC, we outline these distinct characteristics and present a sketch of an algorithm that can help with some aspects of proof automation for DCEC. 1 Introduction Modal logics have been used for decades to model and study a diverse set of subjects -- e.g.