Oceania
futureofwork _2019-01-01_05-52-19.xlsx
The graph represents a network of 2,824 Twitter users whose tweets in the requested range contained "futureofwork ", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Tuesday, 01 January 2019 at 13:53 UTC. The requested start date was Tuesday, 01 January 2019 at 01:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 5,000. The tweets in the network were tweeted over the 1-day, 14-hour, 53-minute period from Sunday, 30 December 2018 at 10:06 UTC to Tuesday, 01 January 2019 at 01:00 UTC.
CSIRO, IBM and Zendesk Australia's top hirers for AI roles
Tech giants, universities, consultancy firms and a retailer are among Australia's top hirers for artificial intelligence related roles, according to analysis by job-site Indeed. The government's scientific research organisation CSIRO emerged as the top seeker of candidates for AI roles, measured by number of jobs requiring related skills posted to the website this year. In second place was IBM, followed by Zendesk, Culture Amp, Macquarie University, Google, Atlassian and BCG Digital Ventures. The rapid rise of the specialised field is leading to a 'skills mismatch', said Indeed's APAC economist Callam Pickering. "That is, the skills in demand from employers are not necessarily the skills possessed by job seekers. This often happens because the skills in demand are rapidly changing, reflecting the needs to adapt to new technologies," he told CIO Australia.
A Survey on Multi-output Learning
Xu, Donna, Shi, Yaxin, Tsang, Ivor W., Ong, Yew-Soon, Gong, Chen, Shen, Xiaobo
Multi-output learning aims to simultaneously predict multiple outputs given an input. It is an important learning problem due to the pressing need for sophisticated decision making in real-world applications. Inspired by big data, the 4Vs characteristics of multiple outputs impose a set of challenges on multi-output learning, in terms of the volume, velocity, variety and veracity of the outputs. An increasing number of works in the literature have been devoted to the study of multi-output learning and the development of novel approaches for addressing the challenges encountered. However, the literature lacks a comprehensive overview of the different types of challenges that the characteristics of the multiple outputs bring to multi-output learning, and of the techniques proposed to overcome them. This paper thus attempts to fill this gap by providing a comprehensive review of the area. We first introduce the different stages of the life cycle of the output labels. Then we present the paradigm of multi-output learning, including its myriad output structures, definitions of its different sub-problems, model evaluation metrics and popular data repositories used in the study. Subsequently, we review a number of state-of-the-art multi-output learning methods, which are categorized based on the challenges.
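As a small illustration of the evaluation metrics mentioned above, the following sketch computes two standard multi-label metrics, Hamming loss and subset accuracy, on a toy prediction matrix. The choice of these two metrics and the toy data are assumptions made here for illustration; the abstract does not say which metrics the survey covers.

```python
# Minimal sketch: two common metrics for multi-label (binary multi-output) prediction.
# The data below is made up for illustration only.

def hamming_loss(y_true, y_pred):
    """Fraction of individual output labels that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true, y_pred):
    """Fraction of instances whose entire output vector is predicted exactly."""
    exact = sum(
        list(true_row) == list(pred_row)
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return exact / len(y_true)


# Four instances, three binary outputs each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

print(hamming_loss(y_true, y_pred))     # 2 wrong labels out of 12
print(subset_accuracy(y_true, y_pred))  # 2 fully correct rows out of 4
```

Hamming loss scores each output independently, while subset accuracy only rewards a prediction when every output is right, so the two can diverge sharply as the number of outputs grows.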
AI In Academia: Much Potential, Much Resistance
Colleges and universities have started using virtual assistants, chatbots, and other intelligent software tools to augment or replace student interactions with advisers or counselors and to provide some institutional services. Yet many faculty members still resist supporting AI, especially when it comes to delivering course content. I enlisted my colleague Nicole Engelbert, Oracle vice president of higher educational development, to help me assess and position the role of AI in academia. What follows are excerpts from our recent discussion. Rajecki: When I started in higher ed, the concern was that online learning would replace classrooms.
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
You, Jiaxuan, Liu, Bowen, Ying, Zhitao, Pande, Vijay, Leskovec, Jure
Generating novel graph structures that optimize given objectives while obeying some given underlying rules is fundamental for chemistry, biology and social science research. This is especially important in the task of molecular graph generation, whose goal is to discover novel molecules with desired properties such as drug-likeness and synthetic accessibility, while obeying physical laws such as chemical valency. However, designing models that find molecules that optimize desired properties while incorporating highly complex and non-differentiable rules remains a challenging task. Here we propose Graph Convolutional Policy Network (GCPN), a general graph convolutional network based model for goal-directed graph generation through reinforcement learning. The model is trained to optimize domain-specific rewards and adversarial loss through policy gradient, and acts in an environment that incorporates domain-specific rules. Experimental results show that GCPN can achieve 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and achieve 184% improvement on the constrained property optimization task.
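To make the idea of an environment that enforces domain-specific rules concrete, here is a minimal sketch of a valency check that rejects a proposed bond when it would exceed an atom's usual maximum valence. The function names, the bond representation and the small valence table are hypothetical choices made for illustration; this is not the GCPN implementation, only the kind of rule such an environment might apply before accepting an action.

```python
# Minimal sketch of a rule-checking step for graph generation (hypothetical names).
# Typical maximum valences for a few neutral atoms.
MAX_VALENCE = {"H": 1, "O": 2, "N": 3, "C": 4}


def used_valence(atom_index, bonds):
    """Sum of bond orders already attached to the given atom."""
    return sum(order for (a, b), order in bonds.items() if atom_index in (a, b))


def can_add_bond(atoms, bonds, i, j, order):
    """Return True if a bond of the given order between atoms i and j is allowed."""
    key = (min(i, j), max(i, j))
    if i == j or key in bonds:
        return False  # no self-loops, no duplicate bonds
    return all(
        used_valence(k, bonds) + order <= MAX_VALENCE[atoms[k]] for k in (i, j)
    )


# Partial molecule: a carbon double-bonded to one oxygen, plus a free oxygen.
atoms = ["C", "O", "O"]
bonds = {(0, 1): 2}  # keys are (smaller index, larger index)

print(can_add_bond(atoms, bonds, 0, 2, 2))  # True: completes O=C=O
print(can_add_bond(atoms, bonds, 1, 2, 1))  # False: first oxygen is already saturated
```

Because a check like this is a hard, non-differentiable constraint, it fits naturally inside the environment of a reinforcement learning loop rather than inside a differentiable loss.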
Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks
Rotskoff, Grant, Vanden-Eijnden, Eric
The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a given high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function. We show that in the limit that the number of parameters $n$ is large, the landscape of the mean-squared error becomes convex and the representation error in the function scales as $O(n^{-1})$. In this limit, we prove a dynamical variant of the universal approximation theorem showing that the optimal representation can be attained by stochastic gradient descent, the algorithm ubiquitously used for parameter optimization in machine learning. In the asymptotic regime, we study the fluctuations around the optimal representation and show that they arise at a scale $O(n^{-1})$. These fluctuations in the landscape identify the natural scale for the noise in stochastic gradient descent. Our results apply to both single and multi-layer neural networks, as well as standard kernel methods like radial basis functions.
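One way to read the particle picture, assuming a mean-field parameterization in which the network output is an average of $n$ units $c_i \varphi(x, y_i)$: expanding the mean-squared error yields a single-particle term plus a pairwise interaction term. The symbols $c_i$, $y_i$, $\varphi$, $\mu$, $C_f$, $F$ and $K$ are notation chosen here for illustration and are not taken from the abstract.

```latex
\begin{align}
f_n(x) &= \frac{1}{n} \sum_{i=1}^{n} c_i \, \varphi(x, y_i), \\
\ell(f_n) &= \frac{1}{2} \int \bigl( f(x) - f_n(x) \bigr)^2 \, d\mu(x) \\
          &= C_f - \frac{1}{n} \sum_{i=1}^{n} c_i \, F(y_i)
             + \frac{1}{2 n^2} \sum_{i,j=1}^{n} c_i c_j \, K(y_i, y_j), \\
C_f &= \frac{1}{2} \int f(x)^2 \, d\mu(x), \qquad
F(y) = \int f(x) \, \varphi(x, y) \, d\mu(x), \qquad
K(y, y') = \int \varphi(x, y) \, \varphi(x, y') \, d\mu(x).
\end{align}
```

Under this reading, each pair $(c_i, y_i)$ acts like a particle driven by the target function through $F$ and coupled to every other particle through the kernel $K$, which is how a loss function can determine an interaction potential.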
Integrated accounts of behavioral and neuroimaging data using flexible recurrent neural network models
Dezfouli, Amir, Morris, Richard, Ramos, Fabio T., Dayan, Peter, Balleine, Bernard
Neuroscience studies of human decision-making abilities commonly involve subjects completing a decision-making task while BOLD signals are recorded using fMRI. Hypotheses are tested about which brain regions mediate the effect of past experience, such as rewards, on future actions. One standard approach to this is model-based fMRI data analysis, in which a model is fitted to the behavioral data, i.e., a subject's choices, and then the neural data are parsed to find brain regions whose BOLD signals are related to the model's internal signals. However, the internal mechanics of such purely behavioral models are not constrained by the neural data, and therefore might miss or mischaracterize aspects of the brain. To address this limitation, we introduce a new method using recurrent neural network models that are flexible enough to be jointly fitted to the behavioral and neural data. We trained a model so that its internal states were suitably related to neural activity during the task, while at the same time its output predicted the next action a subject would execute. We then used the fitted model to create a novel visualization of the relationship between the activity in brain regions at different times following a reward and the choices the subject subsequently made. Finally, we validated our method using a previously published dataset. We found that the model was able to recover the underlying neural substrates that were discovered by explicit model engineering in the previous work, and also derived new results regarding the temporal pattern of brain activity.
Modular Networks: Learning to Decompose Neural Computation
Kirsch, Louis, Kunze, Julius, Barber, David
Scaling model capacity has been vital in the success of deep learning. For a typical network, necessary compute resources and training time grow dramatically with model size. Conditional computation is a promising way to increase the number of parameters with a relatively small increase in resources. We propose a training algorithm that flexibly chooses neural modules based on the data to be processed. Both the decomposition and modules are learned end-to-end. In contrast to existing approaches, training does not rely on regularization to enforce diversity in module use. We apply modular networks both to image recognition and language modeling tasks, where we achieve superior performance compared to several baselines. Introspection reveals that modules specialize in interpretable contexts.
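The following sketch illustrates conditional computation in its simplest form: a controller scores a set of modules for each input and only the selected module is evaluated, so adding modules adds parameters without adding per-input compute. The modules, the linear controller and its fixed weights are hypothetical and chosen for illustration; the training algorithm the abstract describes, which learns both the decomposition and the modules, is not reproduced here.

```python
# Minimal sketch of a conditional-computation forward pass (hypothetical setup).

def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Three toy modules; in practice each would be a small neural network.
modules = [
    lambda x: [2 * v for v in x],   # module 0: scale
    lambda x: [-v for v in x],      # module 1: negate
    lambda x: [v + 1 for v in x],   # module 2: shift
]

# One score vector per module; fixed here, learned in a real system.
controller_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]


def modular_forward(x):
    """Pick the highest-scoring module for this input and run only that one."""
    scores = [dot(w, x) for w in controller_weights]
    chosen = max(range(len(modules)), key=scores.__getitem__)
    return chosen, modules[chosen](x)


print(modular_forward([3.0, 1.0]))    # module 0 is selected
print(modular_forward([-1.0, -2.0]))  # module 2 is selected
```

The hard argmax used here is not differentiable, which is one reason training such selections typically needs a dedicated algorithm rather than plain backpropagation.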
Dirichlet belief networks for topic structure learning
Zhao, He, Du, Lan, Buntine, Wray, Zhou, Mingyuan
Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to leverage the benefits of deep structures for learning word distributions of topics has not yet been rigorously studied. Here we propose a new multi-layer generative process on word distributions of topics, where each layer consists of a set of topics and each topic is drawn from a mixture of the topics of the layer above. As the topics in all layers can be directly interpreted by words, the proposed model is able to discover interpretable topic hierarchies. As a self-contained module, our model can be flexibly adapted to different kinds of topic models to improve their modelling accuracy and interpretability. Extensive experiments on text corpora demonstrate the advantages of the proposed model.
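The layered generative process can be sketched as follows, assuming each topic is a Dirichlet draw whose concentration vector is a weighted mixture of the topics in the layer above. The mixing weights, layer sizes and function names are hypothetical, and the priors the paper places on the weights are omitted; this is an illustration of the idea rather than the authors' model.

```python
# Minimal sketch of a two-layer generative process on topic-word distributions.
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_layer(parent_topics, mixing_weights, rng):
    """Draw one topic per weight vector; each topic's Dirichlet parameter is a
    weighted mixture of the parent topics (weights are assumed, not learned)."""
    vocab = len(parent_topics[0])
    layer = []
    for weights in mixing_weights:
        alpha = [
            sum(w * parent[v] for w, parent in zip(weights, parent_topics))
            for v in range(vocab)
        ]
        layer.append(sample_dirichlet(alpha, rng))
    return layer


rng = random.Random(0)
vocab_size = 5

# Top layer: two topics drawn from a flat Dirichlet prior over the vocabulary.
top_layer = [sample_dirichlet([1.0] * vocab_size, rng) for _ in range(2)]

# Bottom layer: three topics, each mixing the two top-layer topics differently.
mixing = [[8.0, 1.0], [1.0, 8.0], [4.0, 4.0]]
bottom_layer = sample_layer(top_layer, mixing, rng)

for topic in bottom_layer:
    print([round(p, 3) for p in topic])
```

Every topic in every layer is a distribution over the same vocabulary, which is what lets each layer be read directly as words.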
Efficient Loss-Based Decoding on Graphs for Extreme Classification
Evron, Itay, Moroshko, Edward, Crammer, Koby
In extreme classification problems, learning algorithms are required to map instances to labels from an extremely large label set. We build on a recent extreme classification framework with logarithmic time and space (LTLS), and on a general approach for error correcting output coding (ECOC) with loss-based decoding, and introduce a flexible and efficient approach accompanied by theoretical bounds. Our framework employs output codes induced by graphs, for which we show how to perform efficient loss-based decoding to potentially improve accuracy. In addition, our framework offers a tradeoff between accuracy, model size and prediction time. We show how to find the sweet spot of this tradeoff using only the training data. Our experimental study demonstrates the validity of our assumptions and claims, and shows that our method is competitive with state-of-the-art algorithms.
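To show what loss-based decoding adds over plain Hamming decoding in an ECOC setup, the sketch below decodes the real-valued margins of three binary predictors against a tiny hand-written code matrix. The code matrix, labels, margins and the exponential loss are assumptions chosen for illustration; the graph-induced codes and efficient decoding procedure from the abstract are not reproduced.

```python
# Minimal sketch: Hamming decoding vs. loss-based decoding for output codes.
import math

# Each label is assigned a codeword of +1/-1 bits, one bit per binary predictor.
CODE = {
    "cat":  [+1, +1, -1],
    "dog":  [+1, -1, +1],
    "bird": [-1, +1, +1],
}


def hamming_decode(margins):
    """Threshold the margins to signs, then pick the closest codeword."""
    signs = [1 if m >= 0 else -1 for m in margins]
    return min(
        CODE,
        key=lambda label: sum(s != c for s, c in zip(signs, CODE[label])),
    )


def loss_based_decode(margins, loss=lambda z: math.exp(-z)):
    """Pick the codeword with the smallest total loss on the signed margins."""
    return min(
        CODE,
        key=lambda label: sum(loss(c * m) for c, m in zip(CODE[label], margins)),
    )


# Predictors 0 and 1 are barely positive; predictor 2 is confidently positive.
margins = [0.1, 0.2, 3.0]

print(hamming_decode(margins))     # three-way tie, resolved by dictionary order
print(loss_based_decode(margins))  # uses confidence to break the tie
```

With these margins every codeword is at Hamming distance 1 from the thresholded prediction, so Hamming decoding cannot distinguish them, while loss-based decoding prefers the codeword that disagrees only on the least confident bit.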