Oceania
Deep learning method transforms shapes
Called LOGAN, the deep neural network, i.e., a machine of sorts, can learn to transform the shapes of two different objects, for example, a chair and a table, in a natural way, without seeing any paired transforms between the shapes. All the machine had seen was a bunch of tables and a bunch of chairs, and it could automatically translate shapes between the two unpaired domains. LOGAN can also automatically perform both content and style transfers between two different types of shapes without any changes to its network architecture. The team of researchers behind LOGAN, from Simon Fraser University, Shenzhen University, and Tel Aviv University, are set to present their work at ACM SIGGRAPH Asia held Nov. 17 to 20 in Brisbane, Australia. SIGGRAPH Asia, now in its 12th year, attracts the most respected technical and creative people from around the world in computer graphics, animation, interactivity, gaming, and emerging technologies. "Shape transform is one of the most fundamental and frequently encountered problems in computer graphics and geometric modeling," says senior coauthor of the work, Hao (Richard) Zhang, professor of computing science at Simon Fraser University.
How can AI Automate End-to-End Data Science?
Aggarwal, Charu, Bouneffouf, Djallel, Samulowitz, Horst, Buesser, Beat, Hoang, Thanh, Khurana, Udayan, Liu, Sijia, Pedapati, Tejaswini, Ram, Parikshit, Rawat, Ambrish, Wistuba, Martin, Gray, Alexander
Data science is labor-intensive and human experts are scarce but heavily involved in every aspect of it. This makes data science time consuming and restricted to experts with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed towards that goal and is emerging as an important research and business topic. We introduce and define the AutoDS challenge, followed by a proposal of a general AutoDS framework that covers existing approaches but also provides guidance for the development of new methods. We categorize and review the existing literature from multiple aspects of the problem setup and employed techniques. Then we provide several views on how AI could succeed in automating end-to-end AutoDS. We hope this survey can serve as insightful guideline for the AutoDS field and provide inspiration for future research.
Generalized Domain Adaptation with Covariate and Label Shift CO-ALignment
Tan, Shuhan, Peng, Xingchao, Saenko, Kate
Unsupervised knowledge transfer has a great potential to improve the generalizability of deep models to novel domains. Yet the current literature assumes that the label distribution is domain-invariant and only aligns the covariate or vice versa. In this paper, we explore the task of Generalized Domain Adaptation (GDA): How to transfer knowledge across different domains in the presence of both covariate and label shift? We propose a covariate and label distribution CO-ALignment (COAL) model to tackle this problem. Our model leverages prototype-based conditional alignment and label distribution estimation to diminish the covariate and label shifts, respectively. We demonstrate experimentally that when both types of shift exist in the data, COAL leads to state-of-the-art performance on several cross-domain benchmarks.
Stochastic Feedforward Neural Networks: Universal Approximation
Merkh, Thomas, Montúfar, Guido
In this chapter we take a look at the universal approximation question for stochastic feedforward neural networks. In contrast to deterministic networks, which represent mappings from a set of inputs to a set of outputs, stochastic networks represent mappings from a set of inputs to a set of probability distributions over the set of outputs. In particular, even if the sets of inputs and outputs are finite, the class of stochastic mappings in question is not finite. Moreover, while for a deterministic function the values of all output variables can be computed independently of each other given the values of the inputs, in the stochastic setting the values of the output variables may need to be correlated, which requires that their values are computed jointly. A prominent class of stochastic feedforward networks which has played a key role in the resurgence of deep learning are deep belief networks. The representational power of these networks has been studied mainly in the generative setting, as models of probability distributions without an input, or in the discriminative setting for the special case of deterministic mappings. We study the representational power of deep sigmoid belief networks in terms of compositions of linear transformations of probability distributions, Markov kernels, that can be expressed by the layers of the network. We investigate different types of shallow and deep architectures, and the minimal number of layers and units per layer that are sufficient and necessary in order for the network to be able to approximate any given stochastic mapping from the set of inputs to the set of outputs arbitrarily well.
Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis
The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individual might benefit from an intervention. The majority of this literature has concentrated on the estimation of the effect of treatment on binary outcomes. However, many medical interventions are evaluated in terms of their effect on future events, which are subject to loss to follow-up. In this study, we describe a framework for the estimation of heterogeneous treatment effect in terms of differences in time-to-event (survival) probabilities. We divide the problem into three phases: (1) estimation of treatment effect conditioned on unique sets of the covariate vector; (2) identification of features important for heterogeneity using an ensemble of non-parametric variable importance methods; and (3) estimation of treatment effect on the reference classes defined by the previously selected features, using one-step Targeted Maximum Likelihood Estimation. We conducted a series of simulation studies and found that this method performs well when either sample size or event rate is high enough and the number of covariates contributing to the effect heterogeneity is moderate. An application of this method to a clinical case study was conducted by estimating the effect of oral anticoagulants on newly diagnosed non-valvular atrial fibrillation patients using data from the UK Clinical Practice Research Datalink.
Suicidal Ideation Detection: A Review of Machine Learning Methods and Applications
Ji, Shaoxiong, Pan, Shirui, Li, Xue, Cambria, Erik, Long, Guodong, Huang, Zi
Suicide is a critical issue in the modern society. Early detection and prevention of suicide attempt should be addressed to save people's life. Current suicidal ideation detection methods include clinical methods based on the interaction between social workers or experts and the targeted individuals, and machine learning techniques with feature engineering or deep learning for automatic detection based on online social contents. This is the first survey that comprehensively introduces and discusses the methods from these categories. Domain-specific applications of suicidal ideation detection are also reviewed according to their data sources, i.e., questionnaires, electronic health records, suicide notes, and online user content. To facilitate further research, several specific tasks and datasets are introduced. Finally, we summarize the limitations of current work and provide an outlook of further research directions.
Learning Humanoid Robot Running Skills through Proximal Policy Optimization
Melo, Luckeciano C., Maximo, Marcos R. O. A.
In the current level of evolution of Soccer 3D, motion control is a key factor in team's performance. Recent works takes advantages of model-free approaches based on Machine Learning to exploit robot dynamics in order to obtain faster locomotion skills, achieving running policies and, therefore, opening a new research direction in the Soccer 3D environment. In this work, we present a methodology based on Deep Reinforcement Learning that learns running skills without any prior knowledge, using a neural network whose inputs are related to robot's dynamics. Our results outperformed the previous state-of-the-art sprint velocity reported in Soccer 3D literature by a significant margin. It also demonstrated improvement in sample efficiency, being able to learn how to run in just few hours. We reported our results analyzing the training procedure and also evaluating the policies in terms of speed, reliability and human similarity. Finally, we presented key factors that lead us to improve previous results and shared some ideas for future work.
Robot-Friendly Cities
School of Information Technology, Deakin University, Geelong, Australia Robots are increasingly tested in public spaces, towards a f uture where urban environments are not only for humans but for autonomous syst ems. While robots are promising, for convenience and efficiency, there are challenges associated with building cities crowded with machines. This p aper provides an overview of the problems and some solutions, and calls for gr eater attention on this matter . Urban environments will increasingly be spaces for autonom ous systems, of which automated vehicles is only one popular type. Robot wheelchairs could be used in public as well other robot -transporters to help the elderly.
Artificial Intelligence and the Future of Psychiatry: Qualitative Findings from a Global Physician Survey
Blease, Charlotte, Locher, Cosima, Leon-Carlyle, Marisa, Doraiswamy, P. Murali
The potential for machine learning to disrupt the medical profession is the subject of ongoing debate within biomedical informatics. This study aimed to explore psychiatrists' opinions about the potential impact of innovations in artificial intelligence and machine learning on psychiatric practice. In Spring 2019, we conducted a web-based survey of 791 psychiatrists from 22 countries worldwide. The survey measured opinions about the likelihood future technology would fully replace physicians in performing ten key psychiatric tasks. This study involved qualitative descriptive analysis of written response to three open-ended questions in the survey. Comments were classified into four major categories in relation to the impact of future technology on patient-psychiatric interactions, the quality of patient medical care, the profession of psychiatry, and health systems. Overwhelmingly, psychiatrists were skeptical that technology could fully replace human empathy. Many predicted that 'man and machine' would increasingly collaborate in undertaking clinical decisions, with mixed opinions about the benefits and harms of such an arrangement. Participants were optimistic that technology might improve efficiencies and access to care, and reduce costs. Ethical and regulatory considerations received limited attention. This study presents timely information of psychiatrists' view about the scope of artificial intelligence and machine learning on psychiatric practice. Psychiatrists expressed divergent views about the value and impact of future technology with worrying omissions about practice guidelines, and ethical and regulatory issues.
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Lan, Zhenzhong, Chen, Mingda, Goodman, Sebastian, Gimpel, Kevin, Sharma, Piyush, Soricut, Radu
A BSTRACT Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT (Devlin et al., 2019). Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT -large. The code and the pretrained models are available at https://github.com/ Many nontrivial NLP tasks, including those that have limited training data, have greatly benefited from these pre-trained models. One of the most compelling signs of these breakthroughs is the evolution of machine performance on a reading comprehension task designed for middle and highschool English exams in China, the RACE test (Lai et al., 2017): the paper that originally describes the task and formulates the modeling challenge reports then state-of-the-art machine accuracy at 44. 1%; the latest published result reports their model performance at 83. 2% (Liu et al., 2019); the work we present here pushes it even higher to 89 .4%, a stunning 45 .3% Evidence from these improvements reveals that a large network is of crucial importance for achieving state-of-the-art performance (Devlin et al., 2019; Radford et al., 2019). It has become common practice to pre-train large models and distill them down to smaller ones (Sun et al., 2019; Turc et al., 2019) for real applications.