Discovery and Separation of Features for Invariant Representation Learning
Jaiswal, Ayush, Brekelmans, Rob, Moyer, Daniel, Steeg, Greg Ver, AbdAlmageed, Wael, Natarajan, Premkumar
Supervised machine learning models often associate irrelevant nuisance factors with the prediction target, which hurts generalization. We propose a framework for training robust neural networks that induces invariance to nuisances through learning to discover and separate predictive and nuisance factors of data. We present an information-theoretic formulation of our approach, from which we derive training objectives and connections with previous methods. Empirical results on a wide array of datasets show that the proposed framework achieves state-of-the-art performance, without requiring nuisance annotations during training.
ExperienceThinking: Hyperparameter Optimization with Budget Constraints
Wang, Chunnan, Wang, Hongzhi, Zhou, Chang, Chen, Hanxiao, Li, Jianzhong, Gao, Hong
The problem of hyperparameter optimization arises widely in real life, and many common tasks can be transformed into it, such as neural architecture search and feature subset selection. Without considering various constraints, existing hyperparameter tuning techniques can solve these problems effectively by traversing as many hyperparameter configurations as possible. However, because of limited resources and budget, it is not feasible to evaluate so many configurations, which requires us to design effective algorithms that find the best possible hyperparameter configuration with a finite number of configuration evaluations. In this paper, we simulate human thinking processes and combine the merits of the existing techniques, and thus propose a new algorithm called ExperienceThinking to solve this constrained hyperparameter optimization problem. In addition, we analyze the performance of 3 classical hyperparameter optimization algorithms with a finite number of configuration evaluations, and compare it with that of ExperienceThinking. The experimental results show that our proposed algorithm provides superior results and has better performance.
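The budget constraint described above can be illustrated with a minimal sketch. This is plain random search under a fixed evaluation budget, not the ExperienceThinking algorithm itself (the abstract does not specify its steps); the function name, the toy objective, and the search space are all assumptions made for illustration.

```python
import random


def budgeted_random_search(objective, space, budget, seed=0):
    """Evaluate at most `budget` random configurations; return the best one found.

    `space` maps each hyperparameter name to a list of candidate values.
    Baseline sketch only: it is NOT the ExperienceThinking algorithm.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    evaluations = 0
    while evaluations < budget:
        # Sample one configuration uniformly at random from the grid.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        evaluations += 1
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score, evaluations


def toy_objective(config):
    # Hypothetical stand-in for a validation score; maximal at lr=0.1, depth=3.
    return -((config["lr"] - 0.1) ** 2) - (config["depth"] - 3) ** 2


toy_space = {"lr": [0.001, 0.01, 0.1, 1.0], "depth": [1, 2, 3, 4, 5]}

if __name__ == "__main__":
    print(budgeted_random_search(toy_objective, toy_space, budget=10))
```

Because the sampler is seeded, a run with a larger budget sees the same first configurations as a smaller one, so its best score can only stay equal or improve. That makes the effect of the budget easy to inspect.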
Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels
Song, Shuang, Berthelot, David, Rostamizadeh, Afshin
We propose using active-learning-based techniques to further improve the state-of-the-art semi-supervised learning MixMatch algorithm. We provide a thorough empirical evaluation of several active-learning and baseline methods, which successfully demonstrates a significant improvement on the benchmark CIFAR-10, CIFAR-100, and SVHN datasets (as much as 1.5% in absolute accuracy). We also provide an empirical analysis of the cost trade-off between incrementally gathering more labeled versus unlabeled data. This analysis can be used to measure the relative value of labeled/unlabeled data at different points of the learning curve, where we find that although the incremental value of labeled data can be as much as 20x that of unlabeled, it quickly diminishes to less than 3x once more than 2,000 labeled examples are observed. Code can be found at https://github.com/google-research/mma.
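As a rough illustration of the active-learning step, the sketch below ranks unlabeled examples by predictive entropy and picks the most uncertain ones for labeling. Entropy-based selection is one common acquisition rule and is assumed here for illustration; the abstract does not say which rules the authors evaluate. The function names are hypothetical and are not taken from the linked repository.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(predictions, k):
    """Return indices of the k unlabeled examples with the highest entropy.

    `predictions` is a list of per-example class-probability lists, e.g. the
    softmax outputs of a semi-supervised model on the unlabeled pool.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: predictive_entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    pool = [
        [0.98, 0.01, 0.01],  # confident prediction
        [0.34, 0.33, 0.33],  # near-uniform, most uncertain
        [0.60, 0.30, 0.10],
        [0.50, 0.50, 0.00],
    ]
    print(select_for_labeling(pool, k=2))
```

In a full loop, the selected examples would be labeled, moved to the labeled set, and the semi-supervised model retrained before the next selection round.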
Measuring the intelligence of an idealized mechanical knowing agent
We define a notion of the intelligence level of an idealized mechanical knowing agent. This is motivated by efforts within artificial intelligence research to define real-number intelligence levels of complicated intelligent systems. Our agents are more idealized, which allows us to define a much simpler measure of intelligence level for them. In short, we define the intelligence level of a mechanical knowing agent to be the supremum of the computable ordinals that have codes the agent knows to be codes of computable ordinals. We prove that if one agent knows certain things about another agent, then the former necessarily has a higher intelligence level than the latter. This allows our intelligence notion to serve as a stepping stone to obtain results which, by themselves, are not stated in terms of our intelligence notion (results of potential interest even to readers totally skeptical that our notion correctly captures intelligence). As an application, we argue that these results comprise evidence against the possibility of intelligence explosion (that is, the notion that sufficiently intelligent machines will eventually be capable of designing even more intelligent machines, which can then design even more intelligent machines, and so on).
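The definition stated in words above can be written compactly as follows. The notation is an assumption made for illustration and may differ from the paper's: $|A|$ denotes the intelligence level of agent $A$, and "$n$ codes $\alpha$" means that the natural number $n$ is a code of the computable ordinal $\alpha$.

```latex
% Notation assumed for illustration; the paper may use different symbols.
\[
  |A| \;=\; \sup \bigl\{\, \alpha \;:\; \alpha \text{ is a computable ordinal and }
  \exists n \,\bigl( n \text{ codes } \alpha \ \wedge\
  A \text{ knows that } n \text{ is a code of a computable ordinal} \bigr) \,\bigr\}
\]
```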
Computação Urbana da Teoria à Prática: Fundamentos, Aplicações e Desafios
Rodrigues, Diego O., Santos, Frances A., Filho, Geraldo P. Rocha, Akabane, Ademar T., Cabral, Raquel, Immich, Roger, Junior, Wellington L., Cunha, Felipe D., Guidoni, Daniel L., Silva, Thiago H., Rosário, Denis, Cerqueira, Eduardo, Loureiro, Antonio A. F., Villas, Leandro A.
The growth of cities has resulted in innumerable technical and managerial challenges for public administrators, such as energy consumption, pollution, urban mobility, and even the appropriate supervision of private and public spaces. Urban Computing emerges as a promising paradigm to solve such challenges through the extraction of knowledge from the large amount of heterogeneous data existing in urban space. Moreover, Urban Computing correlates urban sensing, data management, and analysis to provide services that have the potential to improve the quality of life of the citizens of large urban centers. Considering this context, this chapter aims to present the fundamentals of Urban Computing and the steps necessary to develop an application in this area. To achieve this goal, the following questions will be investigated: (i) What are the main research problems of Urban Computing?; (ii) What are the technological challenges for the implementation of services in Urban Computing?; (iii) What are the main methodologies used for the development of services in Urban Computing?; and (iv) What are the representative applications in this field?
Learning Agent Communication under Limited Bandwidth by Message Pruning
Mao, Hangyu, Zhang, Zhengchao, Xiao, Zhen, Gong, Zhibo, Ni, Yan
Communication is a crucial factor for the big multi-agent world to stay organized and productive. Recently, Deep Reinforcement Learning (DRL) has been applied to learn the communication strategy and the control policy for multiple agents. However, the practical limited bandwidth in multi-agent communication has been largely ignored by the existing DRL methods. Specifically, many methods keep sending messages incessantly, which consumes too much bandwidth. As a result, they are inapplicable to multi-agent systems with limited bandwidth. To handle this problem, we propose a gating mechanism to adaptively prune less beneficial messages. We evaluate the gating mechanism on several tasks. Experiments demonstrate that it can prune a lot of messages with little impact on performance. In fact, the performance may be greatly improved by pruning redundant messages. Moreover, the proposed gating mechanism is applicable to several previous methods, equipping them with the ability to address bandwidth-restricted settings.
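The idea of gating messages can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' architecture: each agent is assumed to produce a scalar gate logit (in practice this would come from a learned network), and a message is sent only when the sigmoid of that logit exceeds a threshold. The function names and the threshold value are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_messages(messages, gate_logits, threshold=0.5):
    """Keep message i only if its gate probability exceeds `threshold`.

    Pruned messages are replaced by None, meaning nothing is transmitted.
    The gate logits are assumed to come from a learned gating network.
    """
    kept = []
    for message, logit in zip(messages, gate_logits):
        kept.append(message if sigmoid(logit) > threshold else None)
    return kept


def bandwidth_used(kept):
    """Fraction of agents that actually transmitted a message."""
    return sum(m is not None for m in kept) / len(kept)


if __name__ == "__main__":
    msgs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
    logits = [2.0, -1.0, 0.0, -3.0]
    kept = gate_messages(msgs, logits)
    print(kept, bandwidth_used(kept))
```

Raising the threshold trades communication for bandwidth: fewer messages pass the gate, so downstream agents must act on less shared information.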
Patchy Image Structure Classification Using Multi-Orientation Region Transform
Yu, Xiaohan, Zhao, Yang, Gao, Yongsheng, Xiong, Shengwu, Yuan, Xiaohui
Exterior contour and interior structure are both vital features for classifying objects. However, most existing methods consider the exterior contour feature and the internal structure feature separately, and thus fail when classifying patchy image structures that have similar contours and flexible structures. To address the above limitations, this paper proposes a novel Multi-Orientation Region Transform (MORT), which can effectively characterize both contour and structure features simultaneously, for patchy image structure classification. MORT is performed over multiple orientation regions at multiple scales to effectively integrate patchy features, and thus enables a better description of the shape in a coarse-to-fine manner. Moreover, the proposed MORT can be extended to combine with deep convolutional neural network techniques for further enhancement of classification accuracy. Very encouraging experimental results on the challenging ultra-fine-grained cultivar recognition task, insect wing recognition task, and large variation butterfly recognition task are obtained, which demonstrate the effectiveness and superiority of the proposed MORT over the state-of-the-art methods in classifying patchy image structures. Our code and three patchy image structure datasets are available at: https://github.com/XiaohanY
Influence Maximization for Social Good: Use of Social Networks in Low Resource Communities
This thesis proposal makes the following technical contributions: (i) we provide a definition of the Dynamic Influence Maximization Under Uncertainty (or DIME) problem, which models the problem faced by homeless shelters accurately; (ii) we propose a novel Partially Observable Markov Decision Process (POMDP) model for solving the DIME problem; (iii) we design two scalable POMDP algorithms (PSINET and HEALER) for solving the DIME problem, since conventional POMDP solvers fail to scale up to sizes of interest; and (iv) we test our algorithms' effectiveness in the real world by conducting a pilot study with actual homeless youth in Los Angeles. The success of this pilot (as explained later) shows the promise of using influence maximization for social good on a larger scale.
Artificial Intelligence for Low-Resource Communities: Influence Maximization in an Uncertain World
The potential of Artificial Intelligence (AI) to tackle challenging problems that afflict society is enormous, particularly in the areas of healthcare, conservation, and public safety and security. Many problems in these domains involve harnessing social networks of under-served communities to enable positive change, e.g., using social networks of homeless youth to raise awareness about Human Immunodeficiency Virus (HIV) and other STDs. Unfortunately, most of these real-world problems are characterized by uncertainties about social network structure and influence models, and previous research in AI fails to sufficiently address these uncertainties. This thesis addresses these shortcomings by advancing the state-of-the-art to a new generation of algorithms for interventions in social networks. In particular, this thesis describes the design and development of new influence maximization algorithms which can handle various uncertainties that commonly exist in real-world social networks. These algorithms utilize techniques from sequential planning problems and social network theory to develop new kinds of AI algorithms. Further, this thesis also demonstrates the real-world impact of these algorithms by describing their deployment in three pilot studies to spread awareness about HIV among actual homeless youth in Los Angeles. This represents one of the first-ever deployments of computer-science-based influence maximization algorithms in this domain. Our results show that our AI algorithms improved upon the state-of-the-art by 160% in the real world. We discuss research and implementation challenges faced in deploying these algorithms, and lessons that can be gleaned for future deployment of such algorithms. The positive results from these deployments illustrate the enormous potential of AI in addressing societally relevant problems.
Human-Robot Collaboration via Deep Reinforcement Learning of Real-World Interactions
Tjomsland, Jonas, Shafti, Ali, Faisal, A. Aldo
Affiliations: Jonas Tjomsland (1), Ali Shafti (1,2,3), A. Aldo Faisal (1,2,3,4). (1) Dept. of Bioengineering, (2) Dept. of Computing, (3) Data Science Institute, (4) UKRI CDT for AI in Healthcare, Imperial College London. jt732@cam.ac.uk, a.shafti@imperial.ac.uk, aldo.faisal@imperial.ac.uk
Abstract: We present a robotic setup for real-world testing and evaluation of human-robot and human-human collaborative learning. Leveraging the sample-efficiency of the Soft Actor-Critic algorithm, we have implemented a robotic platform able to learn a nontrivial collaborative task with a human partner, without pre-training in simulation, and using only 30 minutes of real-world interactions. This enables us to study Human-Robot and Human-Human collaborative learning through real-world interactions. We present preliminary results, showing that state-of-the-art deep learning methods can take human-robot collaborative learning a step closer to that of humans interacting with each other.
1 Introduction: Artificially intelligent agents are displaying impressive behaviour in diverse individual tasks, such as skin cancer classification [1] and complex board games [2]. Similarly, multi-agent environments, where a degree of teamwork is required, are being explored [3].