Overview
Nvidia GeForce RTX 3080 Founders Edition review: Staggeringly powerful
Nvidia's GeForce RTX 3080 graphics card symbolizes why we tell people to wait for the second generation when bleeding-edge technology appears. The radical new-look Turing GPUs inside Nvidia's GeForce RTX 20-series packed all sorts of cutting-edge technologies designed to usher in real-time ray tracing, a long sought-after goal for the gaming industry. Not only did Turing introduce specialized RT cores devoted to processing ray tracing tasks, it also debuted tensor cores, dedicated hardware that uses machine learning to help denoise ray traced visuals and enable AI-enhanced tools like the fantastic Deep Learning Super Sampling (DLSS) technology. Turing's improvements also extended to the traditional shader cores, introducing an overhauled processing pipeline better equipped to handle games built using the newer DirectX 12 and Vulkan graphics APIs. All of these were huge departures from the norm.
Transfer Learning in Deep Reinforcement Learning: A Survey
Zhu, Zhuangdi, Lin, Kaixiang, Zhou, Jiayu
This paper surveys the field of transfer learning in the problem setting of Reinforcement Learning (RL). RL has been the key solution to sequential decision-making problems. Along with the fast advance of RL in various domains, including robotics and game-playing, transfer learning arises as an important technique to assist RL by leveraging and transferring external expertise to boost the learning process. In this survey, we review the central issues of transfer learning in the RL domain, providing a systematic categorization of its state-of-the-art techniques. We analyze their goals, methodologies, applications, and the RL frameworks under which these transfer learning techniques would be approachable. We discuss the relationship between transfer learning and other relevant topics from an RL perspective and also explore the potential challenges as well as future development directions for transfer learning in RL.
Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback
Letard, Alexandre, Amghar, Tassadit, Camp, Olivier, Gutowski, Nicolas
Recent works on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) show good results on a global accuracy metric. This can be achieved, in the case of recommender systems, with personalization. However, with a combinatorial online learning approach, personalization implies a large amount of user feedback. Such feedback can be hard to acquire when users need to be directly and frequently solicited. For a number of fields of activities undergoing the digitization of their business, online learning is unavoidable. Thus, a number of approaches allowing implicit user feedback retrieval have been implemented. Nevertheless, this implicit feedback can be misleading or inefficient for the agent's learning. Herein, we propose a novel approach reducing the amount of explicit feedback required by Combinatorial Multi-Armed Bandit (COM-MAB) algorithms while providing similar levels of global accuracy and learning efficiency to classical competitive methods. In this paper we present a novel approach for considering user feedback and evaluate it using three distinct strategies. Despite a limited amount of feedback returned by users (as low as 20% of the total), our approach obtains similar results to those of state-of-the-art approaches.
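To make the scarce-feedback setting concrete, here is a minimal sketch of a bandit that only learns on the rounds where a user actually returns feedback. It is not the authors' method: it uses a plain single-arm epsilon-greedy policy rather than a combinatorial one, and the function name `run_partial_feedback_bandit`, the parameter `feedback_prob`, and the simulated click rates are all hypothetical choices made for illustration.

```python
import random


def run_partial_feedback_bandit(true_rates, rounds, feedback_prob, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit that updates only when feedback is returned.

    true_rates    -- hypothetical per-arm probability of a positive reward
    feedback_prob -- assumed probability that the user answers in a given round
    Returns (counts, means): observed feedback count and mean reward per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms   # number of feedback signals observed per arm
    means = [0.0] * n_arms  # running mean of observed rewards per arm

    for _ in range(rounds):
        # Explore with probability epsilon, otherwise pick the best-looking arm.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: means[a])

        # The environment always produces a reward ...
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0

        # ... but the agent only sees it when the user returns feedback.
        if rng.random() < feedback_prob:
            counts[arm] += 1
            means[arm] += (reward - means[arm]) / counts[arm]

    return counts, means


# Simulate a user who answers in roughly 20% of the rounds.
counts, means = run_partial_feedback_bandit([0.2, 0.5, 0.8], rounds=2000, feedback_prob=0.2)
```

Comparing runs with `feedback_prob=0.2` and `feedback_prob=1.0` gives a rough feel for how much learning signal is lost when most rounds go unanswered, which is the gap the abstract claims to close.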
Efficient Transformers: A Survey
Tay, Yi, Dehghani, Mostafa, Bahri, Dara, Metzler, Donald
Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing for example, Transformers have become an indispensable staple in the modern deep learning stack. Recently, a dizzying number of "X-former" models have been proposed - Reformer, Linformer, Performer, Longformer, to name a few - which improve upon the original Transformer architecture, many of which make improvements around computational and memory efficiency. With the aim of helping the avid researcher navigate this flurry, this paper characterizes a large and thoughtful selection of recent efficiency-flavored "X-former" models, providing an organized and comprehensive overview of existing work and models across multiple domains.
A Survey of Knowledge-based Sequential Decision Making under Uncertainty
Zhang, Shiqi, Sridharan, Mohan
Reasoning with declarative knowledge (RDK) and sequential decision-making (SDM) are two key research areas in artificial intelligence. RDK methods reason with declarative domain knowledge, including commonsense knowledge, that is either provided a priori or acquired over time, while SDM methods (probabilistic planning and reinforcement learning) seek to compute action policies that maximize the expected cumulative utility over a time horizon; both classes of methods reason in the presence of uncertainty. Despite the rich literature in these two areas, researchers have not fully explored their complementary strengths. In this paper, we survey algorithms that leverage RDK methods while making sequential decisions under uncertainty. We discuss significant developments, open problems, and directions for future work.
Kaggle forecasting competitions: An overlooked learning opportunity
Bojer, Casper Solheim, Meldgaard, Jens Peder
Competitions play an invaluable role in the field of forecasting, as exemplified through the recent M4 competition. The competition received attention from both academics and practitioners and sparked discussions around the representativeness of the data for business forecasting. Several competitions featuring real-life business forecasting tasks on the Kaggle platform have, however, been largely ignored by the academic community. We believe the learnings from these competitions have much to offer to the forecasting community and provide a review of the results from six Kaggle competitions. We find that most of the Kaggle datasets are characterized by higher intermittence and entropy than the M-competitions and that global ensemble models tend to outperform local single models. Furthermore, we observe strong performance from gradient boosted decision trees, increasing success of neural networks for forecasting, and a variety of techniques for adapting machine learning models to the forecasting task.
Reformer, Longformer, and ELECTRA: Key Updates To Transformer Architecture In 2020
The leading pre-trained language models demonstrate remarkable performance on different NLP tasks, making them a much-welcomed tool for a number of applications, including sentiment analysis, chatbots, text summarization, and so on. However, good performance usually comes at the cost of enormous computational resources that are not accessible by most researchers and business practitioners. To address this issue, different research groups are working on increasing the compute-efficiency and parameter-efficiency of the pre-trained language models without sacrificing their accuracy. Among the novel approaches introduced this year, at least three methods are appraised by the AI community as very promising. To help you stay aware of the latest NLP research advancements, we have summarized the corresponding research papers in an easy-to-read bullet-point format.
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
The interest in Artificial Intelligence (AI) and its applications has seen unprecedented growth in the last few years. This success can be partly attributed to the advancements made in the sub-fields of AI such as Machine Learning (ML), Computer Vision (CV), and Natural Language Processing (NLP). Most of the growth in these fields has been made possible by deep learning, a sub-area of machine learning, which uses the principles of artificial neural networks. This has created significant interest in the integration of vision and language. The tasks are designed such that they perfectly embrace the ideas of deep learning. In this survey, we focus on ten prominent tasks that integrate language and vision by discussing their problem formulations, methods, existing datasets, and evaluation measures, and by comparing the results obtained with corresponding state-of-the-art methods. Our efforts go beyond earlier surveys which are either task-specific or concentrate only on one type of visual content, i.e., image or video. Furthermore, we also provide some potential future directions in this field of research with an anticipation that this survey brings in innovative thoughts and ideas to address the existing challenges and build new applications.
A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes
Ma, Yuxin, Fan, Arlen, He, Jingrui, Nelakurthi, Arun Reddy, Maciejewski, Ross
Many statistical learning models hold an assumption that the training data and the future unlabeled data are drawn from the same distribution. However, this assumption is difficult to fulfill in real-world scenarios and creates barriers in reusing existing labels from similar application domains. Transfer Learning is intended to relax this assumption by modeling relationships between domains, and is often applied in deep learning applications to reduce the demand for labeled data and training time. Despite recent advances in exploring deep learning models with visual analytics tools, little work has explored the issue of explaining and diagnosing the knowledge transfer process between deep learning models. In this paper, we present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks. Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks. Based on a comprehensive requirement and task analysis, we employ descriptive visualization with performance measures and detailed inspections of model behaviors from the statistical, instance, feature, and model structure levels. We demonstrate our framework through two case studies on image classification by fine-tuning AlexNets to illustrate how analysts can utilize our framework.
Meta-Learning for Anomaly Classification with Set Equivariant Networks: Application in the Milky Way
Oladosu, Ademola, Xu, Tony, Ekfeldt, Philip, Kelly, Brian A., Cranmer, Miles, Ho, Shirley, Price-Whelan, Adrian M., Contardo, Gabriella
We present a new meta-learning approach for supervised anomaly classification / one-class classification using set equivariant networks. We focus our experiments on an astronomy application. Our problem setting is composed of a set of classification tasks. Each task has a (small) set of positive, labeled examples and a larger set of unlabeled examples. We expect the positive instances to be much more uncommon (i.e. 'anomalies') than the negative ones ('normal' class). We propose a novel use of equivariant networks for this setting. Specifically, we use Deep Sets, which was developed for point-clouds and unordered sets and is equivariant to permutation. We propose to consider the set of positive examples of a given task as a 'point-cloud'. The key idea is that the network directly takes as input the set of positive examples in addition to the current example to classify. This allows the model to predict at test-time on new tasks using only positive labeled examples (i.e. 'One-Class classification' setting) by design, potentially without retraining. However, the model is trained in a meta-learning regime on a dataset of several tasks with full-supervision (positive and negative labels). This setup is motivated by our target application on stellar streams. Streams are groups of stars sharing specific properties in various features. For a detected stream, we can determine a set of stars that likely belong to the stream. We aim to characterize the membership of all other nearby stars. We build a meta-dataset of simulated streams injected onto real data and evaluate on unseen synthetic streams and one known stream. Our experiments show encouraging results that motivate further exploration of equivariant networks for anomaly or 'one-class' classification in a meta-learning regime.
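The key idea in the abstract above, a network that takes the set of positive examples together with the example to classify, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' architecture: the class name `SetConditionedScorer`, the single-layer `phi` and `rho` maps, the use of mean pooling (the original Deep Sets formulation is usually written with sum pooling), and the random untrained weights are all hypothetical.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Random weights (list of rows) and zero bias for a toy dense layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a dense layer to a feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class SetConditionedScorer:
    """Scores a query example given an unordered set of positive examples."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.phi = make_linear(n_features, n_hidden, rng)  # per-element encoder
        self.rho = make_linear(2 * n_hidden, 1, rng)       # readout on [set summary, query]

    def embed(self, x):
        return relu(linear(self.phi, x))

    def pool(self, positives):
        # Mean over per-element embeddings: the summary ignores element order.
        embedded = [self.embed(p) for p in positives]
        return [sum(col) / len(embedded) for col in zip(*embedded)]

    def score(self, positives, query):
        # Concatenate the set summary with the query embedding, then squash to (0, 1).
        joint = self.pool(positives) + self.embed(query)
        logit = linear(self.rho, joint)[0]
        return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical task: three known positives and one candidate to classify.
scorer = SetConditionedScorer(n_features=2)
positives = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]
membership = scorer.score(positives, [0.2, 0.1])
```

Because the positives enter only through the pooled summary, a new task can be handled at test time by passing in its own positive set, without retraining the weights. In a real setting `phi` and `rho` would be deeper networks trained across many tasks, as the abstract describes.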