Instructional Material
List Online Classification
Moran, Shay, Sharon, Ohad, Tsubari, Iska, Yosebashvili, Sivan
We study multiclass online prediction where the learner can predict using a list of multiple labels (as opposed to just one label in the traditional setting). We characterize learnability in this model using the $b$-ary Littlestone dimension. This dimension is a variation of the classical Littlestone dimension with the difference that binary mistake trees are replaced with $(k+1)$-ary mistake trees, where $k$ is the number of labels in the list. In the agnostic setting, we explore different scenarios depending on whether the comparator class consists of single-labeled or multi-labeled functions and its tradeoff with the size of the lists the algorithm uses. We find that it is possible to achieve negative regret in some cases and provide a complete characterization of when this is possible. As part of our work, we adapt classical algorithms such as Littlestone's SOA and Rosenblatt's Perceptron to predict using lists of labels. We also establish combinatorial results for list-learnable classes, including a list online version of the Sauer-Shelah-Perles Lemma. We state our results within the framework of pattern classes -- a generalization of hypothesis classes which can represent adaptive hypotheses (i.e. functions with memory), and model data-dependent assumptions such as linear classification with margin.
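To make the list-prediction protocol concrete, here is a minimal sketch in Python. In each round the learner outputs a list of at most k labels and errs only if the true label is missing from that list. The frequency-based learner, the function names, and the toy stream are illustrative assumptions; this is not the list SOA or the list Perceptron from the paper.

```python
from collections import Counter


def list_mistakes(predictions, labels):
    """Count rounds in which the true label is absent from the predicted list."""
    return sum(1 for pred, y in zip(predictions, labels) if y not in pred)


def top_k_frequency_learner(labels, k):
    """Toy learner (an assumption for illustration): predict the k labels
    seen most often so far, then observe the true label."""
    counts = Counter()
    predictions = []
    for y in labels:
        # Predict before the true label of this round is revealed.
        predictions.append([label for label, _ in counts.most_common(k)])
        counts[y] += 1
    return predictions


stream = ["a", "b", "a", "c", "b", "a"]
single = list_mistakes(top_k_frequency_learner(stream, 1), stream)
pairs = list_mistakes(top_k_frequency_learner(stream, 2), stream)
print(single, pairs)
```

On this toy stream the learner with lists of size 2 makes fewer mistakes than the single-label learner, which is the kind of tradeoff between list size and mistakes that the abstract refers to.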
Sharing Lifelong Reinforcement Learning Knowledge via Modulating Masks
Nath, Saptarshi, Peridis, Christos, Ben-Iwhiwhu, Eseoghene, Liu, Xinran, Dora, Shirin, Liu, Cong, Kolouri, Soheil, Soltoggio, Andrea
Lifelong learning agents aim to learn multiple tasks sequentially over a lifetime. This involves the ability to exploit previous knowledge when learning new tasks and to avoid forgetting. Modulating masks, a specific type of parameter isolation approach, have recently shown promise in both supervised and reinforcement learning. While lifelong learning algorithms have been investigated mainly within a single-agent approach, a question remains on how multiple agents can share lifelong learning knowledge with each other. We show that the parameter isolation mechanism used by modulating masks is particularly suitable for exchanging knowledge among agents in a distributed and decentralized system of lifelong learners. The key idea is that the isolation of specific task knowledge to specific masks allows agents to transfer only specific knowledge on-demand, resulting in robust and effective distributed lifelong learning. We assume fully distributed and asynchronous scenarios with dynamic agent numbers and connectivity. An on-demand communication protocol ensures agents query their peers for specific masks to be transferred and integrated into their policies when facing each task. Experiments indicate that on-demand mask communication is an effective way to implement distributed lifelong reinforcement learning and provides a lifelong learning benefit with respect to distributed RL baselines such as DD-PPO, IMPALA, and PPO+EWC. The system is particularly robust to connection drops and demonstrates rapid learning due to knowledge exchange.
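The mask-sharing idea can be sketched roughly as follows. This is a minimal illustration, assuming that all agents hold the same frozen backbone weights and that each task is represented by a real-valued score matrix binarized with a threshold; the class name, method names, and the threshold rule are hypothetical and not taken from the paper.

```python
def binarize(scores, threshold=0.0):
    """Turn real-valued mask scores into a 0/1 mask (assumed threshold rule)."""
    return [[1.0 if s > threshold else 0.0 for s in row] for row in scores]


def masked_linear(weights, mask, x):
    """Apply a linear layer whose weights are gated elementwise by the mask."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


class Agent:
    def __init__(self, weights):
        self.weights = weights  # frozen backbone, assumed identical across agents
        self.masks = {}         # task id -> mask scores learned or received

    def request_mask(self, task_id, peers):
        """On-demand query: copy the first mask a peer holds for this task."""
        for peer in peers:
            if task_id in peer.masks:
                self.masks[task_id] = [row[:] for row in peer.masks[task_id]]
                return True
        return False

    def forward(self, task_id, x):
        return masked_linear(self.weights, binarize(self.masks[task_id]), x)


backbone = [[1.0, 2.0], [3.0, 4.0]]
teacher = Agent(backbone)
teacher.masks["task_a"] = [[0.5, -0.5], [-1.0, 2.0]]
learner = Agent(backbone)
found = learner.request_mask("task_a", [teacher])
output = learner.forward("task_a", [1.0, 1.0])
print(found, output)
```

Because only the mask for the requested task is transferred, an agent never needs a peer's full set of parameters, which is the property the abstract credits for making on-demand exchange practical.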
A Study on Transformer Configuration and Training Objective
Xue, Fuzhao, Chen, Jianghai, Sun, Aixin, Ren, Xiaozhe, Zheng, Zangwei, He, Xiaoxin, Chen, Yongming, Jiang, Xin, You, Yang
Transformer-based models have delivered impressive results on many tasks, particularly vision and language tasks. In many model training situations, conventional configurations are typically adopted. For example, we often set the base model with hidden dimensions (i.e. model width) to be 768 and the number of transformer layers (i.e. model depth) to be 12. In this paper, we revisit these conventional configurations. Through theoretical analysis and experimental evaluation, we show that the masked autoencoder is effective in alleviating the over-smoothing issue in deep transformer training. Based on this finding, we propose Bamboo, an idea of using deeper and narrower transformer configurations, for masked autoencoder training. On ImageNet, with such a simple change in configuration, the re-designed model achieves 87.1% top-1 accuracy and outperforms SoTA models like MAE and BEiT. On language tasks, the re-designed model outperforms BERT with the default setting by 1.1 points on average on GLUE datasets.
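A quick parameter count shows why depth can be traded for width at a roughly fixed budget. The sketch below uses the common approximation of 12 x width^2 weights per encoder block (four attention projections plus a two-layer MLP with 4x expansion), ignoring biases, layer norms, and embeddings. The 48-layer, 384-wide configuration is a hypothetical example chosen for the arithmetic, not the Bamboo configuration reported in the paper.

```python
def encoder_params(width, depth, mlp_ratio=4):
    """Approximate weight count of a transformer encoder stack.

    Ignores biases, layer norms, and embeddings.
    """
    attention = 4 * width * width          # Q, K, V and output projections
    mlp = 2 * mlp_ratio * width * width    # up- and down-projection
    return depth * (attention + mlp)


base = encoder_params(width=768, depth=12)         # conventional base model
deep_narrow = encoder_params(width=384, depth=48)  # hypothetical deeper, narrower model
print(base, deep_narrow)
```

Halving the width cuts the per-block cost by a factor of four, so four times as many layers fit into the same approximate budget of about 85 million weights.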
Project-Based Learning for Robot Control Theory: A Robot Operating System (ROS) Based Approach
Control theory is an important cornerstone of the robotics field and is considered a fundamental subject in an undergraduate and postgraduate robotics curriculum. Furthermore, project-based learning has shown significant benefits in engineering domains, specifically in interdisciplinary fields such as robotics which require hands-on experience to master the discipline adequately. However, designing a project-based learning experience to teach control theory in a hands-on setting can be challenging, due to the rigor of mathematical concepts involved in the subject. Moreover, access to reliable hardware required for a robotics control lab, including the robots, sensors, interfaces, and measurement instruments, may not be feasible in developing countries and even many academic institutions in the US. The current paper presents a set of six project-based assignments for an advanced postgraduate Robot Control course. The assignments leverage the Robot Operating System (ROS), an open-source set of tools, libraries, and software, which is a de facto standard for the development of robotics applications. The use of ROS, along with its physics engine simulation framework, Gazebo, provides a hands-on robotics experience equivalent to working with real hardware. Learning outcomes include: i) theoretical analysis of linear and nonlinear dynamical systems, ii) formulation and implementation of advanced model-based robot control algorithms using classical and modern control theory, and iii) programming and performance evaluation of robotic systems on physics engine robot simulators. Course evaluations and student surveys demonstrate that the proposed project-based assignments successfully bridge the gap between theory and practice, and facilitate learning of control theory concepts and state-of-the-art robotics techniques through a hands-on approach.
The First Year of AI College Ends in Ruin
That's what the software concluded about a student's paper. One of the professors in the academic program I direct had come across this finding and asked me what to do with it. Then another one saw the same result--100 percent AI--for a different paper by that student, and also wondered: What does this mean? The problem breaks down into more problems: whether it's possible to know for certain that a student used AI, what it even means to "use" AI for writing papers, and when that use amounts to cheating. The software that had flagged our student's papers was also multilayered: Canvas, our courseware system, was running Turnitin, a popular plagiarism-detection service, which had recently installed a new AI-detection algorithm.
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
Bornschein, Jorg, Galashov, Alexandre, Hemsley, Ross, Rannen-Triki, Amal, Chen, Yutian, Chaudhry, Arslan, He, Xu Owen, Douillard, Arthur, Caccia, Massimo, Feng, Qixuang, Shen, Jiajun, Rebuffi, Sylvestre-Alvise, Stacpoole, Kitty, Casas, Diego de las, Hawkins, Will, Lazaridou, Angeliki, Teh, Yee Whye, Rusu, Andrei A., Pascanu, Razvan, Ranzato, Marc'Aurelio
A shared goal of several machine learning communities like continual learning, meta-learning and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks. An even more ambitious goal is to build models that never stop adapting, and that become increasingly more efficient through time by suitably transferring the accrued knowledge. Beyond the study of the actual learning algorithm and model architecture, there are several hurdles towards our quest to build such models, such as the choice of learning protocol, metric of success and data needed to validate research hypotheses. In this work, we introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks, sorted chronologically and extracted from papers sampled uniformly from computer vision proceedings spanning the last three decades. The resulting stream reflects what the research community thought was meaningful at any point in time, and it serves as an ideal test bed to assess how well models can adapt to new tasks, and do so better and more efficiently as time goes by. Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth. The diversity is also reflected in the wide range of dataset sizes, spanning over four orders of magnitude. Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks, yet with a low entry barrier as it is limited to a single modality and well understood supervised learning problems. Moreover, we provide a reference implementation including strong baselines and an evaluation protocol to compare methods in terms of their trade-off between accuracy and compute.
Accessible Interfaces for the Development and Deployment of Robotic Platforms
Accessibility is one of the most important features in the design of robots and their interfaces. This thesis proposes methods that improve the accessibility of robots for three different target audiences: consumers, researchers, and learners. In order for humans and robots to work together effectively, they both must be able to communicate with each other. We tackle the problem of generating route instructions that are readily understandable by novice humans for the navigation of a priori unknown indoor environments. We then move on to the related problem of enabling robots to understand natural language utterances in the context of learning to operate articulated objects (e.g., fridges, drawers) by leveraging kinematic models. Next, we turn our focus to the development of accessible and reproducible robotic platforms for scientific research. We propose a new concept for reproducible robotics research that integrates development and benchmarking, so that reproducibility is obtained "by design" from the beginning of the research and development process. We then propose a framework called SHARC (SHared Autonomy for Remote Collaboration), to improve accessibility for underwater robotic intervention operations. SHARC allows multiple remote scientists to efficiently plan and execute high-level sampling procedures using an underwater manipulator while deferring low-level control to the robot. Lastly, we developed the first hardware-based MOOC in AI and robotics. This course allows learners to study autonomy hands-on by making real robots make their own decisions and accomplish broadly defined tasks. We design a new robotic platform from the ground up to support this new learning experience. A fully browser-based interface, based on leading tools and technologies for code development, testing, validation, and deployment serves to maximize the accessibility of these educational resources.
Manipulator Differential Kinematics: Part 2: Acceleration and Advanced Applications
This is the second and final article in the tutorial on manipulator differential kinematics. In Part 1, we described a method of modelling kinematics using the elementary transform sequence (ETS), before formulating forward kinematics and the manipulator Jacobian. We then described some basic applications of the manipulator Jacobian including resolved-rate motion control (RRMC), inverse kinematics (IK), and some manipulator performance measures. In this article, we formulate the second-order differential kinematics, leading to a definition of manipulator Hessian. We then describe the differential kinematics' analytical forms, which are essential to dynamics applications. Subsequently, we provide a general formula for higher-order derivatives. The first application we consider is advanced velocity control. In this section, we extend resolved-rate motion control to perform sub-tasks while still achieving the goal before redefining the algorithm as a quadratic program to enable greater flexibility and additional constraints. We then take another look at numerical inverse kinematics with an emphasis on adding constraints. Finally, we analyse how the manipulator Hessian can help to escape singularities. We have provided Jupyter Notebooks to accompany each section within this tutorial. The Notebooks are written in Python code and use the Robotics Toolbox for Python, and the Swift Simulator to provide examples and implementations of algorithms. While not absolutely essential, for the most engaging and informative experience, we recommend working through the Jupyter Notebooks while reading this article. The Notebooks and setup instructions can be accessed at https://github.com/jhavl/dkt.
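As a small illustration of second-order differential kinematics, the sketch below writes out the Jacobian and Hessian of a planar two-link arm by hand and uses them to compute end-effector acceleration. It is a toy under stated assumptions: it covers only the planar translational part, uses the index convention H[i][j][k] = dJ[i][j]/dq[k], and does not use the ETS formulation or the Robotics Toolbox for Python from the tutorial.

```python
import math


def jacobian(q, l1=1.0, l2=1.0):
    """Translational Jacobian of a planar two-link arm with link lengths l1, l2."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def hessian(q, l1=1.0, l2=1.0):
    """Hessian with H[i][j][k] = d J[i][j] / d q[k] (convention assumed here)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [
        [[-l1 * c1 - l2 * c12, -l2 * c12], [-l2 * c12, -l2 * c12]],
        [[-l1 * s1 - l2 * s12, -l2 * s12], [-l2 * s12, -l2 * s12]],
    ]


def end_effector_acceleration(q, qd, qdd, l1=1.0, l2=1.0):
    """a = J(q) qdd + qd^T H(q) qd, evaluated per Cartesian axis."""
    J = jacobian(q, l1, l2)
    H = hessian(q, l1, l2)
    acc = []
    for i in range(2):
        linear = sum(J[i][j] * qdd[j] for j in range(2))
        quadratic = sum(H[i][j][k] * qd[j] * qd[k]
                        for j in range(2) for k in range(2))
        acc.append(linear + quadratic)
    return acc


# Straight arm of total length 2 spinning at 2 rad/s about the base joint.
acc = end_effector_acceleration(q=[0.0, 0.0], qd=[2.0, 0.0], qdd=[0.0, 0.0])
print(acc)
```

With zero joint acceleration the only contribution comes from the Hessian term, and the result matches the centripetal acceleration of a point at radius 2 rotating at 2 rad/s, pointing back toward the base.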
Manipulator Differential Kinematics: Part 1: Kinematics, Velocity, and Applications
Manipulator kinematics is concerned with the motion of each link within a manipulator without considering mass or force. In this article, which is the first in a two-part tutorial, we provide an introduction to modelling manipulator kinematics using the elementary transform sequence (ETS). Then we formulate the first-order differential kinematics, which leads to the manipulator Jacobian, the basis for velocity control and inverse kinematics. We describe essential classical techniques which rely on the manipulator Jacobian before exhibiting some contemporary applications. Part 2 of this tutorial provides a formulation of second and higher-order differential kinematics, introduces the manipulator Hessian, and illustrates advanced techniques, some of which improve the performance of techniques demonstrated in Part 1. We have provided Jupyter Notebooks to accompany each section within this tutorial. The Notebooks are written in Python code and use the Robotics Toolbox for Python, and the Swift Simulator to provide examples and implementations of algorithms. While not absolutely essential, for the most engaging and informative experience, we recommend working through the Jupyter Notebooks while reading this article. The Notebooks and setup instructions can be accessed at https://github.com/jhavl/dkt.
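The role of the manipulator Jacobian in velocity control can be illustrated with a planar two-link arm. The following is a minimal sketch, assuming hand-derived kinematics rather than the ETS formulation, and using plain Python instead of the Robotics Toolbox for Python; the function names and the singularity tolerance are illustrative choices.

```python
import math


def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y


def jacobian(q, l1=1.0, l2=1.0):
    """Maps joint velocities to end-effector velocity: v = J(q) qd."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def rrmc_step(q, v, dt, l1=1.0, l2=1.0):
    """One resolved-rate step: solve J qd = v, then integrate qd over dt."""
    J = jacobian(q, l1, l2)
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if abs(det) < 1e-9:
        raise ValueError("arm is at or near a singularity; J is not invertible")
    # Closed-form inverse of the 2x2 Jacobian applied to v.
    qd0 = (J[1][1] * v[0] - J[0][1] * v[1]) / det
    qd1 = (-J[1][0] * v[0] + J[0][0] * v[1]) / det
    return [q[0] + qd0 * dt, q[1] + qd1 * dt]


q_start = [0.0, math.pi / 2]
q_next = rrmc_step(q_start, v=[0.1, 0.0], dt=0.01)
print(forward_kinematics(q_start), forward_kinematics(q_next))
```

For this arm the determinant of the Jacobian is l1 * l2 * sin(q[1]), so the step fails exactly when the arm is fully stretched or folded, which is the singular case that motivates the more robust formulations the tutorial goes on to describe.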
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
Ji, Jiaming, Zhou, Jiayi, Zhang, Borong, Dai, Juntao, Pan, Xuehai, Sun, Ruiyang, Huang, Weidong, Geng, Yiran, Liu, Mickel, Yang, Yaodong
AI systems empowered by reinforcement learning (RL) algorithms harbor the immense potential to catalyze societal advancement, yet their deployment is often impeded by significant safety concerns. Particularly in safety-critical applications, researchers have raised concerns about unintended harms or unsafe behaviors of unaligned RL agents. The philosophy of safe reinforcement learning (SafeRL) is to align RL agents with harmless intentions and safe behavioral patterns. In SafeRL, agents learn to develop optimal policies by receiving feedback from the environment, while also fulfilling the requirement of minimizing the risk of unintended harm or unsafe behavior. However, due to the intricate nature of SafeRL algorithm implementation, combining methodologies across various domains presents a formidable challenge. This has led to an absence of a cohesive and efficacious learning framework within the contemporary SafeRL research milieu. In this work, we introduce a foundational framework designed to expedite SafeRL research endeavors. Our comprehensive framework encompasses an array of algorithms spanning different RL domains and places heavy emphasis on safety elements. Our aim is to make the SafeRL-related research process more streamlined and efficient, thereby facilitating further research in AI safety.