 Instructional Material


AI likely to spell end of traditional school classroom, leading expert says

The Guardian

Recent advances in AI are likely to spell the end of the traditional school classroom, one of the world's leading experts on AI has predicted. Prof Stuart Russell, a British computer scientist based at the University of California, Berkeley, said that personalised ChatGPT-style tutors have the potential to hugely enrich education and widen global access by delivering personalised tuition to every household with a smartphone. The technology could feasibly deliver "most material through to the end of high school", he said. "Education is the biggest benefit that we can look for in the next few years," Russell said before a talk on Friday at the UN's AI for Good Global Summit in Geneva. "It ought to be possible within a few years, maybe by the end of this decade, to be delivering a pretty high quality of education to every child in the world." However, he cautioned that deploying the powerful technology in the education sector also carries risks, including the potential for indoctrination. Russell cited evidence from studies using human tutors that one-to-one teaching can be two to three times more effective than traditional classroom lessons, allowing children to get tailored support and be led by curiosity. "Oxford and Cambridge don't really use a traditional classroom … they use tutors presumably because it's more effective," he said. "It's literally infeasible to do that for every child in the world.


Teach Me How to Learn: A Perspective Review towards User-centered Neuro-symbolic Learning for Robotic Surgical Systems

arXiv.org Artificial Intelligence

Recent advances in machine learning models have allowed robots to identify objects on a perceptual nonsymbolic level (e.g., through sensor fusion and natural language understanding). However, these primarily black-box learning models still lack interpretability and transferability and demand large amounts of data and computation. An alternative solution is to teach a robot on both perceptual nonsymbolic and conceptual symbolic levels through hybrid neurosymbolic learning approaches with expert feedback (i.e., human-in-the-loop learning). This work proposes a concept for this user-centered hybrid learning paradigm that focuses on robotic surgical situations. While most recent research has focused on hybrid learning for non-robotic and some generic robotic domains, little work focuses on surgical robotics. We survey this related research while focusing on human-in-the-loop surgical robotic systems. This evaluation highlights the most prominent solutions for autonomous surgical robots and the challenges surgeons face when interacting with these systems. Finally, we envision possible ways to address these challenges using online apprenticeship learning based on implicit and explicit feedback from expert surgeons.


What Should Data Science Education Do with Large Language Models?

arXiv.org Artificial Intelligence

The rapid advances of large language models (LLMs), such as ChatGPT, are revolutionizing data science and statistics. These state-of-the-art tools can streamline complex processes. As a result, they are reshaping the role of data scientists. We argue that LLMs are transforming the responsibilities of data scientists, shifting their focus from hands-on coding, data-wrangling and conducting standard analyses to assessing and managing analyses performed by these automated AIs. This evolution of roles is reminiscent of the transition from a software engineer to a product manager. We illustrate this transition with concrete data science case studies using LLMs in this paper. These developments necessitate a meaningful evolution in data science education. Pedagogy must now place greater emphasis on cultivating diverse skillsets among students, such as LLM-informed creativity, critical thinking, and AI-guided programming. LLMs can also play a significant role in the classroom as interactive teaching and learning tools, contributing to personalized education. This paper discusses the opportunities, resources and open challenges for each of these directions. As with any transformative technology, integrating LLMs into education calls for careful consideration. While LLMs can perform repetitive tasks efficiently, it's crucial to remember that their role is to supplement human intelligence and creativity, not to replace it. Therefore, the new era of data science education should balance the benefits of LLMs while fostering complementary human expertise and innovations. In conclusion, the rise of LLMs heralds a transformative period for data science and its education. This paper seeks to shed light on the emerging trends, potential opportunities, and challenges accompanying this paradigm shift, hoping to spark further discourse and investigation into this exciting, uncharted territory.


Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations

arXiv.org Artificial Intelligence

Offline reinforcement learning has shown great promise in leveraging large pre-collected datasets for policy learning, allowing agents to forgo often-expensive online data collection. However, offline reinforcement learning from visual observations with continuous action spaces remains under-explored, with a limited understanding of the key challenges in this complex domain. In this paper, we establish simple baselines for continuous control in the visual domain and introduce a suite of benchmarking tasks for offline reinforcement learning from visual observations designed to better represent the data distributions present in real-world offline RL problems and guided by a set of desiderata for offline RL from visual observations, including robustness to visual distractions and visually identifiable changes in dynamics. Using this suite of benchmarking tasks, we show that simple modifications to two popular vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform existing offline RL methods and establish competitive baselines for continuous control in the visual domain. We rigorously evaluate these algorithms and perform an empirical evaluation of the differences between state-of-the-art model-based and model-free offline RL methods for continuous control from visual observations. All code and data used in this evaluation are open-sourced to facilitate progress in this domain.


Continuum Limits of Ollivier's Ricci Curvature on data clouds: pointwise consistency and global lower bounds

arXiv.org Artificial Intelligence

Let $\mathcal{M} \subseteq \mathbb{R}^d$ denote a low-dimensional manifold and let $\mathcal{X}= \{ x_1, \dots, x_n \}$ be a collection of points uniformly sampled from $\mathcal{M}$. We study the relationship between the curvature of a random geometric graph built from $\mathcal{X}$ and the curvature of the manifold $\mathcal{M}$ via continuum limits of Ollivier's discrete Ricci curvature. We prove pointwise, non-asymptotic consistency results and also show that if $\mathcal{M}$ has Ricci curvature bounded from below by a positive constant, then the random geometric graph will inherit this global structural property with high probability. We discuss applications of the global discrete curvature bounds to contraction properties of heat kernels on graphs, as well as implications for manifold learning from data clouds. In particular, we show that the consistency results allow for characterizing the intrinsic curvature of a manifold from extrinsic curvature.
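
For readers unfamiliar with the quantity named in this abstract, the standard definition of Ollivier's Ricci curvature can be written as follows. The symbols $d$, $m_x$ and $W_1$ are not taken from the abstract: $d$ is assumed to be the metric (for example, the graph distance), $m_x$ a probability measure attached to each point $x$ (for example, one step of a random walk from $x$), and $W_1$ the 1-Wasserstein distance. The particular choice of $m_x$ on the random geometric graph is not stated in the abstract.

$$
\kappa(x, y) \;=\; 1 - \frac{W_1(m_x, m_y)}{d(x, y)}, \qquad x \neq y .
$$

Under this definition, $\kappa(x, y) > 0$ means the measures around $x$ and $y$ are closer together than the points themselves. This is the discrete analogue of positive Ricci curvature that the global lower bound in the abstract refers to.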


Kernels, Data & Physics

arXiv.org Artificial Intelligence

Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches. The notes discuss the so-called NTK approach to problems in machine learning, which consists of gaining an understanding of generally unsolvable problems by finding a tractable kernel formulation. The notes focus mainly on practical applications such as data distillation and adversarial robustness; examples of inductive bias are also discussed.
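
As background for the "NTK approach" mentioned above, the neural tangent kernel of a network is commonly defined as below. The notation is an assumption, not taken from the notes: $f(x; \theta)$ is a scalar-output network with parameters $\theta$.

$$
\Theta(x, x') \;=\; \big\langle \nabla_\theta f(x; \theta),\; \nabla_\theta f(x'; \theta) \big\rangle .
$$

In the infinite-width regime usually studied, this kernel stays approximately constant during training, so gradient-descent training can be analysed as kernel regression with $\Theta$. That is the sense in which a tractable kernel formulation stands in for an otherwise intractable problem.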


Real-time Workload Pattern Analysis for Large-scale Cloud Databases

arXiv.org Artificial Intelligence

Hosting database services on cloud systems has become a common practice. This has led to an increasing volume of database workloads, which provides an opportunity for pattern analysis. Discovering workload patterns from a business logic perspective is conducive to better understanding the trends and characteristics of the database system. However, existing workload pattern discovery systems are not suitable for the large-scale cloud databases commonly employed in industry. This is because the workload patterns of large-scale cloud databases are generally far more complicated than those of ordinary databases. In this paper, we propose Alibaba Workload Miner (AWM), a real-time system for discovering workload patterns in complicated large-scale workloads. AWM encodes and discovers the SQL query patterns logged from user requests and optimizes query processing based on the discovered patterns. First, the Data Collection & Preprocessing Module collects streaming query logs and encodes them into high-dimensional feature embeddings with rich semantic contexts and execution features. Next, the Online Workload Mining Module separates encoded queries by business groups and discovers the workload patterns for each group. Meanwhile, the Offline Training Module collects labels and trains the classification model using the labels. Finally, the Pattern-based Optimizing Module optimizes query processing in cloud databases by exploiting discovered patterns. Extensive experimental results on one synthetic dataset and two real-life datasets (extracted from Alibaba Cloud databases) show that AWM enhances the accuracy of pattern discovery by 66% and reduces the latency of online inference by 22%, compared with the state of the art.


Training Energy-Based Models with Diffusion Contrastive Divergences

arXiv.org Artificial Intelligence

Energy-Based Models (EBMs) have been widely used for generative modeling. Contrastive Divergence (CD), a prevailing training objective for EBMs, requires sampling from the EBM with Markov Chain Monte Carlo methods (MCMCs), which leads to an irreconcilable trade-off between the computational burden and the validity of the CD. Running MCMCs until convergence is computationally intensive. On the other hand, short-run MCMC brings in an extra non-negligible parameter gradient term that is difficult to handle. In this paper, we provide a general interpretation of CD, viewing it as a special instance of our proposed Diffusion Contrastive Divergence (DCD) family. By replacing the Langevin dynamics used in CD with other EBM-parameter-free diffusion processes, we propose a more efficient divergence. We show that the proposed DCDs are both more computationally efficient than CD and free of the non-negligible gradient term. We conduct extensive experiments, including both synthetic data modeling and high-dimensional image denoising and generation, to show the advantages of the proposed DCDs. In the synthetic data learning and image denoising experiments, our proposed DCD outperforms CD by a large margin. In image generation experiments, the proposed DCD is capable of training an energy-based model for generating the CelebA $32\times 32$ dataset, which is comparable to existing EBMs.
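
To make the trade-off described above concrete, the textbook maximum-likelihood gradient for an EBM and the Langevin update used to sample from it are sketched below. The notation is an assumption, not taken from the abstract: $E_\theta$ is the energy function, $Z(\theta)$ the normalising constant, $\varepsilon$ a step size, and $\xi_k$ standard Gaussian noise. The DCD objective itself is not reproduced here.

$$
p_\theta(x) = \frac{\exp(-E_\theta(x))}{Z(\theta)}, \qquad
\nabla_\theta \log p_\theta(x) = -\nabla_\theta E_\theta(x) + \mathbb{E}_{x' \sim p_\theta}\!\big[\nabla_\theta E_\theta(x')\big],
$$

$$
x_{k+1} = x_k - \frac{\varepsilon}{2}\, \nabla_x E_\theta(x_k) + \sqrt{\varepsilon}\, \xi_k, \qquad \xi_k \sim \mathcal{N}(0, I).
$$

The expectation over $p_\theta$ is what requires MCMC. Running the Langevin chain to convergence is expensive, while truncating it after a few steps makes the samples depend on $\theta$, which is the source of the extra gradient term the abstract mentions.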


Ivy League university unveils plan to teach students with AI chatbot this fall: 'Evolution' of 'tradition'

FOX News

Students at one of America's most elite universities will be in for a surprise this fall when they discover their flagship coding class is taught with help from an A.I. chatbot in a bend on what Professor David Malan, the course's overseer, defines as an "evolution" of "tradition." Harvard University unveiled plans to incorporate A.I. chatbots to teach the course, venturing deeper into the uncharted territory of artificial intelligence - a territory that has exponentially grown and altered the course of technology in the past several months. Though the idea sounds novel and exciting, Martin Rand, PactumAI co-founder and CEO, warned to be wary of the "dangers."


Improving Online Continual Learning Performance and Stability with Temporal Ensembles

arXiv.org Artificial Intelligence

Neural networks are very effective when trained on large datasets for a large number of iterations. However, when they are trained on non-stationary streams of data and in an online fashion, their performance is reduced (1) by the online setup, which limits the availability of data, and (2) by catastrophic forgetting, caused by the non-stationary nature of the data. Furthermore, several recent works (Caccia et al., 2022; Lange et al., 2023) showed that replay methods used in continual learning suffer from the stability gap, encountered when evaluating the model continually (rather than only on task boundaries). In this article, we study the effect of model ensembling as a way to improve performance and stability in online continual learning. We notice that naively ensembling models coming from a variety of training tasks increases the performance in online continual learning considerably. Starting from this observation, and drawing inspiration from semi-supervised learning ensembling methods, we use a lightweight temporal ensemble that computes the exponential moving average of the weights (EMA) at test time, and show that it can drastically increase the performance and stability when used in combination with several methods from the literature. Learning neural networks with backpropagation has been shown to generalize well even when using overparametrized networks (Krizhevsky et al., 2017). However, these good learning properties mainly occur when the data is provided in an independent and identically distributed manner. When learning on a stream whose distribution varies over time, neural networks are known to suffer from catastrophic forgetting (McCloskey & Cohen, 1989; Goodfellow et al., 2014; Kirkpatrick et al., 2017), and tend to forget knowledge acquired in previous learning tasks. The field of continual learning aims to address this problem.
Generally, incremental learning separates the learning into distinct tasks (identified by a task-ID) that are encountered sequentially by the agent. A variety of settings have been introduced in continual learning in order to evaluate several aspects of the continual learning agent; task-incremental learning (De Lange et al., 2021; van de Ven & Tolias, 2018) and class-incremental learning (Masana et al., 2022; Belouadah et al., 2021) are among the most popular. In this paper, we focus on the more challenging class-incremental setting, where the learner does not have access to the task-ID at inference time.
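
The temporal ensemble described in the abstract, an exponential moving average (EMA) of the weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ema_update`, the dictionary-of-lists weight format, the decay value, and the toy "training" loop are all assumptions made for illustration.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def ema_update(ema: Weights, current: Weights, decay: float = 0.99) -> Weights:
    """Return decay * ema + (1 - decay) * current, parameter by parameter."""
    return {
        name: [decay * e + (1.0 - decay) * c
               for e, c in zip(ema[name], current[name])]
        for name in ema
    }


# Toy stream: the "trained" weights change at every step (hypothetical values
# standing in for the result of an optimizer step on the incoming batch).
weights: Weights = {"w": [0.0, 0.0]}
ema: Weights = {name: list(values) for name, values in weights.items()}

for step in range(1, 4):
    weights = {"w": [float(step), 2.0 * step]}
    ema = ema_update(ema, weights, decay=0.5)

# The EMA copy, not the raw weights, would be used for evaluation.
print(ema["w"])  # [2.125, 4.25]
```

The averaged copy lags behind the raw weights (here `[3.0, 6.0]` after the last step), which is what smooths out abrupt changes between consecutive updates. In practice the decay is usually much closer to 1 than the 0.5 used here to keep the arithmetic easy to follow.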