Goto

Collaborating Authors

 Instructional Material


ChatGPT gets better marks than students in some university courses

New Scientist - News

ChatGPT may be as good as or better than students at assessments in around a quarter of university courses. However, this generally only applies to questions with a clear answer that require memory recall, rather than critical analysis. Yasir Zaki and his team at New York University Abu Dhabi in the United Arab Emirates contacted colleagues in other departments asking them to provide assessment questions from courses taught at the university, including computer science, psychology, political science and business. These colleagues also provided real student answers to the questions. The questions were then run through the artificial intelligence chatbot ChatGPT, which supplied its own responses.


ChatGPT gets better marks than students in some university courses

New Scientist

ChatGPT may be as good as or better than students at assessments in around a quarter of university courses. However, this generally only applies to questions with a clear answer that require memory recall, rather than critical analysis. Yasir Zaki and his team at New York University Abu Dhabi in the United Arab Emirates contacted colleagues in other departments asking them to provide assessment questions from courses taught at the university, including computer science, psychology, political science and business. These colleagues also provided real student answers to the questions. The questions were then run through the artificial intelligence chatbot ChatGPT, which supplied its own responses.


A Multilayer Framework for Online Metric Learning

arXiv.org Artificial Intelligence

Online metric learning has been widely applied in classification and retrieval. It can automatically learn a suitable metric from data by restricting similar instances to be separated from dissimilar instances with a given margin. However, the existing online metric learning algorithms have limited performance in real-world classifications, especially when data distributions are complex. To this end, this paper proposes a multilayer framework for online metric learning to capture the nonlinear similarities among instances. Different from the traditional online metric learning, which can only learn one metric space, the proposed Multi-Layer Online Metric Learning (MLOML) takes an online metric learning algorithm as a metric layer and learns multiple hierarchical metric spaces, where each metric layer follows a nonlinear layers for the complicated data distribution. Moreover, the forward propagation (FP) strategy and backward propagation (BP) strategy are employed to train the hierarchical metric layers. To build a metric layer of the proposed MLOML, a new Mahalanobis-based Online Metric Learning (MOML) algorithm is presented based on the passive-aggressive strategy and one-pass triplet construction strategy. Furthermore, in a progressively and nonlinearly learning way, MLOML has a stronger learning ability than traditional online metric learning in the case of limited available training data. To make the learning process more explainable and theoretically guaranteed, theoretical analysis is provided. The proposed MLOML enjoys several nice properties, indeed learns a metric progressively, and performs better on the benchmark datasets. Extensive experiments with different settings have been conducted to verify these properties of the proposed MLOML.


Educators have said using ChatGPT is cheating, but now they are using AI to write syllabi and exams: Professor

FOX News

ChatGPT has proven it can help students with their homework, but now it is helping teachers create those very courses, a computer science professor told Fox News. As educators debate whether students should be allowed to use artificial intelligence for assignments, one professor told Fox News that teachers themselves are using the tech to help with their lessons. "I know faculty who are using ChatGPT to help write syllabi and to write exams," a University of California, Berkeley professor of computer science, Hany Farid, told Fox News. "I've seen professors using it to help design courses, write exam problems, write homework problems." "It is both an enabling and a potentially problematic technology," he continued.


Concept Bottleneck with Visual Concept Filtering for Explainable Medical Image Classification

arXiv.org Artificial Intelligence

Interpretability is a crucial factor in building reliable models for various medical applications. Concept Bottleneck Models (CBMs) enable interpretable image classification by utilizing human-understandable concepts as intermediate targets. Unlike conventional methods that require extensive human labor to construct the concept set, recent works leveraging Large Language Models (LLMs) for generating concepts made automatic concept generation possible. However, those methods do not consider whether a concept is visually relevant or not, which is an important factor in computing meaningful concept scores. Therefore, we propose a visual activation score that measures whether the concept contains visual cues or not, which can be easily computed with unlabeled image data. Computed visual activation scores are then used to filter out the less visible concepts, thus resulting in a final concept set with visually meaningful concepts. Our experimental results show that adopting the proposed visual activation score for concept filtering consistently boosts performance compared to the baseline. Moreover, qualitative analyses also validate that visually relevant concepts are successfully selected with the visual activation score.


Tensor Regression

arXiv.org Artificial Intelligence

Regression analysis is a key area of interest in the field of data analysis and machine learning which is devoted to exploring the dependencies between variables, often using vectors. The emergence of high dimensional data in technologies such as neuroimaging, computer vision, climatology and social networks, has brought challenges to traditional data representation methods. Tensors, as high dimensional extensions of vectors, are considered as natural representations of high dimensional data. In this book, the authors provide a systematic study and analysis of tensor-based regression models and their applications in recent years. It groups and illustrates the existing tensor-based regression methods and covers the basics, core ideas, and theoretical characteristics of most tensor-based regression methods. In addition, readers can learn how to use existing tensor-based regression methods to solve specific regression tasks with multiway data, what datasets can be selected, and what software packages are available to start related work as soon as possible. Tensor Regression is the first thorough overview of the fundamentals, motivations, popular algorithms, strategies for efficient implementation, related applications, available datasets, and software resources for tensor-based regression analysis. It is essential reading for all students, researchers and practitioners of working on high dimensional data.


Understanding Silent Failures in Medical Image Classification

arXiv.org Artificial Intelligence

To ensure the reliable use of classification systems in medical applications, it is crucial to prevent silent failures. This can be achieved by either designing classifiers that are robust enough to avoid failures in the first place, or by detecting remaining failures using confidence scoring functions (CSFs). A predominant source of failures in image classification is distribution shifts between training data and deployment data. To understand the current state of silent failure prevention in medical imaging, we conduct the first comprehensive analysis comparing various CSFs in four biomedical tasks and a diverse range of distribution shifts. Based on the result that none of the benchmarked CSFs can reliably prevent silent failures, we conclude that a deeper understanding of the root causes of failures in the data is required. To facilitate this, we introduce SF-Visuals, an interactive analysis tool that uses latent space clustering to visualize shifts and failures. On the basis of various examples, we demonstrate how this tool can help researchers gain insight into the requirements for safe application of classification systems in the medical domain.


Combining Automatic Coding and Instructor Input to Generate ENA Visualizations for Asynchronous Online Discussion

arXiv.org Artificial Intelligence

Asynchronous online discussions are a common fundamental tools to facilitate social interaction in hybrid and online courses. However, instructors lack the tools to accomplish the overwhelming task of evaluating asynchronous online discussion activities. In this paper we present an approach that uses Latent Dirichlet Analysis (LDA) and the instructor's keywords to automatically extract codes from a relatively small dataset. We use the generated codes to build an Epistemic Network Analysis (ENA) model and compare this model with a previous ENA model built by human coders. The results show that there is no statistical difference between the two models. We present an analysis of these models and discuss the potential use of ENA as a visualization to help instructors evaluating asynchronous online discussions.


Learning to generate and corr- uh I mean repair language in real-time

arXiv.org Artificial Intelligence

In conversation, speakers produce language incrementally, word by word, while continuously monitoring the appropriateness of their own contribution in the dynamically unfolding context of the conversation; and this often leads them to repair their own utterance on the fly. This real-time language processing capacity is furthermore crucial to the development of fluent and natural conversational AI. In this paper, we use a previously learned Dynamic Syntax grammar and the CHILDES corpus to develop, train and evaluate a probabilistic model for incremental generation where input to the model is a purely semantic generation goal concept in Type Theory with Records (TTR). We show that the model's output exactly matches the gold candidate in 78% of cases with a ROUGE-l score of 0.86. We further do a zero-shot evaluation of the ability of the same model to generate self-repairs when the generation goal changes mid-utterance. Automatic evaluation shows that the model can generate self-repairs correctly in 85% of cases. A small human evaluation confirms the naturalness and grammaticality of the generated self-repairs. Overall, these results further highlight the generalisation power of grammar-based models and lay the foundations for more controllable, and naturally interactive conversational AI systems.


Stability of Q-Learning Through Design and Optimism

arXiv.org Artificial Intelligence

Q-learning has become an important part of the reinforcement learning toolkit since its introduction in the dissertation of Chris Watkins in the 1980s. The purpose of this paper is in part a tutorial on stochastic approximation and Q-learning, providing details regarding the INFORMS APS inaugural Applied Probability Trust Plenary Lecture, presented in Nancy France, June 2023. The paper also presents new approaches to ensure stability and potentially accelerated convergence for these algorithms, and stochastic approximation in other settings. Two contributions are entirely new: 1. Stability of Q-learning with linear function approximation has been an open topic for research for over three decades. It is shown that with appropriate optimistic training in the form of a modified Gibbs policy, there exists a solution to the projected Bellman equation, and the algorithm is stable (in terms of bounded parameter estimates). Convergence remains one of many open topics for research. 2. The new Zap Zero algorithm is designed to approximate the Newton-Raphson flow without matrix inversion. It is stable and convergent under mild assumptions on the mean flow vector field for the algorithm, and compatible statistical assumption on an underlying Markov chain. The algorithm is a general approach to stochastic approximation which in particular applies to Q-learning with "oblivious" training even with non-linear function approximation.