 critical analysis


Evaluating Reliability in Medical DNNs: A Critical Analysis of Feature and Confidence-Based OOD Detection

arXiv.org Artificial Intelligence

Reliable use of deep neural networks (DNNs) for medical image analysis requires methods to identify inputs that differ significantly from the training data, called out-of-distribution (OOD), to prevent erroneous predictions. OOD detection methods can be categorised as either confidence-based (using the model's output layer for OOD detection) or feature-based (not using the output layer). We created two new OOD benchmarks by dividing the D7P (dermatology) and BreastMNIST (ultrasound) datasets into subsets which either contain or don't contain an artefact (rulers or annotations respectively). Models were trained with artefact-free images, and images with the artefacts were used as OOD test sets. For each OOD image, we created a counterfactual by manually removing the artefact via image processing, to assess the artefact's impact on the model's predictions. We show that OOD artefacts can boost a model's softmax confidence in its predictions, due to correlations in training data among other factors. This contradicts the common assumption that OOD artefacts should lead to more uncertain outputs, an assumption on which most confidence-based methods rely. We use this to explain why feature-based methods (e.g. Mahalanobis score) typically have greater OOD detection performance than confidence-based methods (e.g. MCP). However, we also show that feature-based methods typically perform worse at distinguishing between inputs that lead to correct and incorrect predictions (for both OOD and ID data). Following from these insights, we argue that a combination of feature-based and confidence-based methods should be used within DNN pipelines to mitigate their respective weaknesses. This project's code and OOD benchmarks are available at: https://github.com/HarryAnthony/Evaluating_OOD_detection.
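
The two families of scores named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (that lives in the linked repository); it assumes MCP is scored as one minus the maximum softmax probability, and that the Mahalanobis score uses class means with a single shared covariance fitted on in-distribution features. The function names, the ridge term, and the thresholds in `flag_input` are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, subtracting the row max for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def mcp_ood_score(logits):
    """Confidence-based score: 1 - max softmax probability (higher = more OOD)."""
    return 1.0 - softmax(logits).max(axis=1)


def fit_mahalanobis(features, labels):
    """Estimate class means and a shared precision matrix from ID features."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Centre every sample on the mean of its own class.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    # Small ridge term (an assumption) keeps the covariance invertible.
    precision = np.linalg.inv(cov + 1e-6 * np.eye(features.shape[1]))
    return means, precision


def mahalanobis_ood_score(features, means, precision):
    """Feature-based score: smallest squared Mahalanobis distance to a class mean."""
    diff = features[:, None, :] - means[None, :, :]  # shape (n, classes, dim)
    dists = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return dists.min(axis=1)


def flag_input(mcp_score, maha_score, mcp_threshold, maha_threshold):
    """Hypothetical combination rule: flag an input if either score is high."""
    return (mcp_score > mcp_threshold) | (maha_score > maha_threshold)
```

In this sketch, `mahalanobis_ood_score` would be applied to features taken from an intermediate layer, while `mcp_ood_score` uses only the output logits. The "either score" rule in `flag_input` is one simple way to combine the two and is not claimed to be the combination the paper recommends; thresholds would need to be chosen on held-out data.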


From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities

arXiv.org Artificial Intelligence

The multimodal task of Visual Question Answering (VQA), encompassing elements of Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input. Over time, the scope of VQA has expanded from datasets focusing on an extensive collection of natural images to datasets featuring synthetic images, video, 3D environments, and various other visual inputs. The emergence of large pre-trained networks has shifted VQA approaches from early feature extraction and fusion schemes to vision language pre-training (VLP) techniques. However, there is a lack of comprehensive surveys that encompass both traditional VQA architectures and contemporary VLP-based methods. Furthermore, the challenges of VLP through the lens of VQA have not been thoroughly explored, leaving room for potential open problems to emerge. Our work presents a survey in the domain of VQA that delves into the intricacies of VQA datasets and methods over the field's history, introduces a detailed taxonomy to categorize the facets of VQA, and highlights the recent trends, challenges, and scope for improvement. We further generalize VQA to multimodal question answering, explore tasks related to VQA, and present a set of open problems for future investigation. The work aims to guide both beginners and experts by shedding light on the potential avenues of research and expanding the boundaries of the field.


ChatGPT gets better marks than students in some university courses

New Scientist

ChatGPT may be as good as or better than students at assessments in around a quarter of university courses. However, this generally only applies to questions with a clear answer that require memory recall, rather than critical analysis. Yasir Zaki and his team at New York University Abu Dhabi in the United Arab Emirates contacted colleagues in other departments asking them to provide assessment questions from courses taught at the university, including computer science, psychology, political science and business. These colleagues also provided real student answers to the questions. The questions were then run through the artificial intelligence chatbot ChatGPT, which supplied its own responses.


Ethics for social robotics: A critical analysis

arXiv.org Artificial Intelligence

Social robotics development for the practice of care and European prospects to incorporate these AI-based systems in institutional healthcare contexts call for an urgent ethical reflection to (re)configure our practical life according to human values and rights. Despite the growing attention to the ethical implications of social robotics, the current debate on one of its central branches, social assistive robotics (SAR), rests upon an impoverished ethical approach. This paper presents and examines some tendencies of this prevailing approach, which have been identified as a result of a critical literature review. Based on this analysis of a representative case of how ethical reflection on social robotics is being conducted, some future research lines are outlined, which may help reframe and deepen the understanding of its ethical implications.


Are GAN generated images easy to detect? A critical analysis of the state-of-the-art

arXiv.org Artificial Intelligence

The advent of deep learning has brought a significant improvement in the quality of generated media. However, with the increased level of photorealism, synthetic media are becoming hard to distinguish from real ones, raising serious concerns about the spread of fake or manipulated information over the Internet. In this context, it is important to develop automated tools to detect synthetic media reliably and in a timely manner. In this work, we analyze the state-of-the-art methods for the detection of synthetic images, highlighting the key ingredients of the most successful approaches, and comparing their performance over existing generative architectures. We will devote special attention to realistic and challenging scenarios, like media uploaded on social networks or generated by new and unseen architectures, analyzing the impact of suitable augmentation and training strategies on the detectors' generalization ability.


Notes for a critical analysis of Machine Learning in education

#artificialintelligence

Artificial Intelligence is without doubt a topic deserving serious consideration and scholarly attention, but its impact on education is still rather unclear. It feels timely to begin a critical conversation, while keeping in mind how marginal AI may still be for the day-to-day business of education. As critical education researchers we find ourselves gazing more and more into the near or even the far future, trying to challenge the narratives of disruption and innovation that inspire actual policies and drive investments in various 'cutting-edge' areas of educational technology. While future-gazing may help us understand the trajectory of modern education, it is imperative to remain within clear methodological parameters. Some of my recent work (with Ben Williamson) focused on the so-called 'social life' of learning analytics, during which we tried to apply a methodological approach based on three premises: Recent trends in economic sociology were a source of inspiration, in particular critical analyses at the intersection of material semiotics and performativity.