Goto

Collaborating Authors

 SPE


Socially Responsible AI Algorithms: Issues, Purposes, and Challenges

Journal of Artificial Intelligence Research

In the current era, people and society have grown increasingly reliant on artificial intelligence (AI) technologies. AI has the potential to drive us towards a future in which all of humanity flourishes. It also comes with substantial risks for oppression and calamity. Discussions about whether we should (re)trust AI have repeatedly emerged in recent years and in many quarters, including industry, academia, healthcare, services, and so on. Technologists and AI researchers have a responsibility to develop trustworthy AI systems. They have responded with great effort to design more responsible AI algorithms. However, existing technical solutions are narrow in scope and have been primarily directed towards algorithms for scoring or classification tasks, with an emphasis on fairness and unwanted bias. To build long-lasting trust between AI and human beings, we argue that the key is to think beyond algorithmic fairness and connect major aspects of AI that potentially cause AI’s indifferent behavior. In this survey, we provide a systematic framework of Socially Responsible AI Algorithms that aims to examine the subjects of AI indifference and the need for socially responsible AI algorithms, define the objectives, and introduce the means by which we may achieve these objectives. We further discuss how to leverage this framework to improve societal well-being through protection, information, and prevention/mitigation. This article appears in the special track on AI & Society.


Towards Explainable Fact Checking

arXiv.org Machine Learning

The past decade has seen a substantial rise in the amount of mis- and disinformation online, from targeted disinformation campaigns to influence politics, to the unintentional spreading of misinformation about public health. This development has spurred research in the area of automatic fact checking, from approaches to detect check-worthy claims and determining the stance of tweets towards claims, to methods to determine the veracity of claims given evidence documents. These automatic methods are often content-based, using natural language processing methods, which in turn utilise deep neural networks to learn higher-order features from text in order to make predictions. As deep neural networks are black-box models, their inner workings cannot be easily explained. At the same time, it is desirable to explain how they arrive at certain decisions, especially if they are to be used for decision making. While this has been known for some time, the issues this raises have been exacerbated by models increasing in size, and by EU legislation requiring models to be used for decision making to provide explanations, and, very recently, by legislation requiring online platforms operating in the EU to provide transparent reporting on their services. Despite this, current solutions for explainability are still lacking in the area of fact checking. This thesis presents my research on automatic fact checking, including claim check-worthiness detection, stance detection and veracity prediction. Its contributions go beyond fact checking, with the thesis proposing more general machine learning solutions for natural language processing in the area of learning with limited labelled data. Finally, the thesis presents some first solutions for explainable fact checking.


Towards Personalized and Human-in-the-Loop Document Summarization

arXiv.org Artificial Intelligence

The ubiquitous availability of computing devices and the widespread use of the internet have generated a large amount of data continuously. Therefore, the amount of available information on any given topic is far beyond humans' processing capacity to properly process, causing what is known as information overload. To efficiently cope with large amounts of information and generate content with significant value to users, we require identifying, merging and summarising information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges, by: i)enabling automatic intelligent feature engineering, ii) enabling flexible and interactive summarisation, iii) utilising intelligent and personalised summarisation approaches. The experimental results prove the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.


On the Opportunities and Risks of Foundation Models

arXiv.org Artificial Intelligence

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles(e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities,and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.


Set-to-Sequence Methods in Machine Learning: A Review

Journal of Artificial Intelligence Research

Machine learning on sets towards sequential output is an important and ubiquitous task, with applications ranging from language modelling and meta-learning to multi-agent strategy games and power grid optimization. Combining elements of representation learning and structured prediction, its two primary challenges include obtaining a meaningful, permutation invariant set representation and subsequently utilizing this representation to output a complex target permutation. This paper provides a comprehensive introduction to the field as well as an overview of important machine learning methods tackling both of these key challenges, with a detailed qualitative comparison of selected model architectures.


Inside Davis Technical College's First-Of-Its-Kind Inclusive CNC Machining Program Accessible To All

Forbes Europe

Marley Passey feels a part while working on the CNC machine. One of the most prevalent ableist tropes is the conception that, because a person is disabled a certain way, they can't do certain activities. A Blind person can't see so they must be unable to read books or watch television, for instance. That we typically use our eyes to read books or watch TV is a signal that, without the sense of sight, the ability to do these things is seemingly impossible. That's a problem: more than ableist, this sentiment shows a profound ignorance of accessibility and assistive technologies--which, given technology's capabilities nowadays, can feel downright insulting.


Logic Explained Networks

arXiv.org Artificial Intelligence

The large and still increasing popularity of deep learning clashes with a major limit of neural network architectures, that consists in their lack of capability in providing human-understandable motivations of their decisions. In situations in which the machine is expected to support the decision of human experts, providing a comprehensible explanation is a feature of crucial importance. The language used to communicate the explanations must be formal enough to be implementable in a machine and friendly enough to be understandable by a wide audience. In this paper, we propose a general approach to Explainable Artificial Intelligence in the case of neural architectures, showing how a mindful design of the networks leads to a family of interpretable deep learning models called Logic Explained Networks (LENs). LENs only require their inputs to be human-understandable predicates, and they provide explanations in terms of simple First-Order Logic (FOL) formulas involving such predicates. LENs are general enough to cover a large number of scenarios. Amongst them, we consider the case in which LENs are directly used as special classifiers with the capability of being explainable, or when they act as additional networks with the role of creating the conditions for making a black-box classifier explainable by FOL formulas. Despite supervised learning problems are mostly emphasized, we also show that LENs can learn and provide explanations in unsupervised learning settings. Experimental results on several datasets and tasks show that LENs may yield better classifications than established white-box models, such as decision trees and Bayesian rule lists, while providing more compact and meaningful explanations.


A Theory of Consciousness from a Theoretical Computer Science Perspective: Insights from the Conscious Turing Machine

arXiv.org Artificial Intelligence

The quest to understand consciousness, once the purview of philosophers and theologians, is now actively pursued by scientists of many stripes. We examine consciousness from the perspective of theoretical computer science (TCS), a branch of mathematics concerned with understanding the underlying principles of computation and complexity, including the implications and surprising consequences of resource limitations. In the spirit of Alan Turing's simple yet powerful definition of a computer, the Turing Machine (TM), and perspective of computational complexity theory, we formalize a modified version of the Global Workspace Theory (GWT) of consciousness originated by cognitive neuroscientist Bernard Baars and further developed by him, Stanislas Dehaene, Jean-Pierre Changeaux and others. We are not looking for a complex model of the brain nor of cognition, but for a simple computational model of (the admittedly complex concept of) consciousness. We do this by defining the Conscious Turing Machine (CTM), also called a conscious AI, and then we define consciousness and related notions in the CTM. While these are only mathematical (TCS) definitions, we suggest why the CTM has the feeling of consciousness. The TCS perspective provides a simple formal framework to employ tools from computational complexity theory and machine learning to help us understand consciousness and related concepts. Previously we explored high level explanations for the feelings of pain and pleasure in the CTM. Here we consider three examples related to vision (blindsight, inattentional blindness, and change blindness), followed by discussions of dreams, free will, and altered states of consciousness.


EmailSum: Abstractive Email Thread Summarization

arXiv.org Artificial Intelligence

Recent years have brought about an interest in the challenging task of summarizing conversation threads (meetings, online discussions, etc.). Such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency. To spur research in thread summarization, we have developed an abstractive Email Thread Summarization (EmailSum) dataset, which contains human-annotated short (<30 words) and long (<100 words) summaries of 2549 email threads (each containing 3 to 10 emails) over a wide variety of topics. We perform a comprehensive empirical study to explore different summarization techniques (including extractive and abstractive methods, single-document and hierarchical models, as well as transfer and semisupervised learning) and conduct human evaluations on both short and long summary generation tasks. Our results reveal the key challenges of current abstractive summarization models in this task, such as understanding the sender's intent and identifying the roles of sender and receiver. Furthermore, we find that widely used automatic evaluation metrics (ROUGE, BERTScore) are weakly correlated with human judgments on this email thread summarization task. Hence, we emphasize the importance of human evaluation and the development of better metrics by the community. Our code and summary data have been made available at: https://github.com/ZhangShiyue/EmailSum


Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study

arXiv.org Artificial Intelligence

Symbolic knowledge can provide crucial inductive bias for training neural models, especially in low data regimes. A successful strategy for incorporating such knowledge involves relaxing logical statements into sub-differentiable losses for optimization. In this paper, we study the question of how best to relax logical expressions that represent labeled examples and knowledge about a problem; we focus on sub-differentiable t-norm relaxations of logic. We present theoretical and empirical criteria for characterizing which relaxation would perform best in various scenarios. In our theoretical study driven by the goal of preserving tautologies, the Lukasiewicz t-norm performs best. However, in our empirical analysis on the text chunking and digit recognition tasks, the product t-norm achieves best predictive performance. We analyze this apparent discrepancy, and conclude with a list of best practices for defining loss functions via logic.